Explore Projects

Discover 4 open source projects

Active filters (1):
Search: evaluation-framework

Showing 1-4 of 4 projects

confident-ai/deepeval

A Python framework for evaluating and benchmarking large language models (LLMs) and their capabilities.

13.9K stars · Active · Python · LLM Frameworks
#llm-evaluation #benchmarking #python-framework
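
As a quick sense of what using deepeval looks like, here is a minimal sketch based on its documented quickstart; the names (LLMTestCase, AnswerRelevancyMetric, evaluate) follow deepeval's docs, but treat exact signatures as assumptions to check against the version you install.

```python
# Minimal deepeval sketch (names per deepeval's quickstart; verify against
# your installed version). Requires `pip install deepeval` and an LLM backend
# configured for the metric (e.g. an OpenAI API key).
from deepeval import evaluate
from deepeval.metrics import AnswerRelevancyMetric
from deepeval.test_case import LLMTestCase

# One test case: the prompt sent to the model and the answer it produced.
test_case = LLMTestCase(
    input="What is the capital of France?",
    actual_output="Paris is the capital of France.",
)

# Score how relevant the answer is to the input; fail the case below 0.7.
metric = AnswerRelevancyMetric(threshold=0.7)

# Run the evaluation and report pass/fail per test case.
evaluate(test_cases=[test_case], metrics=[metric])
```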

EleutherAI/lm-evaluation-harness

A framework for few-shot evaluation of language models across a broad suite of standardized benchmark tasks.

11.6K stars · Active · Python · LLM Frameworks
#language-model #evaluation-framework #transformer
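
lm-evaluation-harness is usually driven from the command line, but recent releases also expose a Python entry point; the sketch below uses lm_eval.simple_evaluate with a small Hugging Face model (the model name and argument spelling are illustrative assumptions to check against your installed release).

```python
# Minimal lm-evaluation-harness sketch (Python API of recent 0.4.x releases;
# argument names can differ across versions). Requires `pip install lm-eval`.
import lm_eval

# Zero-shot evaluation of a small Hugging Face model on HellaSwag.
results = lm_eval.simple_evaluate(
    model="hf",                                    # Hugging Face backend
    model_args="pretrained=EleutherAI/pythia-160m",
    tasks=["hellaswag"],
    num_fewshot=0,
)

# Per-task metrics (accuracy, etc.) are keyed by task name.
print(results["results"]["hellaswag"])
```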

promptfoo/promptfoo

A framework for testing and evaluating large language models, prompts, and AI agents for security and performance.

10.8K stars · Active · TypeScript · LLM Frameworks
#llm-evaluation #prompt-engineering #red-teaming

Kiln-AI/Kiln

Build, Evaluate, and Optimize AI Systems

4.7K stars · Active · Python · AI Editors/Agents/Copilot
#AI #chain-of-thought #collaboration
