A Python framework for evaluating and benchmarking large language models (LLMs) and their capabilities.
A framework for few-shot evaluation of language models, suited to developers building with AI coding tools.
A framework for testing and evaluating large language models, prompts, and AI agents for security and performance.
A platform for building, evaluating, and optimizing AI systems.