A collection of PyTorch image encoders/backbones with training, evaluation, and inference scripts.
An LLM engineering platform for observability, evaluation, and prompt management.
An AI-powered app & agent framework for TypeScript.
A framework for evaluating large language models (LLMs) and an open-source registry of benchmarks.
A powerful JavaScript tool for creating datasets for fine-tuning large language models (LLMs) and retrieval-augmented generation (RAG).
A framework for testing and evaluating large language models, prompts, and AI agents for security and performance.
A REPL (Read-Eval-Print Loop) for PHP, providing a powerful interactive environment for developers.
AI observability and evaluation tooling for developers building with large language models and AI agents.
A powerful REPL (Read-Eval-Print Loop) for the Laravel PHP framework.
Modeling, training, evaluation, and inference code for OLMo, a large language model.
A feature-rich, interactive Python REPL (Read-Eval-Print Loop) that improves the developer experience.
A Python SDK for AI agent monitoring, LLM cost tracking, benchmarking, and more.
An open-source evaluation and testing library for LLM agents.
A framework for building, evaluating, and optimizing AI systems.
A Python library for running OpenAI's model evaluation scripts.
An AI observability platform for production LLM and agent systems, built with Python and Pydantic.
A Python library of reinforcement learning environments and evaluations for AI developers.
A multimodal evaluation toolkit for assessing AI models across text, image, video, and audio tasks.
A Python library for evaluating the capabilities of large language models trained on code.
A free and open-source application for reading manga and novels and watching anime across multiple platforms.