Explore Projects

Discover 41 open source projects

Active filters (1):
Search: evals

Showing 21-40 of 41 projects

datachain-ai/datachain

Comprehensive analytics, versioning, and ETL toolkit for multimodal data (video, audio, PDFs, images).

2.7K
Active
Python
Computer Vision
ETL & Pipelines
Python
#data-analytics #data-wrangling #embeddings

lmnr-ai/lmnr

Laminar is an open-source observability platform purpose-built for AI agents and workflows.

2.7K
Active
TypeScript
Agents & Orchestration
LLM Observability
TypeScript
#ai #observability #llm

uptrain-ai/uptrain

An open-source platform for evaluating and improving Generative AI applications with 20+ preconfigured checks and root cause analysis.

2.3K
Archived
Python
LLM Frameworks
Testing
Python
#llm-eval #prompt-engineering #root-cause-analysis

tatsu-lab/alpaca_eval

An automatic evaluator for instruction-following language models with human-validated, high-quality, cheap, and fast evaluation.

2.0K
Stable
Jupyter Notebook
LLM Frameworks
Evaluation
Jupyter Notebook
#deep-learning #foundation-models #large-language-models

hkust-nlp/ceval

Official repository for C-Eval, a Chinese evaluation suite for foundation models.

1.8K
Experimental
Python
LLM Frameworks
#llm #evaluation #chinese

justjake/quickjs-emscripten

A TypeScript library that allows safely executing untrusted JavaScript and using async functions synchronously.

1.6K
Experimental
TypeScript
API Frameworks
CLI Tools
React
#javascript #quickjs #wasm

MLGroupJLU/LLM-eval-survey

A survey paper on evaluating large language models (LLMs) for developers building AI-powered applications.

1.6K
Experimental
LLM Frameworks
Tutorials & Courses
#benchmark #evaluation #large-language-models

BytedanceSpeech/seed-tts-eval

A Python evaluation framework for text-to-speech models.

1.5K
Archived
Python
AI Voice & Speech
Testing
Python
#text-to-speech #speech-synthesis #model-evaluation

eqcss/eqcss

EQCSS is a CSS Reprocessor that introduces Element Queries, Scoped CSS, a Parent selector, and responsive JavaScript to all browsers IE8 and up.

1.5K
Archived
HTML
CSS Frameworks
React
#css #container-queries #element-queries

salesforce/CodeTF

A one-stop Transformer library for state-of-the-art code language models and AI-powered code understanding.

1.5K
Experimental
Python
LLM Frameworks
AI Code Generation
Python
#ai4code #transformers #code-generation

GitHamza0206/simba

An open-source customer service platform with built-in evaluations and monitoring for developers.

1.4K
Active
TypeScript
CMS & Content
Monitoring
TypeScript
#customer-service #evals #knowledge-base

mattpocock/evalite

A TypeScript library for evaluating LLM-powered apps, aimed at vibe coders building AI tools.

1.4K
Stable
TypeScript
LLM Frameworks
AI App Builders
TypeScript
#ai #llm #typescript

Maluuba/nlg-eval

Evaluation code for various unsupervised automated metrics for Natural Language Generation.

1.4K
Archived
Python
NLP
API Frameworks
Python
#nlg #evaluation #metrics

albertlatacz/java-repl

A Java REPL (read-eval-print loop) that lets developers run Java code interactively.

1.3K
Archived
Java
CLI Tools
API Frameworks
#java #repl #interactive-coding

silentmatt/expr-eval

A JavaScript library for parsing and evaluating mathematical expressions.

1.3K
Archived
JavaScript
General Utilities
Frontend Frameworks
JavaScript
#math #expressions #parsing

refreshdotdev/web-eval-agent

An autonomous web application evaluation agent powered by MCP and Playwright for vibe coders.

1.2K
Active
Python
MCP Servers
AI Code Editors
React
#debugging #qa #vibe-coding

facebookarchive/phpsh

A read-eval-print loop (REPL) for PHP that lets developers test and experiment with PHP code interactively.

1.1K
Archived
Emacs Lisp
CLI Tools
API Frameworks
#php #repl #interactive

superlinear-ai/raglite

A Python toolkit for Retrieval-Augmented Generation (RAG) with DuckDB or PostgreSQL, useful for AI coding tools.

1.1K
Active
Python
RAG & Vector
Databases
Python
#retrieval-augmented-generation #duckdb #postgresql

AI-QL/tuui

A desktop tool for orchestrating AI models across vendors using the Model Context Protocol (MCP).

1.1K
Active
TypeScript
MCP Frameworks
Agents & Orchestration
React
#ai-integration #mcp #llm-orchestration

prometheus-eval/prometheus-eval

A Python library for evaluating the responses of large language models, using Prometheus evaluator models or GPT-4 as the judge.

1.1K
Experimental
Python
LLM Frameworks
Testing
Python
#llm #gpt4 #evaluation
