Comprehensive analytics, versioning, and ETL toolkit for multimodal data (video, audio, PDFs, images).
Laminar is an open-source observability platform purpose-built for AI agents and workflows.
An open-source platform for evaluating and improving Generative AI applications with 20+ preconfigured checks and root cause analysis.
An automatic evaluator for instruction-following language models with human-validated, high-quality, cheap, and fast evaluation.
Official repository for C-Eval, a Chinese evaluation suite for foundation models.
A TypeScript library that allows safely executing untrusted JavaScript and using async functions synchronously.
A survey paper on evaluating large language models (LLMs) for developers building AI-powered applications.
A Python evaluation framework for text-to-speech models, aimed at vibe coders developing with AI tools.
EQCSS is a CSS Reprocessor that introduces Element Queries, Scoped CSS, a Parent selector, and responsive JavaScript to all browsers IE8 and up.
A one-stop Transformer library for state-of-the-art code language models and AI-powered code understanding.
An open-source customer service platform with built-in evaluations and monitoring for developers.
A TypeScript library for evaluating LLM-powered apps, aimed at vibe coders building AI tools.
Evaluation code for various unsupervised automated metrics for Natural Language Generation.
A Java REPL (Read Eval Print Loop) that allows developers to interactively run Java code.
A JavaScript library for parsing and evaluating mathematical expressions.
An autonomous web application evaluation agent powered by MCP and Playwright for vibe coders.
A read-eval-print-loop (REPL) for PHP, allowing developers to interactively test and experiment with PHP code.
A Python toolkit for Retrieval-Augmented Generation (RAG) with DuckDB or PostgreSQL, useful for AI coding tools.
A desktop tool for orchestrating AI models across vendors using the Model Context Protocol (MCP).
A Python library to evaluate the response of large language models like GPT-4 using Prometheus metrics.