MLflow is an open-source platform for building, tracking, and deploying AI/ML models with end-to-end observability and evaluation tools.
An LLM engineering platform for observability, evaluation, and prompt management.
A comprehensive library for debugging, evaluating, and monitoring LLM applications, RAG systems, and agentic workflows.
A Python framework for evaluating and benchmarking large language models (LLMs) and their capabilities.
A framework for testing and evaluating large language models, prompts, and AI agents for security and performance.
AI observability and evaluation tooling for developers building with large language models and AI agents.
The LLM vulnerability scanner, a Python-based tool for identifying security vulnerabilities in large language models.
A comprehensive benchmark for evaluating the capabilities of Chinese large language models (LLMs).
An open-source evaluation and testing library for LLM agents.
A practical guide for LLM engineers, covering fundamentals to deploying advanced LLM and RAG apps to AWS using LLMOps best practices.
An open-source LLMOps platform for prompt playground, prompt management, LLM evaluation, and LLM observability.
Laminar is an open-source observability platform purpose-built for AI agents and workflows.
Comprehensive resources for developers working with Generative AI, including projects, use cases, and interview prep.
An AI-powered security toolkit for LLM vulnerability scanning and red teaming.
Build, enrich, and transform datasets using AI models, with no code required.
A powerful tool for automated LLM fuzzing to help developers and security researchers identify and mitigate potential jailbreaks.
Prompty is a Python library that makes it easy to create, manage, debug, and evaluate LLM prompts for AI applications.
A Python package for uncertainty quantification and hallucination detection in large language models (LLMs).
An open-source post-building layer for AI agents, providing environment data and evaluations to power agent post-training and monitoring.