Showing 61-80 of 230 projects
An open-source cookbook for getting started with Phi, a family of high-performance small language models from Microsoft.
XcodeBenchmark measures the compilation time of a large codebase on different Mac devices to help developers optimize their build times.
This repository provides best practices for segmenting corporate networks for improved security.
An open-source toolbox for implementing state-of-the-art object detection models like YOLOv5, YOLOv6, YOLOv7, and RTMDet.
minitest provides a complete suite of testing facilities supporting TDD, BDD, and benchmarking for Ruby developers.
A collaborative benchmark for measuring and extrapolating the capabilities of language models.
A comprehensive benchmark to evaluate large language models (LLMs) as agents for various tasks.
MTEB is a benchmark for evaluating and comparing text embedding models across multiple tasks and languages.
A comprehensive AI Red Teaming platform for security researchers and developers.
Open-source, cross-platform software for automated testing and benchmarking.
Provides data, models, and an evaluation benchmark for large language models.
A large language model developed by Baichuan Intelligent Technology for AI-powered applications and research.
A collection of prime number projects in over 100 programming languages to compare their speed and cleverness.
A set of language benchmarks to compare performance across different programming languages.
A simple and lightweight web benchmarking tool for Linux, useful for testing website performance under load.
A unified evaluation framework for large language models, focused on prompt engineering and model robustness.
This repository provides a benchmark for evaluating the complex reasoning ability of large language models using chain-of-thought prompting.
Easy benchmarking of publicly available convolutional neural network implementations.
A benchmark for multimodal AI agents to tackle open-ended tasks in real computer environments.
An open-source GPU-accelerated robotics simulator and benchmark for manipulation skill learning.