Explore Projects

Discover 230 open source projects

Active filters (1):

Search: benchmarks

Showing 201-220 of 230 projects

Tencent/AICGSecEval

A.S.E (AICGSecEval) is a repository-level benchmark for evaluating the security of AI-generated code, developed by the Tencent Wukong Code Security Team.

1.1K

Active

Python

LLM Frameworks

API Frameworks

#benchmark #codesecurity #llm

elastic/ember

Elastic Malware Benchmark for Empowering Researchers, a Jupyter Notebook project for malware analysis.

1.1K

Archived

Jupyter Notebook

Computer Vision

API Frameworks

#malware-analysis #computer-vision #jupyter-notebook

SalesforceAIResearch/enterprise-deep-research

An open-source research platform for developing AI-powered enterprise applications using LLMs and multi-agent systems.

1.1K

Active

Python

LLM Frameworks

Agents & Orchestration

React

#llm #multi-agent-systems #enterprise-research

ashfurrow/xcode-hardware-performance

Performance benchmarks for running Xcode on various Mac hardware.

1.1K

Archived

IDE Extensions

Hardware Benchmarking

#xcode #performance #benchmark

yangxue0827/RotationDetection

A TensorFlow-based rotation detection benchmark for computer vision and AI models.

1.1K

Archived

Python

Computer Vision

TensorFlow

#computer-vision #ai-models #rotation-detection

sierra-research/tau-bench

Tau-Bench is a Python benchmark for evaluating language-model agents on tool use and user interaction.

1.1K

Stable

Python

LLM Frameworks

CLI Tools

#benchmarking #evaluation #language-models

baidu-research/DeepBench

A benchmarking tool for measuring the performance of deep learning operations on different hardware.

1.1K

Archived

C++

ML Ops

Benchmarking

#benchmarking #deep-learning #performance-testing

THUDM/LongBench

LongBench is a benchmark for evaluating large language models on long-context tasks.

1.1K

Archived

Python

LLM Frameworks

Benchmark

#benchmark #llm #long-context

THU-LYJ-Lab/T3Bench

A Python benchmark suite for evaluating text-to-3D generation models and techniques.

1.1K

Archived

Python

Computer Vision

Inference

#3d-generation #text-to-3d #diffusion

ray-project/llmperf

A library for validating and benchmarking large language models (LLMs).

1.1K

Archived

Python

LLM Frameworks

Testing

#llms #benchmarking #validation

doc-analysis/TableBank

TableBank is a benchmark dataset for table detection and recognition, useful for building computer vision models.

1.1K

Archived

Computer Vision

#computer-vision #table-detection #table-recognition

pdebench/PDEBench

An extensive benchmark for scientific machine learning, focused on physics-informed neural networks and partial differential equations.

1.1K

Experimental

Python

ML Ops

API Frameworks

PyTorch

#benchmark #partial-differential-equations #neural-networks

dotnet/crank

A benchmarking infrastructure for .NET applications to measure performance and resource utilization.

1.1K

Active

Testing

#benchmarking #performance-testing #resource-utilization

LiveBench/LiveBench

A challenging, contamination-free benchmark for evaluating large language models (LLMs).

1.1K

Active

Python

LLM Frameworks

#llm #benchmark #evaluation

PKU-Alignment/omnisafe

OmniSafe is an infrastructural framework for accelerating safe reinforcement learning research.

1.1K

Experimental

Python

Reinforcement Learning

Constraint Satisfaction Problem

PyTorch

#safe-reinforcement-learning #benchmark #constraint-rl

kimwalisch/primesieve

A high-performance C++ library for generating prime numbers, optimized for modern CPU architectures.

1.1K

Active

C++

API Frameworks

Math & Numeric

#prime-numbers #number-theory #sieve-of-eratosthenes

chengtan9907/OpenSTL

A comprehensive benchmark for spatio-temporal predictive learning, with a focus on AI-powered weather forecasting and video prediction.

1.1K

Stable

Python

Computer Vision

Predictive Learning

PyTorch

#weather-forecasting #video-prediction #self-supervised-learning

mikeroyal/Open-Source-Security-Guide

An open-source security guide covering security standards, frameworks, threat models, encryption, and benchmarks.

1.1K

Experimental

Security Research

Penetration Testing

#security #compliance #penetration-testing

open-mmlab/mmflow

Open-source optical flow toolbox and benchmark for computer vision tasks powered by PyTorch.

1.0K

Archived

Python

Computer Vision

Backend Frameworks

PyTorch

#computer-vision #optical-flow #pytorch

carlini/yet-another-applied-llm-benchmark

A benchmark to evaluate language models on various tasks, useful for vibe coders building AI-powered apps.

1.0K

Experimental

Python

LLM Frameworks

Testing

#language-models #evaluation #benchmark
