Benchmarks

Performance benchmarks and evaluations

CLUEbenchmark/CLUE

CLUE is a comprehensive Chinese language understanding evaluation benchmark with datasets, baselines, pre-trained models, and a leaderboard.

4.2K
Stable
Python
LLM Frameworks
Datasets
PyTorch
#chinese #nlp #bert

baichuan-inc/Baichuan-13B

A 13-billion-parameter large language model developed by Baichuan Intelligent Technology for applications and research.

2.9K
Archived
Python
LLM Frameworks
LLM Wrappers & SDKs
Hugging Face
#large-language-model #chinese #gpt-4

zzli2022/Awesome-System2-Reasoning-LLM

Comprehensive repository showcasing the latest advances in System-2 reasoning and LLM-based AI models.

1.3K
Experimental
Python
LLM Frameworks
Agents & Orchestration
Python
#llm #reasoning #system-2

Escheee/TBCF

A benchmark library for evaluating correlation filter-based visual tracking algorithms.

1.1K
Archived
Objective-C
Tracking
Benchmarks
#visual-tracking #correlation-filters #benchmark

chengtan9907/OpenSTL

A comprehensive benchmark for spatio-temporal predictive learning, with a focus on AI-powered weather forecasting and video prediction.

1.1K
Stable
Python
Computer Vision
Predictive Learning
PyTorch
#weather-forecasting #video-prediction #self-supervised-learning
