Benchmarks

Performance benchmarks and evaluations

CLUE is a comprehensive Chinese language understanding evaluation benchmark with datasets, baselines, pre-trained models, and a leaderboard.

4.2K

Stable

Python

LLM Frameworks

Datasets

PyTorch

#chinese #nlp #bert

A large language model developed by Baichuan Intelligent Technology for AI-powered applications and research.

2.9K

Archived

Python

LLM Frameworks

LLM Wrappers & SDKs

Hugging Face

#large-language-model #chinese #gpt-4

Comprehensive repository showcasing the latest advances in System-2 reasoning and LLM-based AI models.

1.3K

Experimental

Python

LLM Frameworks

Agents & Orchestration

Python

#llm #reasoning #system-2

A benchmark library for evaluating correlation filter-based visual tracking algorithms.

1.1K

Archived

Objective-C

Tracking

Benchmarks

#visual-tracking #correlation-filters #benchmark

A comprehensive benchmark for spatio-temporal predictive learning, with a focus on AI-powered weather forecasting and video prediction.

1.1K

Stable

Python

Computer Vision

Predictive Learning

PyTorch

#weather-forecasting #video-prediction #self-supervised-learning
