THUDM/AgentBench

A comprehensive benchmark to evaluate large language models (LLMs) as agents for various tasks.

Python
AI & Machine Learning
LLM Frameworks
Apache-2.0

Stars: 3.2K
Forks: 240
Created: Jul 28, 2023
Last Updated: Feb 8, 2026

Project Analytics

Stars Growth (1 Month): +56 (+1.8% change)
Avg Daily Growth (1 Month): +2.0 stars per day
Fork/Star Ratio (All Time): 7.5% (normal engagement)
Lifetime Growth: 3.4 stars/day over 953 days
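
The analytics figures above appear to follow from simple ratios of the listed counts. The sketch below shows one plausible way to reproduce them. It assumes the rounded "3.2K" means roughly 3,200 stars and that the monthly change is measured against the star count before the month's gain; the function names are hypothetical and are not taken from the listing site or from AgentBench.

```python
# Hypothetical helpers for reproducing the listing's derived metrics.
# Assumption: "3.2K" stars is treated as 3200, so results are approximate.

def fork_star_ratio(forks: int, stars: int) -> float:
    """Return forks as a fraction of stars."""
    if stars <= 0:
        raise ValueError("stars must be positive")
    return forks / stars


def lifetime_growth(stars: int, days: int) -> float:
    """Return the average number of stars gained per day since creation."""
    if days <= 0:
        raise ValueError("days must be positive")
    return stars / days


def monthly_change(gained: int, stars_now: int) -> float:
    """Return the month's gain as a fraction of the count before the gain."""
    baseline = stars_now - gained
    if baseline <= 0:
        raise ValueError("gain must be smaller than the current star count")
    return gained / baseline


if __name__ == "__main__":
    stars, forks, days, gained = 3200, 240, 953, 56
    print(f"Fork/star ratio: {fork_star_ratio(forks, stars):.1%}")
    print(f"Lifetime growth: {lifetime_growth(stars, days):.1f} stars/day")
    print(f"Monthly change:  {monthly_change(gained, stars):.1%}")
```

Under these assumptions the three values round to 7.5%, 3.4 stars/day, and 1.8%, matching the listing. The exact figures on the page may differ slightly because they are presumably computed from the unrounded star count.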

[Charts not reproduced: stars, forks, open issues, pull requests, and commits over time.]

AI-Generated Tags

chatgpt
gpt-4
llm
llm-agent
benchmark
evaluation
agent-based-systems
