AI Coding Benchmarks

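Standardized tests for evaluating the coding capability of AI models. Key benchmarks include SWE-bench (resolving real GitHub issues in open-source Python repositories), HumanEval (writing Python functions from docstrings, checked by unit tests), and MBPP (entry-level Python programming tasks). Scores give developers a common basis for comparing models and tools.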
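
As a rough illustration of how benchmarks of this kind are commonly scored, the sketch below runs candidate solutions against unit tests and reports the fraction that pass. The toy task, the candidate strings, and the helper names (passes, pass_rate) are hypothetical and are not taken from any benchmark's actual harness. Real harnesses also run untrusted code in a sandbox with timeouts, which is omitted here.

```python
from typing import Dict, List, Tuple

# Each test is (positional arguments, expected return value).
Test = Tuple[tuple, object]


def passes(candidate_src: str, entry_point: str, tests: List[Test]) -> bool:
    """Return True if the candidate defines `entry_point` and passes every test."""
    namespace: Dict[str, object] = {}
    try:
        # Assumption: running in-process is acceptable for a toy example.
        # Never do this with untrusted model output outside a sandbox.
        exec(candidate_src, namespace)
        func = namespace[entry_point]
        return all(func(*args) == expected for args, expected in tests)
    except Exception:
        # Syntax errors, missing functions, and runtime errors all count as failures.
        return False


def pass_rate(results: List[bool]) -> float:
    """Fraction of candidates that passed; 0.0 for an empty list."""
    return sum(results) / len(results) if results else 0.0


# Hypothetical toy task: implement add(a, b).
TESTS: List[Test] = [((1, 2), 3), ((-1, 1), 0), ((0, 0), 0)]

# Hypothetical model outputs: one correct, one wrong, one that does not parse.
CANDIDATES = [
    "def add(a, b):\n    return a + b\n",
    "def add(a, b):\n    return a - b\n",
    "def add(a, b)\n    return a + b\n",
]

results = [passes(src, "add", TESTS) for src in CANDIDATES]
score = pass_rate(results)
print(results, round(score, 3))
```

Scoring on executed tests rather than on textual similarity to a reference answer means any correct implementation gets credit, which is one reason this style of evaluation is widely used for code.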