A research corpus for benchmarking AI systems on abstract reasoning tasks.
A rigorous benchmark for evaluating the code quality and efficiency of large language models like GPT-4.
Automatically generate programs using AI and genetic algorithms, with tutorials and examples.