Explore Projects

Discover 222 open source projects

Active filters (1):
Search: evaluator

Showing 101-120 of 222 projects

Xnhyacinth/Awesome-LLM-Long-Context-Modeling

A curated collection of must-read papers and blogs on large language model-based long-context modeling.

1.9K
Active
LLM Frameworks
Tutorials & Courses
#large-language-models #long-context-modeling #llm

szilard/benchm-ml

A benchmarking tool for evaluating the performance of popular machine learning algorithms and libraries.

1.9K
Archived
R
ML Ops
Databases
R
#machine-learning #benchmark #performance-testing

uber/petastorm

Petastorm enables training and evaluation of deep learning models from Apache Parquet datasets.

1.9K
Active
Python
ML Ops
Databases
PyTorch
#deep-learning #machine-learning #data-processing

Tiiiger/bert_score

A Python library for computing BERT-based text generation evaluation metrics.

1.9K
Archived
Jupyter Notebook
Natural Language Processing
Python
#machine-learning #nlp #text-generation

snap-stanford/GraphGym

A platform for designing and evaluating Graph Neural Networks (GNNs) using Python.

1.9K
Archived
Python
ML Ops
Databases
#graph-neural-networks #machine-learning #data-science

codeplea/tinyexpr

A lightweight C library for parsing, compiling, and evaluating mathematical expressions.

1.9K
Stable
C
General Utilities
CLI Tools
#math-expressions #parser #compiler

genieincodebottle/generative-ai

Comprehensive resources for developers working with Generative AI, including projects, use cases, and interview prep.

1.9K
Active
Jupyter Notebook
LLM Frameworks
Tutorials & Courses
Jupyter Notebook
#generative-ai #llm #ai-coding

rentruewang/koila

A Python library that helps prevent CUDA out of memory errors in PyTorch models with just 1 line of code.

1.8K
Active
Python
ML Ops
API Frameworks
PyTorch
#deep-learning #memory-management #out-of-memory

xinshuoweng/AB3DMOT

An open-source Python implementation for 3D multi-object tracking, with KITTI benchmarking and new evaluation metrics.

1.8K
Archived
Python
Computer Vision
API Frameworks
#3d-object-tracking #computer-vision #robotics

hkust-nlp/ceval

Official repository for C-Eval, a Chinese evaluation suite for foundation models.

1.8K
Experimental
Python
LLM Frameworks
#llm #evaluation #chinese

microsoft/CodeXGLUE

CodeXGLUE is a comprehensive benchmark for evaluating the performance of large language models on a variety of coding-related tasks.

1.8K
Archived
C#
LLM Frameworks
Testing
#benchmark #large-language-models #code-generation

UKGovernmentBEIS/inspect_ai

A framework for evaluating large language models.

1.8K
Active
Python
LLM Frameworks
CLI Tools
Python
#large-language-model #evaluation #benchmarking

TEAMMATES/teammates

A feedback management tool for education, built with Java and Angular, used by universities and teachers.

1.8K
Active
Java
Frontend Frameworks
API Frameworks
Angular
#educators #feedback-systems #google-cloud

msoedov/agentic_security

An AI-powered security toolkit for LLM vulnerability scanning and red teaming.

1.8K
Active
Python
LLM Frameworks
Security Research
Python
#llm-security #llm-vulnerability-scanner #llm-fuzzing

ChineseGLUE/ChineseGLUE

A benchmark for evaluating language understanding models and datasets for the Chinese language.

1.8K
Archived
Python
LLM Frameworks
Datasets
Python
#nlp #benchmarking #language-understanding

cisagov/cset

A security audit tool to assess and improve cybersecurity posture.

1.8K
Active
TSQL
Security Research
CLI Tools
#security-audit #cybersecurity #assessment

pyrochlore/obsidian-tracker

A plugin that tracks occurrences and numbers in your Obsidian notes, providing charts and quantified self-tracking.

1.8K
Active
TypeScript
Charts & Visualization
IDE Extensions
Obsidian
#obsidian #quantified-self #note-taking

mlcommons/training

Reference implementations of MLPerf® training benchmarks for evaluating machine learning performance.

1.7K
Stable
Python
ML Ops
API Frameworks
Python
#benchmark #machine-learning #performance-evaluation

029danio/fly

A guide that recommends and evaluates VPN/proxy nodes from "airport" (proxy subscription) services.

1.7K
Active
HTML
Privacy Tools
Uncategorized
HTML
#vpn #proxy #airport-nodes

alexmojaki/birdseye

A graphical Python debugger that lets you easily view the values of all evaluated expressions.

1.7K
Active
JavaScript
Debugger
API Frameworks
#python-debugging #ast #visualization