Explore Projects

Discover 4 open source projects

Active filters (1):
Search: ai-safety

Showing 1-4 of 4 projects

jphall663/awesome-machine-learning-interpretability

A curated list of responsible machine learning resources for interpretable AI development.

4.0K
Active
Interpretable AI
Caching
Python
#ai-safety #explainable-ml #fairness

PKU-Alignment/safe-rlhf

A safe reinforcement learning from human feedback (RLHF) system for aligning large language models with human values.

1.6K
Stable
Python
LLM Frameworks
Reinforcement Learning
#ai-safety #large-language-models #reinforcement-learning

OpenLMLab/MOSS-RLHF

A Python library for exploring the secrets of RLHF (reinforcement learning from human feedback) in large language models.

1.4K
Archived
Python
LLM Frameworks
Agents & Orchestration
#ai-safety #alignment #rlhf

cvs-health/uqlm

A Python package for uncertainty quantification and hallucination detection in large language models (LLMs).

1.1K
Active
Python
LLM Frameworks
LLM Wrappers & SDKs
#ai-safety #confidence-estimation #hallucination-detection
