A collection of research papers and tools related to using machine learning for compiler and system optimization.
This repository contains a dataset of Chinese medical dialogues for NLP and conversational AI research.
A Python library for synthetic data curation and structured data extraction for machine learning models.
A Python library that allows developers to easily draw datasets within their notebooks.
The Pile is a large, diverse language model training dataset for use in AI research and development.
Build, enrich, and transform datasets using AI models with no code.
A collection of modular datasets generated by GPT-4 for AI code generation and prompt engineering.
A safe reinforcement learning from human feedback (RLHF) system for aligning large language models with human values.
A cleaned and curated version of the Alpaca dataset from Stanford, useful for machine learning projects.
A curated list of resources for meta-learning, including papers, code, books, and more for developers working with AI tools.
A Python library that uses LLMs and embeddings to process datasets with up to 1000x speedups.
A Python library for extracting schema, statistics, and entities from datasets, useful for data profiling and privacy analysis.
A collection of SQL queries to analyze social media datasets.
A PyTorch implementation of the BiDAF network for question-answering on the SQuAD dataset.
A highly performant and flexible React table component for displaying large datasets.
A project that integrates text-to-SQL datasets, solutions, and research papers for developers working with AI tools.
A tool for generating high-quality synthetic datasets.
CLUENER2020 is a Chinese fine-grained named entity recognition dataset and benchmark for AI-powered NLP development.
An open-source benchmarking tool for evaluating video generation models.
A large-scale dataset of raw MRI measurements and clinical MRI images for medical imaging research.