A PyTorch library for building Factorization Machine models for click-through rate prediction tasks.
A repository containing various NLP datasets collected and organized by the owner.
TableBank is a benchmark dataset for table detection and recognition, useful for building computer vision models.
A curated collection of research papers and resources for natural language processing (NLP) practitioners.
A large dataset of Internet domains that can be used for search engine development and research.
An open-source Python library that helps curate better data for large language models (LLMs).
A PyTorch implementation of the TernausNet model for image segmentation, pre-trained on the Kaggle Carvana dataset.
A PyTorch implementation of Prototypical Networks for few-shot learning, which classifies new classes from only a few labeled examples.
A Jupyter Notebook-based library for exploring and analyzing multimedia datasets at scale.
A collection of public person re-identification datasets for computer vision research.
A high-resolution network (HRNet) model for image classification trained on the ImageNet dataset.
An open-source corpus dataset for chatbots and question-answering systems in the insurance domain.
A data repository for pre-trained NLP models and corpora to use in language processing projects.
A dataset of Borg cluster traces from Google for research on distributed systems and cloud infrastructure.
Tools to download and clean up Common Crawl data, a large web crawl dataset, for further analysis and processing.
A Python library for semantic and instance segmentation of LiDAR point clouds for autonomous driving.
A Python library for generating synthetic datasets and tools for computer vision applications.
A central, open resource for data and tools related to chain-of-thought reasoning in large language models.
A GitHub language statistics tool that provides insights into programming language usage across GitHub repositories.
A multimodal dataset for emotion recognition in conversation, useful for building conversational AI and chatbots.