Explore Projects

Discover 382 open source projects

Active filters (1):
Search: dataset

Showing 141-160 of 382 projects

rasterio/rasterio

Rasterio is a Python library for reading and writing geospatial raster datasets, built on GDAL and originally developed at Mapbox.

2.5K
Active
Python
Backend Frameworks
Databases
Python
#gis#geospatial#raster

alan-ai/alan-sdk-web

The Alan AI SDK for Web provides a self-coding system for building AI-powered voice and conversational experiences in web apps.

2.4K
Active
LLM Frameworks
AI SDKs & Wrappers
React
#conversational-ai#voice-ai#low-code

lukes/ISO-3166-Countries-with-Regional-Codes

A comprehensive dataset of ISO 3166-1 country codes and their corresponding UN Geoscheme regional codes, ready to use in various formats.

2.4K
Archived
Ruby
Databases
CLI Tools
#countries#iso#region-codes

huggingface/evaluate

Evaluate is a library for easily evaluating machine learning models and datasets.

2.4K
Active
Python
React
#evaluation#machine-learning#model-evaluation

CesiumGS/3d-tiles

Specification for streaming massive heterogeneous 3D geospatial datasets across the web.

2.4K
Active
Batchfile
Frontend Frameworks
Databases
#3d-models#geospatial#gis

github/CodeSearchNet

CodeSearchNet provides datasets, tools, and benchmarks for representation learning of code, enabling AI-powered code discovery.

2.4K
Archived
Jupyter Notebook
Machine Learning on Source Code
Datasets
Jupyter Notebook
#machine-learning#nlp#data-science

FinMind/FinMind

Open-source financial data API providing over 50 datasets including stock prices, exchange rates, and more.

2.4K
Active
Jupyter Notebook
API Frameworks
Databases
#finance#financial-data#opendata

google/youtube-8m

Starter code for working with the YouTube-8M dataset, a large-scale video understanding dataset.

2.4K
Archived
Python
Datasets
Python
#youtube#dataset#video-understanding

TorchIO-project/torchio

TorchIO is a Python library for efficient medical image preprocessing and data augmentation for AI applications.

2.4K
Active
Python
Computer Vision
Databases
PyTorch
#medical-imaging#data-augmentation#computer-vision

mcordts/cityscapesScripts

A collection of scripts and utilities for the Cityscapes Dataset, a popular dataset for computer vision tasks.

2.3K
Stable
Python
Computer Vision
#computer-vision#dataset#utilities

ufoym/imbalanced-dataset-sampler

An imbalanced dataset sampler for PyTorch that oversamples low-frequency classes and undersamples high-frequency ones.

2.3K
Active
Python
ML Ops
Caching
PyTorch
#data-sampling#imbalanced-data#image-classification

google-research-datasets/Objectron

A dataset of annotated 3D object videos for training computer vision and augmented reality models.

2.3K
Archived
Jupyter Notebook
Computer Vision
Datasets
PyTorch
#3d-vision#object-detection#point-cloud

WLiK/LLM4Rec-Awesome-Papers

A curated list of awesome papers and resources for building recommender systems using large language models (LLMs).

2.2K
Experimental
LLM Frameworks
Tutorials & Courses
#recommender-system#large-language-model#llm

OpenGVLab/InternVideo

A video foundation model and dataset for multimodal understanding and video understanding tasks.

2.2K
Stable
Python
Computer Vision
Datasets
PyTorch
#video-understanding#multimodal#foundation-models

zoubohao/DenoisingDiffusionProbabilityModel-ddpm-

A simple implementation of the Denoising Diffusion Probabilistic Model (DDPM) for training a U-Net on the CIFAR-10 dataset.

2.2K
Archived
Python
ML Ops
Computer Vision
Python
#ddpm#denoising#computer-vision

Jon-Becker/prediction-market-analysis

Framework for collecting and analyzing prediction market data with comprehensive Polymarket/Kalshi datasets.

2.1K
Active
Python
ETL & Pipelines
Example Projects
Python
#prediction-markets#polymarket#kalshi

PolymathicAI/the_well

Physics simulation datasets for AI training

2.0K
Stable
Jupyter Notebook
AI Editors/Agents/Copilot
#Machine Learning#Physics Simulation#AI Datasets

minio/simdjson-go

A high-performance JSON parsing library for Golang, enabling lightning-fast processing of large JSON datasets.

2.0K
Stable
Go
API Frameworks
JSON, NDJSON
Go
#json-parsing#fast-json#high-performance

zhu-xlab/GlobalBuildingAtlas

GlobalBuildingAtlas is an open, globally complete dataset of building polygons, heights, and LoD1 3D models.

2.0K
Active
Python
Databases
Computer Vision
Python
#geospatial#buildings#3D-models

alibaba/clusterdata

Trace data collected from Alibaba's production clusters for cluster management research.

2.0K
Stable
Jupyter Notebook
Databases
CLI Tools
#dataset#cluster-data#production-workloads