Explore Projects

Discover 15 open source projects

Active filters (1):
Search: synthetic-data

Showing 1-15 of 15 projects

stefan-jansen/machine-learning-for-trading

Code for machine learning-based algorithmic trading strategies and workflows.

16.7K
Archived
Jupyter Notebook
Machine Learning Ops
#finance #investment #trading

datajuicer/data-juicer

A Python library for processing and analyzing data with foundation models and large language models.

6.0K
Active
Python
LLM Frameworks
ETL & Pipelines
#data-processing #data-analysis #foundation-models

lk-geimfari/mimesis

Mimesis is a fast Python library for generating fake data in multiple languages for testing and development purposes.

4.8K
Active
Python
Databases
Testing
#data-generation #fake-data #testing

Kiln-AI/Kiln

Build, Evaluate, and Optimize AI Systems

4.7K
Active
Python
AI Editors/Agents/Copilot
#AI #chain-of-thought #collaboration

sdv-dev/SDV

Generates synthetic tabular data for machine learning and AI applications

3.4K
Active
Python
Synthetic Data
#synthetic-data-generation #tabular-data #machine-learning

DLR-RM/BlenderProc

A Python library for generating photorealistic training images using the Blender 3D software.

3.4K
Active
Python
Computer Vision
Backend Frameworks
#computer-vision #3d-graphics #blender

pgmpy/pgmpy

Python library for Causal AI and Bayesian networks

3.2K
Active
Python
#causal-inference #bayesian-networks #probabilistic-inference

synthetichealth/synthea

Synthea is an open-source synthetic patient population simulator for generating realistic healthcare data.

3.0K
Stable
Java
Databases
API Frameworks
#healthcare #data-simulation #fhir

hitsz-ids/synthetic-data-generator

A specialized Python framework for generating high-quality structured tabular data for AI and ML applications.

2.4K
Active
Python
Synthetic Data
Databases
#data-generation #tabular-data #machine-learning

bespokelabsai/curator

A Python library for synthetic data curation and structured data extraction for machine learning models.

1.6K
Active
Python
Synthetic Data
LLM Frameworks
#machine-learning #data-generation #data-curation

huggingface/aisheets

Build, enrich, and transform datasets using AI models with no code

1.6K
Stable
TypeScript
LLM Frameworks
AI SDKs & Wrappers
#ai #llms #nocode

GreenmaskIO/greenmask

A Go-based tool for database anonymization and synthetic data generation to help with security, QA, and data masking.

1.6K
Stable
Go
Databases
Testing
#database #anonymization #synthetic-data

shuttle-hq/synth

Synth is a Rust library for generating realistic, randomized test data for applications and databases.

1.5K
Archived
Rust
Databases
Testing
#data-generation #postgres #realistic-data

plurai-ai/intellagent

A Python framework for comprehensive diagnosis and optimization of AI agents using simulated, realistic synthetic interactions.

1.2K
Stable
Python
Agents & Orchestration
Simulator
#agent-evaluation #agent-optimization #llmops

datadreamer-dev/DataDreamer

DataDreamer is a Python library for generating synthetic data and for fine-tuning and aligning large language models.

1.1K
Experimental
Python
LLM Frameworks
Fine-tuning
PyTorch
#llms #gpt #instruction-tuning
