Explore Projects

Discover 24 open source projects

Active filters (1):
Search: deduplication

Showing 21-24 of 24 projects

google-research/deduplicate-text-datasets

A Rust library for deduplicating text datasets, potentially useful for machine learning projects.

1.3K
Archived
Rust
Data & Databases
CLI Tools
#data-deduplication #text-processing #machine-learning

zinggAI/zingg

Scalable identity resolution, entity resolution, data mastering and deduplication using ML

1.2K
Active
Java
ETL & Pipelines
ML Ops
#identity-resolution #entity-resolution #data-deduplication

jborg/attic

Attic is a deduplicating backup program that can be used to securely back up data to remote or local storage.

1.1K
Archived
Python
API Frameworks
Databases
#backup #deduplication #storage

J535D165/recordlinkage

A powerful Python library for record linkage and duplicate detection in data-driven applications.

1.0K
Archived
Python
Data Matching & Deduplication
#data-matching #deduplication #entity-resolution
