Explore Projects

Discover 6 open source projects

Active filters (1):
Search: dataquality

great-expectations/great_expectations

A Python library that helps ensure data quality and reliability through data profiling and testing.

11.2K
Active
Python
ETL & Pipelines
#data-quality #data-testing #data-profiling

open-metadata/OpenMetadata

A unified metadata platform for data discovery, data observability, and data governance.

8.8K
Active
TypeScript
Data Catalog
Data Governance
#data-discovery #data-lineage #data-quality

awslabs/deequ

Deequ is a Scala library for defining "unit tests for data" to measure data quality in large datasets.

3.6K
Active
Scala
ETL & Pipelines
Testing
Spark
#data-quality #unit-testing #apache-spark

datafold/data-diff

A Python library for comparing data across databases, supporting various database engines.

3.0K
Archived
Python
Databases
ETL & Pipelines
#data-diffing #data-quality #data-engineering

re-data/re-data

A data quality and observability tool for monitoring and fixing data issues before they become problems.

1.6K
Archived
HTML
ETL & Pipelines
CLI Tools
dbt
#data-quality #data-observability #data-monitoring

zinggAI/zingg

Scalable identity resolution, entity resolution, data mastering, and deduplication using ML.

1.2K
Active
Java
ETL & Pipelines
ML Ops
#identity-resolution #entity-resolution #data-deduplication
