Showing 261-280 of 310 projects
A library for mining data from social media platforms like Twitter, Facebook, and Reddit.
A high-performance logical replication extension for PostgreSQL that enables fast, cross-version database replication.
A Python framework for building data pipelines, web crawlers, and quantitative trading applications.
A library for time series analysis on Apache Spark, enabling efficient large-scale time series processing.
A LangChain-based framework for extracting data from various sources using LLMs and APIs.
A scalable, distributed ETL framework for building data lake analytics pipelines.
A React component library for generating CSV files on the fly from data arrays or objects.
A Python library that helps visualize large time series data using the Plotly data visualization library.
LakeSail is a Rust-based computation framework that unifies batch processing, stream processing, and AI workloads.
A Python package for processing earth-observing satellite data with support for common data formats and tools.
Apache XTable is a cross-table converter for lakehouse table formats that facilitates interoperability across data processing systems and query engines.
A collection of code snippets and tutorials for data science and data analysis in Python.
Scalable identity resolution, entity resolution, data mastering, and deduplication using machine learning.
A beginner-friendly Python toolkit for financial data extraction, analysis, and automation.
A high-speed, intelligent web scraper for the Chitai-Gorod book catalog, enabling structured data collection.
Run your dbt Core or dbt Fusion projects as Apache Airflow DAGs and Task Groups with a few lines of code.
A Python library that integrates Scikit-learn into the Apache Spark distributed computing framework.
A comprehensive knowledge hub for data engineering, machine learning, and MLOps tools and practices.
A collection of open data sets and tools for data science and machine learning tasks.
An open-source search and text analytics platform for exploring large document collections with semantic search and NLP.