Explore Projects

Discover 27 open source projects

Active filters (1):
Search: parquet
Showing 1-20 of 27 projects

sinaptik-ai/pandas-ai

Conversational data analysis with LLMs using natural language queries on databases, CSVs, and data lakes.

23.3K
Stable
Python
Agents & Orchestration
RAG Frameworks
#llm #rag #data-analysis

questdb/questdb

QuestDB is a high-performance, open-source, time-series database for real-time analytics and financial applications.

16.7K
Active
Java
Databases
#time-series #real-time-analytics #financial-data

apache/arrow

Apache Arrow is a fast columnar data format and toolset for in-memory analytics and data interchange.

16.6K
Active
C++
Databases
#data-format #columnar-data #in-memory-analytics

lance-format/lance

An open-source data format for building high-performance multimodal AI applications with fast random access, vector indexing, and data versioning.

6.1K
Active
Rust
LLM Frameworks
Databases
#data-format #data-versioning #vector-index

Eventual-Inc/Daft

High-performance data engine for AI and multimodal workloads, processing images, audio, video, and structured data at scale.

5.3K
Active
Rust
ML Ops
ETL & Pipelines
#ai-engineering #data-engineering #distributed

multiprocessio/dsq

A command-line tool for running SQL queries against various data formats like JSON, CSV, Excel, and Parquet.

3.9K
Archived
Go
CLI Tools
Databases
#sql #json #csv

dathere/qsv

Blazing-fast data wrangling toolkit for AI and data engineering workflows.

3.5K
Active
Rust
ETL & Pipelines
Databases
#data-engineering #data-wrangling #etl

antonycourtney/tad

A desktop application for viewing and analyzing tabular data, with support for CSV, Parquet, and DuckDB.

3.4K
Experimental
TypeScript
Databases
Caching
#data-analysis #data-science #pivot-tables

roapi/roapi

A Rust-based library to create full-fledged APIs for slowly moving datasets without writing code.

3.4K
Stable
Rust
API Frameworks
Databases
#analytics #column-store #data-lake

apache/arrow-rs

Official Rust implementation of the Apache Arrow data format for efficient data processing and storage.

3.4K
Active
Rust
Databases
CLI Tools
#arrow #parquet #data-processing

shshemi/tabiew

A lightweight Rust-based TUI application to view and query tabular data files like CSV, TSV, and Parquet.

2.8K
Active
Rust
CLI Tools
Databases
#tui #tabular-data #csv

rilldata/rill

Rill is a tool for transforming data sets into powerful dashboards using SQL, enabling BI-as-code.

2.5K
Active
Go
Databases
ETL & Pipelines
#data-analysis #data-visualization #sql

apache/parquet-format

Apache Parquet Format, a columnar data storage format used in the Apache Hadoop ecosystem.

2.3K
Active
Thrift
Databases
#apache #parquet #columnar-storage

Mooncake-Labs/pg_mooncake

A Rust-based library that provides real-time analytics on Postgres tables, supporting features like columnstore, delta-lake, and Iceberg.

1.9K
Stable
Rust
API Frameworks
Databases
#analytics #columnstore #delta-lake

uber/petastorm

Petastorm enables training and evaluation of deep learning models from Apache Parquet datasets.

1.9K
Active
Python
ML Ops
Databases
PyTorch
#deep-learning #machine-learning #data-processing

gchq/Gaffer

A large-scale entity and relation database supporting aggregation of properties for big data applications.

1.8K
Experimental
Java
Databases
API Frameworks
#big-data #graph-database #hadoop

getml/reflect-cpp

A C++20 library for fast serialization, deserialization and validation using reflection, supporting multiple data formats.

1.8K
Active
C++
API Frameworks
ORMs & Query Builders
#serialization #deserialization #validation

tansu-io/tansu

Apache Kafka-compatible broker with support for S3, PostgreSQL, SQLite, Apache Iceberg, and Delta Lake.

1.6K
Active
Rust
API Frameworks
Databases
#apache-kafka #s3 #postgresql

paradigmxyz/cryo

cryo is a Rust library for extracting blockchain data to parquet, CSV, JSON, or Python dataframes.

1.5K
Archived
Rust
ETL & Pipelines
API Frameworks
#blockchain #ethereum #parquet

polarsignals/frostdb

A fast, embeddable column database written in Go, optimized for AI/ML workloads.

1.5K
Active
Go
Databases
ML Ops
#columnar-storage #apache-arrow #apache-parquet