Explore Projects

Discover 5 open source projects

Active filters (1):
Search: pdf-to-json

docling-project/docling

Converts documents to AI-ready formats with advanced parsing

55.0K
Active
Python
Computer Vision
CLI Tools
#document-parsing #pdf-converter #ocr

Unstructured-IO/unstructured

Unstructured is an open-source ETL solution for transforming complex documents into structured data for language models.

14.1K
Active
HTML
Document Processing
#document-processing #data-pipelines #natural-language-processing

run-llama/llama_cloud_services

A set of TypeScript-based cloud services and utilities for processing and extracting structured data from various document formats.

4.2K
Active
TypeScript
File Storage
Caching
#document-parsing #pdf-processing #structured-data

opendataloader-project/opendataloader-pdf

Fast local PDF-to-Markdown/JSON converter for RAG pipelines. No GPU needed.

1.8K
Active
Java
RAG Frameworks
RAG & Vector
#pdf-parser #rag-pipeline #markdown-conversion

NanoNets/docstrange

An intelligent document parsing tool that extracts and converts data from various document formats to structured data like Markdown, JSON, CSV, and HTML.

1.4K
Stable
Python
LLM Wrappers & SDKs
API Frameworks
#ocr #pdf-parser #document-parsing
