Explore Projects

Discover 8 open source projects

Search: pdf-parser

PaddlePaddle/PaddleOCR

PaddleOCR converts documents/images to structured data for AI apps

71.6K

Active

Python

Computer Vision

MCP Servers

PaddlePaddle

#ocr #document-parsing #ai4science

opendatalab/MinerU

Converts complex documents into LLM-ready formats for agentic workflows

55.5K

Active

Python

Agents & Orchestration

Agent Coordination

#document-analysis #pdf-extraction #llm-workflows

py-pdf/pypdf

A pure-Python library for manipulating PDF documents, including splitting, merging, cropping, and transforming pages.

9.8K

Active

Python

API Frameworks

#pdf #pdf-manipulation #pdf-parser

bytedance/Dolphin

Dolphin is a document image parsing library that uses heterogeneous anchor prompting for OCR and layout analysis.

8.9K

Stable

Python

Computer Vision

API Frameworks

#document-analysis #layout-analysis #ocr

opendataloader-project/opendataloader-pdf

Fast local PDF-to-Markdown/JSON converter for RAG pipelines. No GPU needed.

1.8K

Active

Java

RAG Frameworks

RAG & Vector

#pdf-parser #rag-pipeline #markdown-conversion

yobix-ai/extractous

Powerful, fast, and efficient unstructured data extraction library written in Rust with language bindings.

1.7K

Archived

Rust

ETL & Pipelines

#data-extraction #unstructured-data #etl

dromara/yft-design

A powerful, feature-rich online design tool built with Vue3, fabric.js, and Element Plus for creating posters, product images, and more.

1.5K

Stable

TypeScript

Component Libraries (Vue/Svelte)

Frontend Frameworks

Vue.js

#canvas-editor #online-design #online-editor

NanoNets/docstrange

An intelligent document parsing tool that extracts and converts data from various document formats to structured data like Markdown, JSON, CSV, and HTML.

1.4K

Stable

Python

LLM Wrappers & SDKs

API Frameworks

#ocr #pdf-parser #document-parsing
