Explore Projects

Discover 8 open-source projects

Active filters (1):
Search: document-parser

Showing 1-8 of 8 projects

infiniflow/ragflow

RAGFlow is an open-source RAG engine with agent capabilities for LLM context layering

74.2K
Active
Python
MCP Servers
Agents & Orchestration
Python
#ragflow #rag-engine #agent-ai

docling-project/docling

Converts documents to AI-ready formats with advanced parsing

55.0K
Active
Python
Computer Vision
CLI Tools
#document-parsing #pdf-converter #ocr

Unstructured-IO/unstructured

Unstructured is an open-source ETL solution for transforming complex documents into structured data for language models.

14.1K
Active
HTML
Document Processing
#document-processing #data-pipelines #natural-language-processing

run-llama/llama_cloud_services

A set of TypeScript-based cloud services and utilities for processing and extracting structured data from various document formats.

4.2K
Active
TypeScript
File Storage
Caching
TypeScript
#document-parsing #pdf-processing #structured-data

Filimoa/open-parse

An improved file parsing library for LLMs, with advanced table detection and layout parsing capabilities.

3.2K
Archived
Python
LLM Frameworks
CLI Tools
Python
#document-parser #document-structure #layout-parsing

deepdoctection/deepdoctection

A Python library for document AI tasks like layout analysis, table detection, and text extraction.

3.1K
Active
Python
Computer Vision
API Frameworks
PyTorch
#document-ai #document-analysis #ocr

opendataloader-project/opendataloader-pdf

Fast local PDF-to-Markdown/JSON converter for RAG pipelines. No GPU needed.

1.8K
Active
Java
RAG Frameworks
RAG & Vector
Java
#pdf-parser #rag-pipeline #markdown-conversion

NanoNets/docstrange

An intelligent document parsing tool that extracts and converts data from various document formats to structured data like Markdown, JSON, CSV, and HTML.

1.4K
Stable
Python
LLM Wrappers & SDKs
API Frameworks
Python
#ocr #pdf-parser #document-parsing
