Explore Projects

Discover 4 open-source projects

Active filters (1):
Search: text-extraction

kreuzberg-dev/kreuzberg

A polyglot document intelligence framework with a Rust core for extracting text, metadata, and structured information from various file formats.

6.6K
Active
HTML
API Clients & Testing
API Documentation
#document-intelligence #metadata-extraction #pdf-extraction

adbar/trafilatura

Gathers text and metadata from the web using crawling, scraping, and extraction techniques.

5.4K
Stable
Python
React
#web-scraping #text-extraction #metadata-gathering

miso-belica/sumy

A Python module for automatic summarization of text documents and HTML pages.

3.7K
Stable
Python
NLP
Backend Frameworks
#html-extraction #text-summarization #nlp

chrismattmann/tika-python

A Python binding to the Apache Tika REST service, enabling text extraction and parsing in Python.

1.6K
Experimental
Python
API Clients & Testing
Data Processing
#text-extraction #text-processing #data-extraction
