A polyglot document intelligence framework with a Rust core for extracting text, metadata, and structured information from various file formats.
Gathers text and metadata from the web using crawling, scraping, and extraction techniques.
A Python module for automatic summarization of text documents and HTML pages.
A Python binding to the Apache Tika REST service, enabling text extraction and parsing in Python.