Explore Projects

Discover 386 open source projects

Active filters (1):
Search: extraction

Showing 201-220 of 386 projects

alvarobartt/investpy

A Python library for extracting financial data from Investing.com, a popular investment platform.

1.8K
Archived
Python
API Clients & Testing
Databases
#financial-data #investing #investing-com

HFIProgramming/mikutap

An independent, mainland-China-friendly version of the popular web-based music application Mikutap.

1.8K
Archived
HTML
Animation & Motion
Frontend Frameworks
HTML
#music #animation #frontend

dompdf/php-font-lib

A PHP library for reading, parsing, exporting, and creating subsets of various font file formats.

1.8K
Active
PHP
Backend Frameworks
General Utilities
#font #font-files #truetype

camelot-dev/excalibur

A Python library for extracting tabular data from PDF documents, with a web interface for human-in-the-loop extraction.

1.8K
Archived
Python
Backend Frameworks
ETL & Pipelines
Flask
#pdf #table-extraction #data-processing

schemacrawler/SchemaCrawler

SchemaCrawler is a free database schema discovery and comprehension tool that supports various database management systems.

1.8K
Active
HTML
Databases
Documentation
Java
#database #schema #reverse-engineering

impira/docquery

An easy-to-use Python library for extracting information from documents using state-of-the-art AI models.

1.8K
Archived
Python
Computer Vision
API Frameworks
Python
#document-extraction #ocr #text-processing

robertknight/ocrs

A Rust library and CLI tool for extracting text from images using optical character recognition (OCR).

1.8K
Active
Rust
Computer Vision
CLI Tools
#ocr #computer-vision #machine-learning

DQinYuan/chinese_province_city_area_mapper

A Python module for extracting and mapping Chinese province, city, and district data.

1.8K
Archived
Python
Databases
Backend Frameworks
#chinese-data #geo-mapping #data-extraction

thbar/kiba

A data processing and ETL (Extract, Transform, Load) framework for Ruby developers.

1.8K
Active
Ruby
ETL & Pipelines
API Frameworks
#data #etl #ruby

TeamNewPipe/NewPipeExtractor

A Java library for extracting data from various streaming platforms like YouTube, SoundCloud, and Bandcamp.

1.8K
Active
Java
API Frameworks
Backend Frameworks
#crawler #extractor #scraper

shaoxiongji/knowledge-graphs

A comprehensive collection of research on knowledge graphs, covering various applications and techniques.

1.8K
Archived
JavaScript
Knowledge Graph
Databases
JavaScript
#knowledge-graphs #natural-language-processing #information-retrieval

ianzhao/textshot

A Python tool for extracting text from screenshots using OCR technology.

1.8K
Archived
Python
CLI Tools
Computer Vision
#ocr #screenshot #text-extraction

BishopFox/jsluice

A Go library that extracts URLs, paths, secrets, and other interesting bits from JavaScript code.

1.8K
Archived
Go
CLI Tools
Security Research
#javascript #security #cli

microsoft/Recognizers-Text

Microsoft.Recognizers.Text is a library for recognizing and resolving numbers, units, and dates/times in multiple languages.

1.8K
Active
C#
NLP
#nlp #entity-extraction #date-time-recognition

neuml/paperai

An AI-powered library for extracting and searching information from scientific and medical papers.

1.7K
Experimental
Python
LLM Wrappers & SDKs
Search
Python
#ai #nlp #search

Roshanson/TextInfoExp

An open-source project for experimenting with natural language processing on the Sogou dataset, including text classification, clustering, word embeddings, sentiment analysis, and relation extraction.

1.7K
Archived
Python
NLP Frameworks
Databases
Python
#nlp #text-processing #text-classification

LastAncientOne/Deep_Learning_Machine_Learning_Stock

A collection of code and resources for analyzing and predicting stock market behavior using deep learning and machine learning.

1.7K
Archived
Jupyter Notebook
Machine Learning
Data Science
Jupyter Notebook
#stock-analysis #stock-prediction #data-science

dbashford/textract

A Node.js module for extracting text from various file formats, including HTML, PDF, documents, and images.

1.7K
Stable
HTML
API Frameworks
CLI Tools
Node
#extract-text #extraction #file-conversion

marin-m/vmlinux-to-elf

A tool to recover a fully analyzable .ELF file from a raw Linux kernel by extracting the kernel symbol table.

1.7K
Active
Python
CLI Tools
API Frameworks
#linux #kernel #reverse-engineering

yobix-ai/extractous

A fast, efficient library for extracting unstructured data, written in Rust with language bindings.

1.7K
Archived
Rust
ETL & Pipelines
Rust
#data-extraction #unstructured-data #etl