Explore Projects

Discover 386 open source projects

Active filters (1):
Search: extract

Showing 121-140 of 386 projects

automeris-io/WebPlotDigitizer

A computer vision-assisted tool to extract numerical data from plot images for data visualization and analysis.

3.0K
Active
JavaScript
Charts & Visualization
Data Mining
JavaScript
#charts #data-extraction #computer-vision

morkt/GARbro

A visual novel resource browser and reverse-engineering tool for extracting audio, images, and other assets.

3.0K
Archived
C#
Component Libraries (C#)
API Frameworks
#visual-novel #extraction #reverse-engineering

CatchTheTornado/text-extract-api

An API for extracting, anonymizing, and parsing text from various document formats using state-of-the-art OCR and LLM models.

3.0K
Stable
Python
LLM Wrappers & SDKs
API Clients & Testing
Python
#anonymization #ocr #pdf

mit-nlp/MITIE

MITIE is a C++ library and toolset for information extraction, with support for Java and Python.

3.0K
Stable
C++
Natural Language Processing
#information-extraction #natural-language-processing #machine-learning

apache/incubator-devlake

An open-source dev data platform to ingest, analyze, and visualize data from DevOps tools for engineering insights.

2.9K
Active
Go
ETL & Pipelines
CLI Tools
Go
#devops #data-analysis #data-engineering

Threezh1/JSFinder

JSFinder is a Python tool that quickly extracts URLs and subdomains from JavaScript files on a website.

2.9K
Archived
Python
API Clients & Testing
Frontend Frameworks
#web-scraping #url-extraction #subdomain-discovery

urchade/GLiNER

A lightweight and generalist NER model for extracting entities from text, with support for prompt-tuning.

2.9K
Active
Python
LLM Frameworks
Named Entity Recognition
Python
#information-extraction #named-entity-recognition #natural-language-processing

topfunky/hpple

An Objective-C XML/HTML parsing library inspired by Hpricot, useful for web scraping and data extraction.

2.9K
Archived
Objective-C
Backend Frameworks
General Utilities
#html-parsing #xml-parsing #web-scraping

microsoft/table-transformer

A deep learning model for extracting and analyzing table structures from PDFs and images, with accompanying datasets.

2.9K
Archived
Python
Computer Vision
ETL & Pipelines
PyTorch
#table-extraction #computer-vision #document-processing

insidegui/AssetCatalogTinkerer

An app for browsing and extracting images from .car asset catalog files, useful for iOS/macOS developers.

2.8K
Stable
Swift
IDE Extensions
iOS
Swift
#asset-catalog #quicklook #quicklook-plugin

drewnoakes/metadata-extractor

A Java library for extracting metadata from various media file formats, including images, videos, and audio.

2.8K
Experimental
Java
Libraries & Utilities
Backend Frameworks
#metadata #exif #iptc

any4ai/AnyCrawl

AnyCrawl is a Node.js/TypeScript web scraper that extracts structured data from search engines and websites for use in AI/LLM applications.

2.8K
Active
TypeScript
LLM Wrappers & SDKs
Backend Frameworks
Node.js
#web-scraper #serp #data-extraction

hitchao/Jvedio

Jvedio is local video management software that scans local videos, builds a video library, and uses AI to extract video metadata.

2.7K
Archived
C#
API Frameworks
Databases
#video-management #computer-vision #ffmpeg

blmoistawinde/HarvestText

A versatile NLP toolkit for text mining and preprocessing, supporting tasks like sentiment analysis, entity extraction, and keyword summarization.

2.6K
Archived
Python
NLP
CLI Tools
Python
#nlp #text-mining #sentiment-analysis

flashbots/pm

A project that provides information about Flashbots, a platform for building and operating Ethereum MEV-extracting bots.

2.6K
Archived
API Clients & Testing
API Frameworks
Node
#ethereum #mev #bots

4sval/FModel

A C# library for exploring and extracting data from Unreal Engine game archives like Fortnite, GTA, PUBG, and more.

2.6K
Active
C#
CLI Tools
API Frameworks
dotnet
#unreal-engine #game-development #file-explorer

oxylabs/how-to-scrape-amazon-product-data

A Python-based web scraper for extracting Amazon product data like titles, ratings, prices, images, and descriptions.

2.6K
Stable
Backend Frameworks
API Clients & Testing
Python
#amazon #web-scraping #data-extraction

megastep/makeself

A self-extracting archiving tool for Unix systems, built entirely in shell script.

2.6K
Active
Shell
CLI Tools
API Frameworks
#compression #extract-archive #linux

Artikash/Textractor

A powerful text extraction tool for video games and visual novels, highly extensible for developers.

2.5K
Archived
C++
CLI Tools
API Frameworks
#reverse-engineering #games #text-extraction

dynobo/normcap

An OCR-powered screen capture tool that extracts information instead of just capturing images.

2.5K
Active
Python
CLI Tools
Computer Vision
Python
#ocr #screenshot #multiplatform