Explore Projects

Discover 386 open source projects

Active filters (1):
Search: extraction

Showing 181-200 of 386 projects

mozilla/fathom

A framework for extracting meaning from web pages, now deprecated.

2.0K
Stable
JavaScript
Component Libraries (React)
React
#authentication#streaming#real-time

jesolem/PCV

Open source Python module for computer vision tasks like object detection, feature extraction, and more.

2.0K
Archived
Python
Computer Vision
#computer-vision#image-processing#object-detection

RD17/ambar

Ambar is a self-hosted document search engine that uses OCR to extract text from PDFs and other files.

2.0K
Archived
JavaScript
Search-as-a-Service
Search
Node.js
#search#ocr#pdf

loujie0822/DeepIE

DeepIE is a Python library for information extraction using deep learning techniques.

1.9K
Archived
Python
LLM Frameworks
API Frameworks
Python
#information-extraction#deep-learning#nlp

blader/Claudeception

A Claude Code skill for autonomous skill extraction and continuous learning, helping developers build smarter AI tools.

1.9K
Active
Shell
AI Coding Agents
LLM Wrappers & SDKs
#ai#llm#autonomous

gildas-lormeau/SingleFileZ

A web extension to save complete web pages as self-extracting ZIP files for offline access.

1.9K
Stable
JavaScript
Frontend Frameworks
CLI Tools
JavaScript
#archive#webpage#zip

manisandro/gImageReader

A Gtk/Qt front-end to the Tesseract OCR (Optical Character Recognition) engine, allowing document scanning and text extraction.

1.9K
Active
C++
API Frameworks
Databases
#ocr#pdf#scanning

Momo707577045/media-source-extract

MediaSource video extraction tutorial for vibe coders

1.9K
Experimental
HTML
React
#video-extraction#media-source#javascript

pymatting/pymatting

A Python library for alpha matting, which is the process of extracting a foreground object from an image.

1.9K
Active
Python
Computer Vision
#alpha-matting#image-processing#foreground-extraction

scrapy/scrapely

A pure-python HTML screen-scraping library for developers who need to extract data from websites.

1.9K
Archived
HTML
Backend Frameworks
API Frameworks
#web-scraping#data-extraction#html-parsing

extractus/article-extractor

A Node.js library for extracting the main article content from a given URL using the Readability algorithm.

1.9K
Stable
JavaScript
API Frameworks
Backend Frameworks
Node
#article-extraction#web-scraping#readability

NanoNets/docext

An on-premises, OCR-free unstructured data extraction, markdown conversion and benchmarking toolkit.

1.9K
Stable
Python
Computer Vision
API Frameworks
Python
#document-analysis#document-data-extraction#ocr-benchmark

orlyjamie/mimikittenz

A post-exploitation PowerShell tool for extracting info from memory.

1.9K
Archived
PowerShell
PowerShell
#powershell#post-exploitation#memory-extraction

imbue-bit/AlphaGPT

This Python project uses symbolic regression to efficiently extract factors in the Chinese stock and crypto markets.

1.9K
Active
Python
Quant
API Frameworks
Python
#deep-learning#finance#math

Kav-K/GPTDiscord

A robust, all-in-one GPT interface for Discord with chatbot, image generation, moderation, and more.

1.8K
Archived
Python
LLM Wrappers & SDKs
Authentication
asyncio
#chatbot#dalle2#discord-bot

allmarkedup/purl

A JavaScript utility for parsing URLs and extracting information from them.

1.8K
Archived
JavaScript
Utilities & Libraries
API Clients & Testing
Node
#url-parsing#url-utilities#http-client

Srinivas11789/PcapXray

PcapXray is a network forensics tool that visualizes packet capture data as a network diagram, enabling device identification and important communication analysis.

1.8K
Archived
Python
Computer Forensics
Cybersecurity
#network-forensics#packet-analysis#network-diagram

INESCTEC/yake

A single-document unsupervised keyword extraction tool focused on AI and machine learning use cases.

1.8K
Stable
Jupyter Notebook
Corpus-Independent Keyword Extraction
#ai#keyword-extraction#unsupervised

MyIntervals/PHP-CSS-Parser

A PHP library for parsing and manipulating CSS files, allowing developers to extract and optimize CSS data.

1.8K
Active
PHP
Backend Frameworks
CLI Tools
#css#parser#optimization

shcherbak-ai/contextgem

A Python library for extracting data and LLM outputs from various document types with ease.

1.8K
Stable
Python
LLM Frameworks
Data Extraction
#llm#data-extraction#document-intelligence