Explore Projects

Discover 385 open source projects

Active filters (1):
Search: extract

Showing 61-80 of 385 projects

benhmoore/Knwl

A JavaScript library for parsing text to extract dates, places, times, and other information.

5.3K
Archived
JavaScript
API Clients & Testing
Date & Time
Node.js
#text-parsing #date-extraction #location-extraction

miaomiaosoft/PandaOCR

A multi-purpose OCR tool for image-to-text, translation, read-aloud, formula/table extraction, and more.

5.3K
Archived
Computer Vision
#ocr #image-processing #translation

wanasit/chrono

A natural language date parser in JavaScript that can extract dates from text.

5.2K
Active
TypeScript
API Clients & Testing
Date & Time
TypeScript
#date-parsing #natural-language #time-extraction

google/gumbo-parser

A pure C99 library for parsing HTML5 documents, useful for web scraping and content extraction projects.

5.2K
Active
HTML
Frontend Frameworks
API Frameworks
React
#html-parsing #web-scraping #content-extraction

katanaml/sparrow

Structured data extraction and instruction calling with ML, LLM and Vision LLM for AI-powered developers.

5.1K
Active
Python
LLM Frameworks
Computer Vision
Python
#computer-vision #gpt #huggingface-transformers

dlt-hub/dlt

An open-source Python library that simplifies the process of loading data into data lakes and warehouses.

5.0K
Active
Python
ETL & Pipelines
CLI Tools
Python
#data-engineering #data-loading #data-pipelines

jaypyles/Scraperr

Scraperr is a self-hosted web scraper built with TypeScript, Docker, and Kubernetes for efficient, scalable data extraction.

4.9K
Stable
TypeScript
API Frameworks
CLI Tools
TypeScript
#web-scraping #self-hosted #kubernetes

BoltzmannEntropy/interviews.ai

This repository provides interview preparation resources for AI and machine learning developers.

4.8K
Stable
Interview Preparation
Tutorials & Courses
PyTorch
#machine-learning #deep-learning #data-science

open-mmlab/mmocr

An open-source toolbox for text detection, recognition, and understanding tasks powered by PyTorch.

4.7K
Archived
Python
Computer Vision
API Frameworks
PyTorch
#ocr #text-detection #text-recognition

grobidOrg/grobid

A machine learning library for extracting metadata and text from PDFs using CRF and deep learning.

4.7K
Active
Java
Computer Vision
ML Ops
Java
#pdf-extraction #machine-learning #deep-learning

charles2gan/GDA-android-reversing-Tool

A powerful Android decompiler tool for malware analysis, vulnerability detection, and code reversing.

4.7K
Archived
Java
Security Research
API Frameworks
Java
#android-decompiler #malware-analysis #vulnerability-detection

webpack/mini-css-extract-plugin

A lightweight CSS extraction plugin for Webpack, helping developers optimize CSS loading in web applications.

4.7K
Active
JavaScript
Component Libraries (React)
Frontend Frameworks
React
#webpack-plugin #css-extraction #performance-optimization

bjesus/pipet

A versatile tool for web scraping and data extraction, designed for hackers and power users.

4.7K
Archived
Go
CLI Tools
Backend Frameworks
#web-scraping #data-extraction #cli-tool

mholt/archiver

A comprehensive Go library for working with various archive/compression formats.

4.5K
Archived
Go
CLI Tools
General Utilities
#archives #compression #decompression

blueimp/JavaScript-Load-Image

A JavaScript library for loading and processing images, with support for file metadata parsing and image manipulation.

4.5K
Archived
JavaScript
Frontend Frameworks
General Utilities
React
#image-processing #file-metadata #image-manipulation

deanmalmgren/textract

A Python library that provides a simple and unified interface for extracting text from any document format.

4.5K
Archived
HTML
ETL & Pipelines
CLI Tools
Python
#text-extraction #pdf #docx

thunlp/OpenNRE

An open-source Python library for neural relation extraction, a key task in natural language processing.

4.4K
Archived
Python
LLM Frameworks
API Frameworks
Python
#relation-extraction #natural-language-processing #information-extraction

varunshenoy/GraphGPT

A library for extracting knowledge graphs from unstructured text using the GPT-3 language model.

4.4K
Archived
JavaScript
LLM Frameworks
GraphQL
Node.js
#gpt-3 #knowledge-graph #natural-language-processing

qq547276542/Agriculture_KnowledgeGraph

An open-source knowledge graph for the agriculture domain, enabling information retrieval, named entity recognition, relation extraction, and question answering.

4.3K
Experimental
Python
LLM Frameworks
Knowledge Graphs
Python
#knowledge-graph #named-entity-recognition #relation-extraction

zjunlp/DeepKE

An open toolkit for knowledge graph extraction and construction in Python.

4.3K
Experimental
Python
LLM Frameworks
Knowledge Graphs
PyTorch
#knowledge-graph #information-extraction #natural-language-processing