Explore Projects

Discover 386 open source projects

Active filters (1):
Search: extraction

Showing 281-300 of 386 projects

mgechev/injection-js

A feature-complete, fast, and well-tested dependency injection library for JavaScript and TypeScript.

1.4K
Stable
TypeScript
API Frameworks
CLI Tools
Angular
#dependency-injection #decorators #typescript

NanoNets/docstrange

An intelligent document parsing tool that extracts and converts data from various document formats to structured data like Markdown, JSON, CSV, and HTML.

1.4K
Stable
Python
LLM Wrappers & SDKs
API Frameworks
Python
#ocr #pdf-parser #document-parsing

louismullie/treat

A natural language processing framework for Ruby, providing tools for text analysis and extraction.

1.4K
Experimental
Ruby
API Frameworks
ORMs & Query Builders
#nlp #text-analysis #ruby

dinubs/jam-api

A library for parsing web pages using CSS query selectors, useful for web scraping and data extraction.

1.4K
Archived
HTML
API Clients & Testing
Frontend Frameworks
Node
#web-scraping #data-extraction #html-parsing

PKUJohnson/OpenData

An open-source financial data extraction tool that provides easy API access to data scraped from various websites.

1.4K
Archived
Python
ETL & Pipelines
CLI Tools
Python
#web-scraping #data-extraction #financial-data

martinsbalodis/web-scraper-chrome-extension

A Chrome extension for web data extraction and web scraping tasks.

1.4K
Archived
JavaScript
Chrome Extensions
#web-scraping #data-extraction #chrome-extension

felipecsl/wombat

Lightweight Ruby web crawler/scraper with an elegant DSL to extract structured data from pages.

1.4K
Stable
Ruby
Backend Frameworks
API Frameworks
#crawler #scraper #dsl

gtoonstra/etl-with-airflow

This repository provides best practices and examples for building ETL (Extract, Transform, Load) pipelines using Apache Airflow.

1.4K
Archived
Shell
ETL & Pipelines
#etl #airflow #data-pipelines

Unpackerr/unpackerr

Automatically extracts downloads for various media centers and deletes extracted files after import.

1.4K
Active
Go
Golang
#authentication #streaming #real-time

winkjs/wink-nlp

A developer-friendly natural language processing library for building chatbots, extracting entities, and analyzing sentiment.

1.4K
Stable
JavaScript
NLP Frameworks
API Frameworks
Node.js
#natural-language-processing #sentiment-analysis #named-entity-extraction

scrapy/quotesbot

This is a sample Scrapy project for educational purposes, focused on web scraping and data extraction.

1.4K
Archived
Python
Backend Frameworks
Data Extraction & Pipelines
Python
#web-scraping #data-extraction #educational

mvdbos/php-spider

A configurable and extensible PHP web spider for crawling and extracting data from websites.

1.3K
Active
PHP
Backend Frameworks
CLI Tools
#web-scraping #crawling #data-extraction

Cherrison/CrackMinApp

A tool for reverse engineering and extracting the source code of WeChat mini-programs, made with C# and Node.js.

1.3K
Archived
JavaScript
Backend Frameworks
CLI Tools
Node.js
#reverse-engineering #mini-program #wechat

simsong/bulk_extractor

A C++ library for bulk data extraction and forensic processing.

1.3K
Active
C++
CLI Tools
API Frameworks
#data-extraction #forensics #bulk-processing

panrafal/depthy

Extracts depth map and original images from photos made with Google Camera's Lens Blur.

1.3K
Experimental
JavaScript
Prompt Engineering
Next.js
#image-processing #depth-map #google-camera

Tongjilibo/bert4torch

An elegant PyTorch implementation of popular transformer models like BERT, GPT, and more for NLP tasks.

1.3K
Active
Python
LLM Frameworks
LLM Wrappers & SDKs
PyTorch
#bert #transformers #nlp

unixzii/ibackupextractor

A Rust tool for extracting files from iOS backup archives, useful for developers working with iOS devices.

1.3K
Stable
Rust
API Frameworks
CLI Tools
#apple #backup #ios

thephpleague/color-extractor

A PHP library that extracts colors from images like a human would do.

1.3K
Archived
PHP
Backend Frameworks
General Utilities
#image-processing #color-extraction #php

scrapy/parsel

Parsel is a Python library for extracting data from XML/HTML documents using XPath or CSS selectors.

1.3K
Active
Python
Backend Frameworks
CLI Tools
#scraping #css #xpath

Arvanaghi/SessionGopher

A PowerShell tool that extracts saved session information for remote access tools like WinSCP, PuTTY, and Remote Desktop.

1.3K
Archived
PowerShell
Penetration Testing
CLI Tools
#pentesting #artifacts #registry
