Explore Projects

Discover 386 open source projects

Active filters (1):
Search: extraction

Showing 301-320 of 386 projects

petl-developers/petl

A Python library for extracting, transforming, and loading tabular data.

1.3K
Stable
Python
ETL & Pipelines
Python
#etl #tabular-data #data-pipelines

kevva/download

A Node.js library for downloading and extracting files from the web using promises and streams.

1.3K
Archived
JavaScript
HTTP Clients
CLI Tools
Node.js
#async #download #extract

x2bool/xlite

A Rust library that enables querying Excel spreadsheets using SQLite, making data extraction and analysis more efficient.

1.3K
Experimental
Rust
Databases
CLI Tools
#excel #ods #sql

tavily-ai/tavily-mcp

A production-ready MCP server with real-time search, extract, map, and crawl capabilities for vibe coders.

1.3K
Active
JavaScript
MCP Servers
Search-as-a-Service
React
#mcp #search #extract

sunra/php-simple-html-dom-parser

A PHP library for parsing HTML documents and extracting data, useful for web scraping and automation.

1.3K
Archived
HTML
Backend Frameworks
General Utilities
PHP
#html-parsing #web-scraping #data-extraction

denandz/sourcemapper

Extracts JavaScript source trees from sourcemap files; written in Go.

1.3K
Archived
Go
CLI Tools
API Frameworks
Go
#sourcemaps #javascript #code-analysis

dragnet-org/dragnet

An open-source Python library for accurately extracting the main content from web pages.

1.3K
Experimental
Python
Backend Frameworks
API Clients & Testing
Python
#web-scraping #content-extraction #html-parsing

hsiafan/apk-parser

An APK parser library for Java that allows developers to inspect and extract information from Android application packages.

1.3K
Archived
Java
API Frameworks
Android
#apk #android #parsing

cased/kit

A toolkit for building AI-powered developer tools, featuring codebase mapping, symbol extraction, and advanced code search.

1.3K
Active
Python
AI Code Editors
CLI Tools
Python
#ai-coding-tools #codebase-mapping #symbol-extraction

raznem/parsera

Lightweight Python library for scraping websites using large language models (LLMs) and the Playwright browser automation tool.

1.3K
Stable
Python
LLM Frameworks
Backend Frameworks
Python
#ai #scraping #data-extraction

summanlp/textrank

An implementation of the TextRank algorithm for extracting keywords and summarizing text in Python.

1.3K
Archived
Python
NLP
API Frameworks
Python
#natural-language-processing #text-summarization #keywords-extraction

tinyfish-io/agentql

AgentQL is a suite of tools for connecting your AI to the web, featuring a query language and Playwright integrations for web automation and data extraction.

1.3K
Active
Python
Agents & Orchestration
API Clients & Testing
Playwright
#web-automation #web-scraping #playwright-integration

styleguidist/react-docgen-typescript

A TypeScript parser for extracting React component props from TypeScript source code.

1.3K
Active
TypeScript
Component Libraries (React)
Documentation
React
#typescript #react #props

InstaPy/instagram-profilecrawl

A Python script that quickly crawls and extracts information from Instagram profiles.

1.3K
Archived
Python
Backend Frameworks
CLI Tools
#automation #crawler #instagram

mvdan/xurls

A Go library for extracting URLs from text, useful for building URL-related applications.

1.3K
Stable
Go
API Clients & Testing
Backend Frameworks
#url-extraction #text-parsing #link-detection

Borewit/music-metadata

A TypeScript library for parsing metadata from various audio and video file formats.

1.2K
Active
TypeScript
API Clients & Testing
Audio
TypeScript
#audio #metadata #id3

K0lb3/UnityPy

UnityPy is a Python module that enables extraction, unpacking, and editing of Unity assets.

1.2K
Stable
Python
CLI Tools
API Frameworks
Python
#unity #unity-asset #unity-asset-extractor

tongzx/nt5src

Appears to be a leaked source code repository for Windows XP (NT5).

1.2K
Archived
Windows
#windows #xp #source-code

Open-Source-Legal/OpenContracts

An enterprise-grade, API-first LLM workspace for unstructured document processing, with features like data extraction, redaction, and prompt engineering.

1.2K
Active
Python
LLM Frameworks
ETL & Pipelines
Python
#llm #prompt-engineering #etl

yuanxiaosc/Entity-Relation-Extraction

An open-source library for entity and relation extraction using TensorFlow and BERT, suitable for NLP tasks.

1.2K
Archived
Python
Computer Vision
API Frameworks
TensorFlow
#natural-language-processing #information-extraction #entity-recognition