Explore Projects

Discover 255 open source projects

Active filters (1):
Search: scraping

Showing 121-140 of 255 projects

botswin/BotBrowser

Advanced web scraping and bot prevention library for building privacy-focused web applications.

2.2K
Active
TypeScript
API Frameworks
Privacy Tools
Puppeteer
#web-scraping #bot-detection #anti-captcha

AnotiaWang/deep-research-web-ui

An AI-powered research assistant that performs deep research on any topic by combining search, scraping, and language models.

2.2K
Stable
TypeScript
LLM Frameworks
Search-as-a-Service
Nuxt
#ai #research #search

brightdata/brightdata-mcp

A powerful MCP server that provides an all-in-one solution for public web access and data extraction.

2.2K
Active
JavaScript
MCP Servers
Backend Frameworks
Node.js
#mcp #web-scraping #data-extraction

sarperavci/CloudflareBypassForScraping

A Python script that bypasses Cloudflare's verification to enable web scraping.

2.1K
Active
Python
Backend & APIs
CLI Tools
#bypass #cloudflare #webscraping

php-embed/Embed

A PHP library that allows developers to easily extract metadata from any web service or page.

2.1K
Stable
PHP
API Clients & Testing
Backend Frameworks
#oembed #opengraph #scraping

simplecrawler/simplecrawler

Flexible event-driven web crawler for Node.js, useful for building custom web scraping solutions.

2.1K
Archived
JavaScript
Backend Frameworks
CLI Tools
Node.js
#web-scraping #crawling #http-client

minimaxir/facebook-page-post-scraper

A Python scraper for extracting data from Facebook Page posts for statistical analysis.

2.1K
Archived
Python
API Clients & Testing
Backend Frameworks
Python
#facebook #scraper #data-analysis

adryfish/fingerprint-chromium

An open-source Chromium-based browser that provides fingerprinting protection and privacy features for web scraping and anti-detection use cases.

2.1K
Stable
Privacy Tools
Frontend Frameworks
Chromium
#anti-bot #anti-detection #privacy

ReaJason/xhs

A Python-based library for scraping data from the Xiaohongshu (Little Red Book) e-commerce platform.

2.0K
Experimental
Python
Backend Frameworks
Web Crawlers
#web-scraping #e-commerce #data-extraction

apify/fingerprint-suite

Browser fingerprinting tools for anonymizing scrapers.

2.0K
Active
TypeScript
Playwright
#fingerprinting #anonymizing #scrapers

jimmc414/onefilellm

A tool that makes it easy to scrape and ingest content from various sources like GitHub, arXiv, and YouTube for use with large language models.

1.9K
Stable
Python
LLM Frameworks
CLI Tools
Python
#llm #text-extraction #data-ingestion

scrapy/scrapely

A pure-Python HTML screen-scraping library for developers who need to extract data from websites.

1.9K
Archived
HTML
Backend Frameworks
API Frameworks
#web-scraping #data-extraction #html-parsing

trevorhobenshield/twitter-api-client

Python library for interacting with Twitter's APIs, including v1, v2, and GraphQL.

1.9K
Archived
Python
API Clients & Testing
API Frameworks
#twitter #api #scrape

A9T9/RPA

Open-source RPA software with computer vision, OCR, and integration with Anthropic's AI language model.

1.9K
Experimental
JavaScript
AI Coding Agents
MCP Frameworks
Selenium
#browser-automation #computer-vision #ocr

NateScarlet/holiday-cn

A Python tool for automatically scraping data on China's statutory holidays from government announcements.

1.8K
Active
Python
Web Scraping & Crawling
API Frameworks
Python
#china #holidays #data-scraping

xianhu/PSpider

A simple and easy-to-use Python web scraping framework with support for multi-threading and proxies.

1.8K
Archived
Python
Backend & APIs
CLI Tools
Python
#crawler #web-scraper #multi-threading

larymak/Python-project-Scripts

A collection of beginner-friendly Python scripts and projects for developers interested in web scraping, data processing, and more.

1.8K
Experimental
Jupyter Notebook
Backend Frameworks
CLI Tools
Python
#python-script #web-scraping #data-processing

Jieyab89/OSINT-Cheat-sheet

A comprehensive OSINT cheat sheet and repository of tools, datasets, and resources for security researchers and hackers.

1.8K
Active
HTML
OSINT
Cheatsheets
#osint #cybersecurity #hacking

iiab/iiab

An offline-first community library platform built on a Raspberry Pi for international development and education.

1.8K
Active
Jinja
API Frameworks
Full-Stack Frameworks
Jinja
#civic-tech #community-networks #education

alex/nyt-2020-election-scraper

A tool for scraping 2020 election data from the New York Times website.

1.8K
Archived
HTML
Frontend Frameworks
API Frameworks
HTML
#web-scraping #election-data #new-york-times