Explore Projects

Discover 255 open source projects

Active filters (1):
Search: scraping

Showing 21-40 of 255 projects

FlareSolverr/FlareSolverr

A proxy server that can bypass Cloudflare protection for web scraping and other use cases.

12.8K
Active
Python
API Clients & Testing
Python
#proxy #cloudflare #scraping

seleniumbase/SeleniumBase

Python APIs for web automation, testing, and bypassing bot-detection with ease.

12.4K
Active
Python
Frontend Frameworks
#web-automation #test-automation #bot-detection

ultrafunkamsterdam/undetected-chromedriver

A customizable Selenium Chromedriver that bypasses bot mitigation systems like Distil, Imperva, and Cloudflare.

12.4K
Experimental
Python
Backend & APIs
Python
#anti-bot #anti-detection #automation

code4craft/webmagic

A scalable web crawler framework for Java developers to build custom web scrapers and data extraction tools.

11.7K
Stable
Java
API Frameworks
#crawler #scraping #framework

jhy/jsoup

A Java HTML parser library for editing, cleaning, scraping, and ensuring XSS safety of HTML content.

11.3K
Active
Java
Backend Frameworks
Java
#html-parser #css-selectors #dom-manipulation

Jack-Cherish/PythonPark

A comprehensive Python learning resource covering AI, machine learning, web scraping, and more.

11.3K
Archived
Python
Tutorials & Courses
Python
#python #machine-learning #deep-learning

injetlee/Python

Python scripts for web scraping, automating common tasks like login, Excel manipulation, and WeChat operations.

10.5K
Archived
Python
API Frameworks
#web-scraping #automation #excel

yusufkaraaslan/Skill_Seekers

Automatically convert documentation, GitHub repos, and PDFs into Claude AI skills with conflict detection.

10.2K
Active
Python
AI Code Generation
MCP Servers
Python
#ai-tools #automation #claude-ai

wangshub/Douyin-Bot

A Python bot for scraping and interacting with Douyin (the Chinese counterpart of TikTok), primarily for finding attractive users.

9.6K
Archived
Python
Backend Frameworks
Python
#web-scraping #social-media #automation

scrapinghub/portia

Portia is a visual scraping tool for Scrapy, a popular Python web scraping framework.

9.5K
Archived
Python
Backend Frameworks
Python
#web-scraping #data-extraction #data-pipeline

dataabc/weiboSpider

Crawls and scrapes Weibo data using Python.

9.5K
Stable
Python
React
#weibo #python #scraping

clips/pattern

A powerful Python library for web mining, natural language processing, and data visualization.

8.9K
Archived
Python
Natural Language Processing
Machine Learning
Python
#web-mining #sentiment-analysis #network-analysis

ericchiang/pup

A command-line tool for parsing HTML, useful for web scraping and data extraction tasks.

8.4K
Archived
HTML
Backend Frameworks
CLI Tools
Node.js
#web-scraping #data-extraction #html-parsing

mherrmann/helium

Helium is a lightweight Python library for web automation and scraping, built on top of Selenium.

8.2K
Active
Python
Frontend Frameworks
API Frameworks
Python
#web-automation #web-scraping #selenium

kangvcar/InfoSpider

INFO-SPIDER is an open-source web scraping toolkit that helps users retrieve data from various sources like email, e-commerce, and social platforms.

8.2K
Active
Python
Backend Frameworks
ETL & Pipelines
Python
#web-scraping #data-extraction #open-source

apify/crawlee-python

Crawlee is a powerful web scraping and browser automation library for Python to build reliable crawlers.

8.2K
Active
Python
API Clients & Testing
Backend Frameworks
Playwright
#web-scraping #crawling #automation

lorien/awesome-web-scraping

A comprehensive list of libraries, tools, and APIs for web scraping and data processing.

7.8K
Active
Makefile
Backend Frameworks
ETL & Pipelines
#web-scraping #crawling #data-processing

lining0806/PythonSpiderNotes

A Python-based web scraping library that provides code examples and techniques for various web scraping tasks.

7.4K
Archived
Python
Backend Frameworks
CLI Tools
Python
#web-scraping #python #tutorials

tabulapdf/tabula

Tabula is a tool for extracting data from PDF files, allowing developers to easily parse and extract tables.

7.3K
Experimental
CSS
API Frameworks
ETL & Pipelines
#pdf #scraping #data-extraction

alirezamika/autoscraper

A powerful, lightweight web scraping library for Python that can automate data extraction from websites.

7.1K
Experimental
Python
Backend & APIs
CLI Tools
Python
#web-scraping #automation #data-extraction