Explore Projects

Discover 101 open source projects

Showing 61-80 of 101 projects

yhangf/PythonCrawler

A collection of Python web crawling projects for developers interested in building web scrapers and spiders.

1.8K
Experimental
Python
Backend Frameworks
CLI Tools
Python
#web-scraping #web-crawling #python3

howie6879/ruia

An async Python 3.6+ web scraping micro-framework based on asyncio for building high-performance crawlers and spiders.

1.7K
Archived
Python
Backend Frameworks
CLI Tools
Python
#web-scraping #asynchronous #asyncio

DanMcInerney/xsscrapy

An open-source web crawler and spider tool for detecting cross-site scripting (XSS) vulnerabilities.

1.7K
Archived
Python
Security Research
CLI Tools
Python
#web-crawler #xss-detection #penetration-testing

librauee/Reptile

A comprehensive Python web scraping library covering a wide range of popular websites and platforms.

1.7K
Archived
Python
Backend Frameworks
CLI Tools
#web-scraping #python3 #requests

TheWebScrapingClub/webscraping-from-0-to-hero

A comprehensive resource for learning web scraping with Python, covering tools like Playwright, Scrapy, and Splash.

1.7K
Archived
Backend Frameworks
CLI Tools
Python
#web-scraping #python #playwright

ShunCai/QZoneExport

A Chrome extension to back up and export data from QQ Zone, including posts, albums, and more.

1.7K
Archived
JavaScript
Chrome
Backup
Chrome
#backup #export #qqzone

Python3Spiders/WeiboSuperSpider

A powerful Python-based web crawler and toolkit for scraping Weibo data, including user profiles, comments, images, and more.

1.7K
Archived
Python
Backend Frameworks
Databases
Python
#weibo #web-scraping #data-extraction

chriskite/anemone

Anemone is a Ruby web-spider framework for crawling and extracting data from websites.

1.6K
Archived
Ruby
Backend Frameworks
CLI Tools
Ruby
#web-crawler #data-extraction #web-scraping

srx-2000/spider_collection

A collection of Python web scraping scripts for various websites and platforms, including music, video, and real estate data.

1.6K
Archived
Python
Backend Frameworks
ETL & Pipelines
#web-scraping #data-extraction #python-scripts

ArchiveTeam/grab-site

A web crawler tool that outputs WARC files and provides a dashboard for managing crawls.

1.6K
Experimental
Python
CLI Tools
Backend Frameworks
Python
#archiving #crawling #web-scraping

keenwon/antcolony

A Node.js-based web scraper and crawler for finding and downloading torrent files.

1.5K
Archived
JavaScript
Backend Frameworks
API Frameworks
Node
#web-scraping #torrent #bittorrent

OreosLab/checkinpanel

A check-in panel for developers using AI tools, supporting various platforms and environments.

1.4K
Archived
Perl
React
#authentication #streaming #real-time

darbra/sperm

A curated collection of reverse engineering articles.

1.4K
Stable
Crawling & Scraping
Reverse Engineering
#crawl #crawler #frida

monperrus/crawler-user-agents

A Go library that provides a database of syntactic patterns of HTTP user-agents used by bots/crawlers/scrapers.

1.4K
Active
Go
CLI Tools
API Frameworks
#user-agent #crawler #bot

xisuo67/XHS-Spider

A web scraping and data collection tool for the Chinese social media platform Xiaohongshu (Little Red Book).

1.4K
Stable
Backend Frameworks
API Frameworks
C#
#crawler #downloader #scraper

mvdbos/php-spider

A configurable and extensible PHP web spider for crawling and extracting data from websites.

1.3K
Active
PHP
Backend Frameworks
CLI Tools
#web-scraping #crawling #data-extraction

kgspider/crawler

A JavaScript web scraper repository focused on reverse engineering and advanced crawling techniques.

1.3K
Experimental
JavaScript
Backend Frameworks
CLI Tools
Node.js
#crawler #web-scraper #reverse-engineering

alongubkin/spider

A now-inactive JavaScript library for creating simple, unsurprising web crawling and scraping tools.

1.3K
Archived
JavaScript
Backend Frameworks
CLI Tools
Node
#web-crawling #scraping #automation

erma0/douyin

A TypeScript-based web scraper for collecting public data from Douyin, the Chinese counterpart of TikTok.

1.3K
Active
TypeScript
Backend Frameworks
Caching
TypeScript
#web-scraper #douyin #tiktok

okfn-brasil/querido-diario

An open-source project that provides access to Brazilian government gazettes, enabling civic tech and data science applications.

1.3K
Stable
Python
Backend Frameworks
Databases
Python
#open-data #scraping #govtech
