Explore Projects

Discover 96 open source projects

Active filters (1):
Search: crawling

Showing 81-96 of 96 projects

fugary/calibre-douban

A Python plugin for the Calibre ebook manager that crawls the Douban platform to obtain book metadata.

1.3K
Stable
Python
API Frameworks
Backend Frameworks
Python
#ebook-management #book-metadata #web-scraping

tavily-ai/tavily-mcp

A production-ready MCP server with real-time search, extract, map, and crawl capabilities for vibe coders.

1.3K
Active
JavaScript
MCP Servers
Search-as-a-Service
React
#mcp #search #extract

rebrowser/rebrowser-patches

Collection of patches to avoid automation detection and captchas for web scraping and crawling tools.

1.3K
Experimental
JavaScript
Backend & APIs
CLI Tools
Puppeteer
#automation #web-scraping #headless

InstaPy/instagram-profilecrawl

A Python script that quickly crawls and extracts information from Instagram profiles.

1.3K
Archived
Python
Backend Frameworks
CLI Tools
#automation #crawler #instagram

0x676e67/rnet

An ergonomic Python HTTP client with TLS fingerprinting capabilities for web scraping and crawling.

1.2K
Active
Rust
API Clients & Testing
Backend Frameworks
Rust
#http #https #tls

istresearch/scrapy-cluster

A distributed, on-demand web scraping solution using Scrapy, Redis, and Kafka for high-performance crawling.

1.2K
Archived
Python
API Frameworks
Caching
Scrapy
#distributed-computing #web-scraping #high-performance

hengliyin/cdfang-spider

A spider to crawl and analyze housing data from the Chengdu Real Estate Association website.

1.2K
Stable
TypeScript
API Frameworks
ORMs & Query Builders
React
#web-scraping #real-estate #data-analysis

JustinBeckwith/linkinator

A TypeScript-based tool for finding and fixing broken links in websites, documentation, and local files.

1.2K
Active
TypeScript
Backend & APIs
Testing
Node.js
#broken-links #link-checker #seo

tongcheng-security-team/NextScan

NextScan is a comprehensive enterprise-level black-box vulnerability scanning system that integrates vulnerability scanning, management, asset scanning, and crawling services.

1.2K
Archived
JavaScript
Security Research
Backend Frameworks
Next.js
#vulnerability-scanning #penetration-testing #enterprise-security

jayus0821/swagger-hack

A Python tool that automates the crawling and testing of all Swagger API endpoints.

1.2K
Stable
Python
API Clients & Testing
CLI Tools
#swagger #api-testing #automation

needleworm/bhban_rpa

Python examples and code for automating office tasks such as Excel work, design, and web scraping.

1.1K
Active
Python
CLI Tools
Tutorials & Courses
#automation #rpa #excel

s045pd/DarkNet_ChineseTrading

A Python script for monitoring and crawling the Chinese darknet.

1.1K
Archived
Python
Security Research
API Frameworks
#darknet #crawler #monitoring

ChrisRx/dungeonfs

A FUSE filesystem and dungeon crawling adventure game engine built in Go.

1.1K
Stable
Go
API Frameworks
CLI Tools
Go
#fuse #filesystem #game

elixir-crawly/crawly

Crawly is a high-level web crawling and scraping framework for Elixir, enabling developers to extract data from websites efficiently.

1.1K
Experimental
Elixir
Backend Frameworks
Caching
#crawler #crawling #scraper

facebookresearch/cc_net

Tools to download and clean up Common Crawl data, a large web crawl dataset, for further analysis and processing.

1.0K
Archived
Python
ETL & Pipelines
CLI Tools
Python
#data-processing #web-crawling #data-cleanup

m-sec-org/EZ

EZ is a cross-platform vulnerability scanner that combines information gathering, port scanning, service brute-forcing, URL crawling, and fingerprinting.

1.0K
Archived
Security Research
CLI Tools
#vulnerability-scanning #penetration-testing #information-gathering
