Explore Projects

Discover 96 open source projects

Active filters (1):
Search: crawling
Showing 41-60 of 96 projects

striver-ing/wechat-spider

A WeChat spider that crawls articles, reading counts, likes, and comments from public accounts.

2.8K
Archived
Python
Express
#authentication#streaming#real-time

any4ai/AnyCrawl

AnyCrawl is a Node.js/TypeScript web scraper that extracts structured data from search engines and websites for use in AI/LLM applications.

2.8K
Active
TypeScript
LLM Wrappers & SDKs
Backend Frameworks
Node.js
#web-scraper#serp#data-extraction

xnl-h4ck3r/waymore

A Python-based tool that aggregates data from various sources to help with open-source intelligence and security research.

2.6K
Active
Python
API Frameworks
Security Research
#security#open-source-intelligence#web-scraping

oxylabs/oxylabs-ai-studio-py

AI-powered web scraping and data gathering SDK for building intelligent agents and LLM apps

2.5K
Stable
Python
LLM Frameworks
AI SDKs & Wrappers
Python
#ai-crawler#ai-scraper#web-scraping

transitive-bullshit/awesome-puppeteer

A curated list of awesome resources for the Puppeteer headless Chrome/Chromium automation library.

2.5K
Archived
CLI Tools
Backend Frameworks
#automation#headless-chrome#scraping

lorien/grab

A powerful web scraping framework for Python that supports asynchronous crawling and flexible data extraction.

2.5K
Stable
Python
Backend Frameworks
CLI Tools
Python
#web-scraping#crawling#asynchronous

dw-dengwei/daily-arXiv-ai-enhanced

Daily arXiv paper crawler with AI summaries & GitHub Pages visualization for research discovery.

2.4K
Active
JavaScript
LLM Wrappers & SDKs
Resource Collections
GitHub Pages
#arxiv-crawler#ai-summarization#research-papers

Barabama/FreeNodes

A Python script that crawls and automatically updates free proxy nodes for v2ray, clash, and other protocols.

2.3K
Active
Python
API Frameworks
Backend Frameworks
#proxy#v2ray#clash

spider-rs/spider

A powerful web scraping and crawling library for Rust developers

2.3K
Active
Rust
API Frameworks
CLI Tools
#automation#crawler#headless-chrome

brightdata/brightdata-mcp

A powerful MCP server that provides an all-in-one solution for public web access and data extraction.

2.2K
Active
JavaScript
MCP Servers
Backend Frameworks
Node.js
#mcp#web-scraping#data-extraction

ReaJason/xhs

A Python-based library for scraping data from the Xiaohongshu (Little Red Book) e-commerce platform.

2.0K
Experimental
Python
Backend Frameworks
Web Crawlers
#web-scraping#e-commerce#data-extraction

coleam00/mcp-crawl4ai-rag

A Python library for web crawling and RAG capabilities for AI agents and AI coding assistants.

2.0K
Experimental
Python
RAG & Vector
Agents & Orchestration
#web-crawling#rag#ai-coding-assistant

shuaidaoya/FreeNodes

A tool that automatically crawls and shares free proxy nodes for various proxy software like v2ray and clash.

2.0K
Active
Networking
General Utilities
#proxy#networking#automation

NateScarlet/holiday-cn

A Python tool for automatically scraping data on China's statutory holidays from government announcements.

1.8K
Active
Python
Web Scraping & Crawling
API Frameworks
Python
#china#holidays#data-scraping

hu17889/go_spider

A flexible and modular Go-based web crawler framework with a concurrent architecture.

1.8K
Archived
Go
API Frameworks
CLI Tools
#crawler#concurrent#pipeline

watercrawl/WaterCrawl

A versatile TypeScript-based tool that transforms web content into LLM-ready data for AI/ML applications.

1.8K
Active
TypeScript
LLM Wrappers & SDKs
Frontend Frameworks
TypeScript
#aicrawler#crawl4ai#crawler

yhangf/PythonCrawler

A collection of Python web crawling projects for developers interested in building web scrapers and spiders.

1.8K
Experimental
Python
Backend Frameworks
CLI Tools
Python
#web-scraping#web-crawling#python3

coder-hxl/x-crawl

Flexible and AI-assisted Node.js crawler library for building web scrapers and crawlers.

1.8K
Active
TypeScript
LLM Frameworks
API Frameworks
Node.js
#ai-crawl#chromium#crawler

howie6879/ruia

An async Python 3.6+ web scraping micro-framework based on asyncio for building high-performance crawlers and spiders.

1.7K
Archived
Python
Backend Frameworks
CLI Tools
Python
#web-scraping#asynchronous#asyncio

thecodrr/fdir

Fast directory crawler and globbing library for NodeJS

1.7K
Stable
TypeScript
React
#directory-crawler#globbing#fast-nodejs