Explore Projects

Discover 255 open source projects

Active filters (1):
Search: scrape

Showing 1-20 of 255 projects

firecrawl/firecrawl

Convert websites into LLM-ready data with API for scraping, crawling, and structured data extraction

88.5K
Active
TypeScript
Web Scraping AI
Agents & Orchestration
TypeScript
#ai-scraping #web-crawler #llm-data

scrapy/scrapy

Scrapy is a fast, high-level web crawling and scraping framework for Python developers.

60.6K
Active
Python
Testing
Python
#web-scraping #crawling #python

soimort/you-get

Command-line media downloader for the Web

56.8K
Experimental
Python
CLI Tools
#cli #downloader #media

Mintplex-Labs/anything-llm

All-in-one AI app for local and remote LLM usage with RAG, agents, and MCP compatibility

55.6K
Active
JavaScript
MCP Servers
Agent Coordination
JavaScript
#ai-agents #local-llm #rag

NanmiCoder/MediaCrawler

Multi-platform social media crawler for content and comment scraping

44.9K
Active
Python
Testing
Web Scraping AI
Python
#web-scraping #social-media #data-mining

dgtlmoon/changedetection.io

Website change detection and monitoring tool with notifications

30.5K
Active
Python
Testing
Monitoring
#change-detection #notifications #web-monitoring

feder-cr/Jobs_Applier_AI_Agent_AIHawk

AIHawk automates job applications using AI for tailored submissions.

29.4K
Stable
Python
Agents & Orchestration
Browser Agents
Python
#ai-agent #job-automation #chrome-automation

gocolly/colly

Go Colly - Elegant Scraper and Crawler Framework for Golang

25.1K
Active
Go
CLI Tools
Backend Frameworks
Go
#golang #scraper #crawler

D4Vinci/Scrapling

Powerful, flexible Python library for effortless web scraping with AI-powered features.

23.6K
Active
Python
Web Scraping
Backend Frameworks
Python
#web-scraping #automation #data-extraction

jhao104/proxy_pool

Python proxy pool for web scraping with Redis storage

23.2K
Stable
Python
Testing
Backend Frameworks
Python
#proxy #web-scraping #redis

ScrapeGraphAI/Scrapegraph-ai

AI-powered web scraping library for extracting data from websites and documents

22.9K
Active
Python
Web Scraping AI
RAG & Vector
Python
#ai-scraping #llm #rag

apify/crawlee

Web scraping and browser automation library for Node.js

22.0K
Active
TypeScript
Browser Automation SDKs
Testing
Node.js
#web-scraping #browser-automation #nodejs

soxoj/maigret

A comprehensive OSINT tool for collecting information on individuals from various online sources.

19.1K
Active
Python
OSINT
#osint #cybersecurity #reconnaissance

dzhng/deep-research

An AI-powered research assistant that performs iterative, deep research on any topic by combining search engines, web scraping, and large language models.

18.5K
Stable
TypeScript
Agents & Orchestration
TypeScript
#ai #research #web-scraping

binux/pyspider

A powerful web crawler system in Python for building custom web scraping solutions.

17.0K
Archived
Python
Backend Frameworks
Python
#web-crawler #web-scraping #python

Evil0ctal/Douyin_TikTok_Download_API

A high-performance async web scraping tool for extracting data from Douyin, TikTok, Bilibili and more.

16.5K
Stable
Python
API Frameworks
FastAPI
#api #async #scraper

twintproject/twint

An advanced Twitter scraping & OSINT tool written in Python that doesn't use Twitter's API

16.3K
Archived
Python
API Clients & Testing
#twitter #scraping #osint

Kr1s77/awesome-python-login-model

A Python script for simulating logins on various websites and scraping data with Selenium.

16.2K
Archived
Python
Selenium
#authentication #streaming #real-time

getmaxun/maxun

Turn websites into clean data pipelines & structured APIs in minutes with a low-code web scraping tool.

15.2K
Active
TypeScript
API Clients & Testing
React
#web-scraping #automation #no-code

psf/requests-html

A Pythonic HTML parsing library that simplifies web scraping and interaction with HTTP resources.

13.9K
Archived
Python
Backend Frameworks
#web-scraping #http-client #html-parsing