Explore Projects

Discover 53 open-source projects

Active filters (1):
Search: web-scraping

Showing 1-20 of 53 projects

firecrawl/firecrawl

Convert websites into LLM-ready data with API for scraping, crawling, and structured data extraction

88.5K
Active
TypeScript
Web Scraping AI
Agents & Orchestration
TypeScript
#ai-scraping #web-crawler #llm-data

scrapy/scrapy

Scrapy is a fast, high-level web crawling and scraping framework for Python developers.

60.6K
Active
Python
Testing
Python
#web-scraping #crawling #python

Mintplex-Labs/anything-llm

All-in-one AI app for local and remote LLM usage with RAG, agents, and MCP compatibility

55.6K
Active
JavaScript
MCP Servers
Agent Coordination
JavaScript
#ai-agents #local-llm #rag

dgtlmoon/changedetection.io

Website change detection and monitoring tool with notifications

30.5K
Active
Python
Testing
Monitoring
#change-detection #notifications #web-monitoring

D4Vinci/Scrapling

Powerful, flexible Python library for effortless web scraping with AI-powered features.

23.6K
Active
Python
Web Scraping
Backend Frameworks
Python
#web-scraping #automation #data-extraction

ScrapeGraphAI/Scrapegraph-ai

AI-powered web scraping library for extracting data from websites and documents

22.9K
Active
Python
Web Scraping AI
RAG & Vector
Python
#ai-scraping #llm #rag

apify/crawlee

Web scraping and browser automation library for Node.js

22.0K
Active
TypeScript
Browser Automation SDKs
Testing
Node.js
#web-scraping #browser-automation #nodejs

Evil0ctal/Douyin_TikTok_Download_API

A high-performance async web scraping tool for extracting data from Douyin, TikTok, Bilibili and more.

16.5K
Stable
Python
API Frameworks
FastAPI
#api #async #scraper

getmaxun/maxun

Turn websites into clean data pipelines & structured APIs in minutes with a low-code web scraping tool.

15.2K
Active
TypeScript
API Clients & Testing
React
#web-scraping #automation #no-code

seleniumbase/SeleniumBase

Python APIs for web automation, testing, and bypassing bot-detection with ease.

12.4K
Active
Python
Frontend Frameworks
#web-automation #test-automation #bot-detection

yusufkaraaslan/Skill_Seekers

Automatically convert documentation, GitHub repos, and PDFs into Claude AI skills with conflict detection.

10.2K
Active
Python
AI Code Generation
MCP Servers
Python
#ai-tools #automation #claude-ai

mherrmann/helium

Helium is a lightweight Python library for web automation and scraping, built on top of Selenium.

8.2K
Active
Python
Frontend Frameworks
API Frameworks
Python
#web-automation #web-scraping #selenium

apify/crawlee-python

Crawlee is a powerful web scraping and browser automation library for Python to build reliable crawlers.

8.2K
Active
Python
API Clients & Testing
Backend Frameworks
Playwright
#web-scraping #crawling #automation

lorien/awesome-web-scraping

A comprehensive list of libraries, tools, and APIs for web scraping and data processing.

7.8K
Active
Makefile
Backend Frameworks
ETL & Pipelines
#web-scraping #crawling #data-processing

alirezamika/autoscraper

A powerful, lightweight web scraping library for Python that can automate data extraction from websites.

7.1K
Experimental
Python
Backend & APIs
CLI Tools
Python
#web-scraping #automation #data-extraction

go-rod/rod

A Go library for automating and scraping websites using the Chrome DevTools Protocol.

6.8K
Stable
Go
Backend Frameworks
Testing
#automation #web-scraping #chrome-devtools

autoscrape-labs/pydoll

Pydoll is a Python library for automating Chromium-based browsers without a WebDriver, offering realistic interactions.

6.6K
Active
Python
Frontend Frameworks
API Frameworks
#automation #browser-automation #scraping

firecrawl/firecrawl-mcp-server

Firecrawl MCP Server adds powerful web scraping and search capabilities to LLM clients such as Cursor and Claude.

5.7K
Active
JavaScript
MCP Servers
LLM Wrappers & SDKs
JavaScript
#web-scraping #search-api #llm-integration

adbar/trafilatura

Gathers text and metadata from the web using crawling, scraping, and extraction techniques.

5.4K
Stable
Python
React
#web-scraping #text-extraction #metadata-gathering

lexiforest/curl_cffi

A Python library that can impersonate browser fingerprints for web scraping and HTTP requests.

5.1K
Active
Python
Backend & APIs
CLI Tools
Python
#web-scraping #http-client #fingerprinting
