Explore Projects

Discover 170 open source projects

Active filters (1):
Search: crawlers

Showing 61-80 of 170 projects

Boris-code/feapder

A powerful Python-based web scraping framework with features like browser rendering and data deduplication.

3.6K
Stable
Python
Backend Frameworks
CLI Tools
Python
#crawler #scraper #web-scraping

ScottSloan/Bili23-Downloader

A cross-platform Bilibili video downloader with multi-thread acceleration and resumable downloads.

3.6K
Stable
Python
Backend Frameworks
API Frameworks
Python
#bilibili #video-downloader #cross-platform

elliotgao2/toapi

A Python library that makes it easy to create APIs for any website, simplifying web scraping and API development.

3.6K
Archived
Python
API Clients & Testing
Backend Frameworks
#web-scraping #api-development #web-crawling

Gerapy/Gerapy

A distributed crawler management framework based on Scrapy, Scrapyd, Django, and Vue.js for web scraping.

3.5K
Archived
Python
API Frameworks
Backend Frameworks
Django
#scrapy #scrapyd #django

google/robotstxt

Google's C++ library for parsing and matching robots.txt files, useful for web crawlers and bots.

3.5K
Archived
C++
API Frameworks
Backend Frameworks
#robots.txt #parsing #matching

ngc660sec/NGCBot

A feature-rich WeChat bot with AI-powered capabilities for developers, including security news, vulnerability lookup, and more.

3.3K
Experimental
Agents & Orchestration
Authentication
#bot #security #wechat

jasonxtn/Argus

A comprehensive toolkit for information gathering and reconnaissance, including OSINT, web crawling, and more.

3.3K
Stable
Python
CLI Tools
Security Research
Python
#osint #information-gathering #reconnaissance

internetarchive/heritrix3

Heritrix is an open-source, extensible web crawler for archiving websites at scale.

3.2K
Active
Java
Backend Frameworks
ETL & Pipelines
#web-crawling #warc #java

apache/nutch

Apache Nutch is an extensible and scalable web crawler for building search engines and data mining applications.

3.1K
Active
Java
API Frameworks
Backend Frameworks
#apache #crawling #hadoop

CrawlScript/WebCollector

An open-source web crawler framework written in Java that makes it easy to build multi-threaded web crawlers.

3.1K
Stable
Java
API Frameworks
CLI Tools
#web-crawler #multi-threaded #open-source

jaeles-project/gospider

A fast web spider written in Go, aimed at bug bounty work.

2.9K
Archived
Go
AI Coding Assistants
#gospider #crawler #Go

JAVClub/core

A web crawler for adult video streaming and torrent sites; not a developer tool.

2.9K
Archived
JavaScript
Uncategorized
#adult-content #web-scraping #torrent

CharlesPikachu/DecryptLogin

Comprehensive Python library for logging in to various websites using the requests library.

2.9K
Archived
Python
API Clients & Testing
Backend Frameworks
#web-scraping #authentication #login

NikolaiT/GoogleScraper

A Python module for scraping several search engines, with asynchronous networking support.

2.8K
Archived
HTML
Python
#search-engine-scraping #asynchronous-networking #python-module

brianway/webporter

A Java web crawler application built using the webmagic framework, focused on indexing content from Zhihu.

2.8K
Archived
Java
Backend Frameworks
Search
#web-crawler #elasticsearch #zhihu

geziyor/geziyor

Geziyor is a fast web crawling and scraping framework for Go that supports JavaScript rendering.

2.8K
Experimental
Go
API Frameworks
CLI Tools
#crawler #scraper #web-scraping

facundoolano/google-play-scraper

A Node.js library for scraping data from the Google Play Store.

2.8K
Stable
JavaScript
API Clients & Testing
Backend Frameworks
Node.js
#google-play #scraper #data-extraction

any4ai/AnyCrawl

AnyCrawl is a Node.js/TypeScript web scraper that extracts structured data from search engines and websites for use in AI/LLM applications.

2.8K
Active
TypeScript
LLM Wrappers & SDKs
Backend Frameworks
Node.js
#web-scraper #serp #data-extraction

jae-jae/QueryList

A progressive PHP crawler framework that allows developers to build elegant web scrapers and crawlers.

2.7K
Experimental
PHP
Backend Frameworks
ETL & Pipelines
#crawler #scraper #spider

loadchange/amemv-crawler

A Python library for easily downloading videos from Douyin (the Chinese counterpart of TikTok), useful for content creators and researchers.

2.6K
Archived
Python
API Clients & Testing
Backend Frameworks
Python
#tiktok #video-download #content-scraping