Explore Projects

Discover 96 open source projects

Search: crawling

Showing 61-80 of 96 projects

Python3Spiders/WeiboSuperSpider

A powerful Python-based web crawler and toolkit for scraping Weibo data, including user profiles, comments, images, and more.

1.7K
Archived
Python
Backend Frameworks
Databases
Python
#weibo #web-scraping #data-extraction

chriskite/anemone

Anemone is a Ruby web-spider framework for crawling and extracting data from websites.

1.6K
Archived
Ruby
Backend Frameworks
CLI Tools
Ruby
#web-crawler #data-extraction #web-scraping

rushter/selectolax

A fast HTML5 parser with CSS selectors for Python, useful for web scraping and crawling tasks.

1.6K
Active
Cython
Frontend Frameworks
API Frameworks
Python
#web-scraping #html-parsing #css-selectors

ArchiveTeam/grab-site

A web crawler tool that outputs WARC files and provides a dashboard for managing crawls.

1.6K
Experimental
Python
CLI Tools
Backend Frameworks
Python
#archiving #crawling #web-scraping

github/lightcrawler

A JavaScript tool that crawls a website and runs it through Google Lighthouse for performance, accessibility, and best practices analysis.

1.5K
Archived
JavaScript
Frontend Frameworks
CLI Tools
Node.js
#web-crawler #web-performance #accessibility-testing

s0md3v/uro

A Python library that declutters URL lists for web crawling and penetration testing tasks.

1.5K
Experimental
Python
Penetration Testing
CLI Tools
Python
#web-crawling #penetration-testing #url-manipulation

Leon406/SubCrawler

A Kotlin-based open-source tool that automatically crawls and tests public network nodes for use in privacy-focused applications.

1.5K
Active
Kotlin
Monitoring
Backend Frameworks
#shadowsocks #ssr #trojan

roach-php/core

A comprehensive web scraping toolkit for PHP developers, with capabilities for crawling, parsing, and extracting data from websites.

1.5K
Stable
PHP
Backend Frameworks
API Frameworks
#crawling #web-scraping #php

sethblack/python-seo-analyzer

A Python-based SEO analysis tool that crawls websites, counts words, and checks for technical SEO issues.

1.4K
Active
Python
Backend Frameworks
CLI Tools
Python
#seo #web-crawler #technical-seo

openwpm/OpenWPM

OpenWPM is a web privacy measurement framework written in Python that can be used to crawl and analyze websites.

1.4K
Experimental
Python
Backend & APIs
CLI Tools
Python
#crawler #firefox #privacy

zhuweiyou/weixin-game-helper

A collection of helper tools for various WeChat mini-games, with features like AI-powered game bots and data crawling.

1.4K
Archived
JavaScript
API Frameworks
Backend Frameworks
Node.js
#wechat #mini-games #game-bots

chiphuyen/sotawhat

A Python tool that helps developers stay up to date with the latest AI research by summarizing arXiv paper abstracts.

1.4K
Archived
Python
LLM Wrappers & SDKs
CLI Tools
Python
#arxiv #research-tool #summarization

darbra/sperm

A collection of noteworthy reverse engineering articles.

1.4K
Stable
Crawling & Scraping
Reverse Engineering
#crawl #crawler #frida

lorey/mlscraper

A Python library for scraping data from websites using machine learning and HTML examples.

1.4K
Archived
Python
Backend Frameworks
ETL & Pipelines
#web-scraping #data-extraction #machine-learning

mvdbos/php-spider

A configurable and extensible PHP web spider for crawling and extracting data from websites.

1.3K
Active
PHP
Backend Frameworks
CLI Tools
#web-scraping #crawling #data-extraction

kgspider/crawler

A JavaScript web scraper repository focused on reverse engineering and advanced crawling techniques.

1.3K
Experimental
JavaScript
Backend Frameworks
CLI Tools
Node.js
#crawler #web-scraper #reverse-engineering

alongubkin/spider

A now-inactive JavaScript library for creating simple, unsurprising web crawling and scraping tools.

1.3K
Archived
JavaScript
Backend Frameworks
CLI Tools
Node.js
#web-crawling #scraping #automation

scrapinghub/frontera

A crawl frontier framework for web crawlers, focused on high-performance, scalable crawling.

1.3K
Experimental
Python
API Frameworks
Backend Frameworks
Python
#web-crawling #distributed-systems #high-performance

imgbot/Imgbot

An Azure Function to crawl GitHub repos and losslessly compress images, reducing file size while maintaining quality.

1.3K
Archived
C#
File Storage
Build Tools
Azure
#image-optimization #github-integration #azure-functions

Darwin-lfl/langmanus

A community-driven AI automation framework that combines language models with specialized tools for tasks like web search, crawling, and Python code execution.

1.3K
Experimental
LLM Frameworks
API Frameworks
#ai-automation #web-crawling #python-execution
