Explore Projects

Discover 352 open source projects

Active filters (1):

Search: pdf

Showing 61-80 of 352 projects

windingwind/zotero-pdf-translate

A TypeScript plugin for Zotero that allows translation of PDF, EPUB, webpages, metadata, annotations, and notes to the target language.

10.4K

Active

TypeScript

MCP Frameworks

React

#pdf #translation #zotero

yusufkaraaslan/Skill_Seekers

Automatically convert documentation, GitHub repos, and PDFs into Claude AI skills with conflict detection.

10.2K

Active

Python

AI Code Generation

MCP Servers

Python

#ai-tools #automation #claude-ai

iamgio/quarkdown

Quarkdown is a powerful Markdown-based tool for creating documents, presentations, websites, and knowledge bases.

10.2K

Active

Kotlin

Static Site Generators

Kotlin

#markdown #markup-language #paper

Kareadita/Kavita

A fast, feature-rich, cross-platform reading server for managing and sharing your digital reading collection.

10.0K

Active

Backend Frameworks

#comics #manga #epub

py-pdf/pypdf

A pure-Python library for manipulating PDF documents, including splitting, merging, cropping, and transforming pages.

9.8K

Active

Python

API Frameworks

#pdf #pdf-manipulation #pdf-parser

jsvine/pdfplumber

A Python library that provides a powerful API for extracting text and tables from PDF files.

9.8K

Active

Python

API Frameworks

Python

#pdf #pdf-parsing #table-extraction

tpn/pdfs

A comprehensive collection of technically-oriented PDF resources for developers, including papers, specs, and manuals.

9.5K

Stable

HTML

Books & Guides

#technical-papers #developer-resources #documentation

opendatalab/PDF-Extract-Kit

A comprehensive toolkit for high-quality PDF content extraction, focused on developer needs.

9.4K

Archived

Python

API Frameworks

Caching

Python

#pdf #extraction #parsing

hacksalot/HackMyResume

Generates polished resumes and CVs in various formats for developers.

9.4K

Archived

JavaScript

Documentation

JavaScript

#resume-generator #cv-builder #html-export

pymupdf/PyMuPDF

A high-performance Python library for data extraction, analysis, conversion and manipulation of PDF and other documents.

9.2K

Active

Python

Document Processing

#pdf #data-extraction #text-processing

bytedance/Dolphin

Dolphin is a document image parsing library that uses heterogeneous anchor prompting for OCR and layout analysis.

8.9K

Stable

Python

Computer Vision

API Frameworks

Python

#document-analysis #layout-analysis #ocr

iib0011/omni-tools

Self-hosted collection of powerful web-based tools for everyday tasks for developers.

8.8K

Stable

TypeScript

Web Development

React

#developer-tools #productivity #self-hosted

ahrm/sioyek

Sioyek is a powerful PDF viewer focused on textbooks and research papers, with a range of features for developers.

8.8K

Active

Frontend Frameworks

CLI Tools

React

#pdf #pdf-viewer #research-paper

Kozea/WeasyPrint

WeasyPrint is a powerful Python library for converting HTML and CSS to PDF, ideal for developers building document-centric applications.

8.7K

Active

Python

Backend Frameworks

API Frameworks

Python

#pdf #css #html

pdfcpu/pdfcpu

A high-performance PDF processor written in Go for tasks like parsing, manipulating, and converting PDF files.

8.5K

Active

API Frameworks

Backend Frameworks

#pdf #pdf-processing #pdf-library

Hopding/pdf-lib

A TypeScript library for creating and modifying PDF documents in any JavaScript environment.

8.3K

Archived

TypeScript

API Frameworks

Frontend Frameworks

TypeScript

#pdf #document-manipulation #typescript

apify/crawlee-python

Crawlee is a powerful web scraping and browser automation library for Python to build reliable crawlers.

8.2K

Active

Python

API Clients & Testing

Backend Frameworks

Playwright

#web-scraping #crawling #automation

mfts/papermark

Papermark is an open-source DocSend alternative with built-in analytics and custom domains.

8.1K

Active

TypeScript

Component Libraries (React)

Authentication

Next.js

#dataroom #pdf #next-auth

PHPOffice/PHPWord

A pure PHP library for reading and writing word processing documents like DOCX, ODT, and PDF.

7.5K

Experimental

PHP

Backend Frameworks

API Frameworks

#doc #docx #html

tabulapdf/tabula

Tabula is a tool for extracting data from PDF files, allowing developers to easily parse and extract tables.

7.3K

Experimental

CSS

API Frameworks

ETL & Pipelines

#pdf #scraping #data-extraction

