pdf-mcp
MCP server that lets Claude Code and other AI agents read large PDFs without hitting context limits. Chunked reading, hybrid search, OCR, table and image extraction, SQLite cache.
pdf-mcp is a Model Context Protocol (MCP) server that enables AI agents to read, search, and extract content from PDF files. It uses PyMuPDF for PDF parsing and SQLite for persistent caching, and it supports hybrid search that combines BM25 keyword matching with semantic embeddings, OCR for scanned documents, and structured extraction of tables and images.
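To make the "hybrid search" idea concrete, here is a minimal sketch of how a BM25 keyword ranking and an embedding-based ranking can be merged. It is an illustration only, not pdf-mcp's actual code: the function names are hypothetical, the three-dimensional vectors are hand-made stand-ins for real embeddings, and reciprocal rank fusion is an assumed merging strategy (the description above does not say how pdf-mcp combines the two signals).

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document against the query with a basic Okapi BM25."""
    if not docs:
        return []
    tokenized = [d.lower().split() for d in docs]
    n = len(tokenized)
    avg_len = sum(len(toks) for toks in tokenized) / n
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for toks in tokenized for term in set(toks))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank(scores):
    """Return document indices ordered from best to worst score."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


def reciprocal_rank_fusion(rankings, k=60):
    """Merge several rankings; documents ranked high in any list score well."""
    fused = Counter()
    for ranking in rankings:
        for position, doc_id in enumerate(ranking):
            fused[doc_id] += 1.0 / (k + position + 1)
    return sorted(fused, key=lambda d: fused[d], reverse=True)


def hybrid_search(query, query_vec, chunks, chunk_vecs):
    """Rank chunks by fusing a keyword ranking with a semantic ranking."""
    keyword_rank = rank(bm25_scores(query, chunks))
    semantic_rank = rank([cosine(query_vec, v) for v in chunk_vecs])
    return reciprocal_rank_fusion([keyword_rank, semantic_rank])


# Toy data: in practice the chunks would come from parsed PDF pages and the
# vectors from an embedding model. Both are made up here for illustration.
chunks = [
    "sqlite cache stores parsed pages",
    "ocr handles scanned documents",
    "tables and images are extracted from each page",
]
chunk_vecs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
query = "scanned page ocr"
query_vec = [0.1, 0.9, 0.2]

for idx in hybrid_search(query, query_vec, chunks, chunk_vecs):
    print(idx, chunks[idx])
```

Rank fusion is used in this sketch because BM25 scores and cosine similarities live on different scales, so combining positions avoids having to normalize the raw scores against each other.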
Comments
- Jamie Harris, May 5, 2026
Good for research workflows where Claude needs to process many large documents efficiently
- Quinn Kim, Apr 29, 2026
Reading large PDFs without hitting context limits is a practical problem well solved here
- Sage Garcia, Apr 23, 2026
The chunking approach handles technical papers and long documents reliably
- Spencer Zhang, Apr 14, 2026
Used for automated literature review workflows; PDF parsing accuracy is high