Docsray MCP

Advanced Document Perception for Claude - Extract Everything from Any Document

# Install
pip install docsray-mcp

# Run with uvx (recommended for MCP clients)
uvx docsray-mcp

# Configure in Claude Desktop or Cursor, then use:
"Xray document.pdf with provider llama-parse"
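Claude Desktop typically registers MCP servers under the `mcpServers` key of its `claude_desktop_config.json` file. The sketch below is an illustration, not taken from the Docsray documentation: it merges an entry that launches the server with `uvx docsray-mcp` into an existing configuration without touching other servers. The entry name `docsray` is an arbitrary label chosen here, and any API key or environment settings the LlamaParse provider may require are omitted.

```python
import json


def add_docsray_server(config: dict) -> dict:
    """Return a copy of an MCP client config with a Docsray entry added."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    # "docsray" is a label chosen for this sketch; the command and argument
    # mirror the `uvx docsray-mcp` invocation from the install snippet.
    servers["docsray"] = {"command": "uvx", "args": ["docsray-mcp"]}
    merged["mcpServers"] = servers
    return merged


# Example: an existing config that already registers another (hypothetical) server.
existing = {"mcpServers": {"other-server": {"command": "other", "args": []}}}
print(json.dumps(add_docsray_server(existing), indent=2))
```

The function copies the input instead of mutating it, so the original configuration can be kept as a backup before the merged result is written to disk.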
🎯

Comprehensive Extraction

Extract everything from a document with a single command: text, tables, images, entities, layouts, and metadata, all through AI-powered analysis.

🔄

Multi-Provider Support

Choose between LlamaParse for deep AI analysis and PyMuPDF for fast extraction. Auto-selection picks the provider that best fits your request.

⚡

Intelligent Caching

All extractions are cached locally, so a document is processed once and every later request is served from the cache. Cached results are invalidated when the document changes.
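One common way to get "process once, invalidate on change" behavior is to key each cached result by a hash of the document's bytes. The sketch below shows that technique under stated assumptions; it is not Docsray's actual cache. The function names, the JSON-on-disk layout, and the `fake_extract` stand-in are all hypothetical.

```python
import hashlib
import json
import tempfile
from pathlib import Path
from typing import Callable


def content_key(doc: Path, provider: str) -> str:
    """Key an extraction by the document's bytes and the provider name."""
    digest = hashlib.sha256(doc.read_bytes()).hexdigest()
    return f"{digest}-{provider}"


def cached_extract(doc: Path, provider: str, cache_dir: Path,
                   extract: Callable[[Path], dict]) -> dict:
    """Return a cached result when the document is unchanged, else recompute."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    entry = cache_dir / f"{content_key(doc, provider)}.json"
    if entry.exists():
        return json.loads(entry.read_text(encoding="utf-8"))
    result = extract(doc)  # the slow step, run only on a cache miss
    entry.write_text(json.dumps(result), encoding="utf-8")
    return result


def demo() -> int:
    """Run three lookups and return how many times extraction actually ran."""
    calls = 0

    def fake_extract(doc: Path) -> dict:  # stand-in for a real provider call
        nonlocal calls
        calls += 1
        return {"text": doc.read_text(encoding="utf-8")}

    with tempfile.TemporaryDirectory() as tmp:
        doc = Path(tmp) / "note.txt"
        cache = Path(tmp) / "cache"
        doc.write_text("v1", encoding="utf-8")
        cached_extract(doc, "pymupdf", cache, fake_extract)  # miss
        cached_extract(doc, "pymupdf", cache, fake_extract)  # hit
        doc.write_text("v2", encoding="utf-8")               # document changes
        cached_extract(doc, "pymupdf", cache, fake_extract)  # miss again
    return calls


print(demo())  # 2: one extraction per distinct document content
```

Because the key depends on content and not on the file path or modification time, editing the document changes the key and forces a fresh extraction, while an untouched file is served from the cache.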

🤖

MCP Native

Built specifically for Claude and the MCP ecosystem. Five tools that work with natural-language prompts.

✅

Production Ready

52+ passing tests, comprehensive error handling, timeout protection, and testing against real-world documents.

📄

Universal Format Support

PDF, DOCX, PPTX, XLSX, HTML, Markdown, and more. Works with local files and URLs. Handles everything from invoices to research papers.

Choose Your Provider

| Provider | Speed | Capabilities | Best For |
| --- | --- | --- | --- |
| LlamaParse 🧠 | 5-30s | AI analysis, entities, tables, images, layouts, custom instructions | Comprehensive extraction, deep analysis |
| PyMuPDF ⚡ | <1s | Text, basic markdown, fast extraction | Quick text retrieval, simple documents |
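The trade-off in the provider comparison can be sketched as a simple rule: use the fast provider when it covers everything requested, and fall back to LlamaParse otherwise. This is a hypothetical illustration of how such a choice could work, not Docsray's actual auto-selection logic. The capability names, the `pymupdf` identifier, and the assumption that LlamaParse also covers plain text and markdown are all made up for this sketch.

```python
# Capability sets paraphrased from the provider comparison; the identifier
# "pymupdf" and the capability names are assumptions made for this sketch.
PYMUPDF_CAPABILITIES = frozenset({"text", "markdown"})
LLAMAPARSE_CAPABILITIES = frozenset(
    {"text", "markdown", "entities", "tables", "images", "layouts",
     "custom_instructions"}
)


def choose_provider(needs: set[str], llamaparse_available: bool = True) -> str:
    """Pick the fast provider when it covers every requested capability."""
    unknown = needs - LLAMAPARSE_CAPABILITIES
    if unknown:
        raise ValueError(f"unsupported capabilities: {sorted(unknown)}")
    if needs <= PYMUPDF_CAPABILITIES:
        return "pymupdf"  # sub-second path for plain text needs
    if not llamaparse_available:
        raise RuntimeError("request needs LlamaParse, which is not available")
    return "llama-parse"


print(choose_provider({"text"}))                        # pymupdf
print(choose_provider({"text", "tables", "entities"}))  # llama-parse
```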

Maximum Data Extraction

Get EVERYTHING with one prompt in Claude:

# Ask Claude to analyze your document:
"Xray document.pdf with provider llama-parse and extract:
    1) Complete text content preserving exact formatting
    2) All tables with complete data and structure
    3) All images with descriptions and metadata
    4) Complete document metadata
    5) Full document structure with all sections
    6) All form fields and values
    7) All hyperlinks and cross-references
    8) All mathematical equations
    9) Page-by-page layout information
    10) All recognized entities (people, organizations, dates, amounts)"

# Returns EVERYTHING in result['full_extraction']

Five Powerful Tools

🔍 Peek

Quick overview, metadata, available formats

🗺️ Map

Complete document structure and hierarchy

🩻 Xray

Deep AI analysis, entities, comprehensive extraction

📝 Extract

Get content in markdown, JSON, or text

🎯 Seek

Navigate to specific pages or sections