Tools API Reference

Complete API reference for all Docsray MCP tools with parameters, responses, and examples.

docsray_peek

Get quick document overview and metadata.

Parameters

Parameter	Type	Required	Default	Description
`document_url`	string	Yes	-	Path or URL to document
`depth`	string	No	"structure"	Analysis depth: "metadata", "structure", "preview"
`provider`	string	No	"auto"	Provider: "auto", "pymupdf4llm", "llama-parse"

Response Schema

{
  "metadata": {
    "title": "string",
    "author": "string", 
    "subject": "string",
    "creator": "string",
    "creation_date": "string (ISO 8601)",
    "modification_date": "string (ISO 8601)",
    "page_count": "integer",
    "format": "string",
    "file_size": "integer (bytes)",
    "has_images": "boolean",
    "has_tables": "boolean",
    "is_encrypted": "boolean",
    "language": "string"
  },
  "structure": {  // Only if depth is "structure" or "preview"
    "outline": [
      {
        "title": "string",
        "page": "integer",
        "level": "integer"
      }
    ],
    "sections": [
      {
        "type": "string",
        "content": "string", 
        "page": "integer"
      }
    ],
    "page_info": [
      {
        "page": "integer",
        "type": "string",
        "elements": "integer"
      }
    ]
  },
  "preview": {  // Only if depth = "preview"
    "first_page": "string",
    "sample_content": "string",
    "key_sections": ["string"],
    "preview_length": "integer",
    "total_length": "integer"
  },
  "provider": "string"
}

Examples

# Basic metadata
result = docsray.peek("document.pdf", depth="metadata")

# Structure analysis
result = docsray.peek("document.pdf", depth="structure")

# Full preview with specific provider
result = docsray.peek("document.pdf", depth="preview", provider="llama-parse")
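
The schema comments imply that the depth levels nest: "structure" adds to "metadata", and "preview" adds to both. As a minimal sketch of that reading, the helper below lists which top-level response keys to expect for a given depth. The name `expected_peek_keys` is hypothetical and not part of Docsray, and the ordering is inferred from the schema comments rather than confirmed behavior.

```python
# Depth levels in increasing order of detail, as implied by the schema comments.
DEPTH_ORDER = ["metadata", "structure", "preview"]


def expected_peek_keys(depth: str = "structure") -> list:
    """Return the top-level response keys expected for a given peek depth."""
    if depth not in DEPTH_ORDER:
        raise ValueError(f"unknown depth: {depth!r}")
    keys = ["metadata"]  # always present
    if DEPTH_ORDER.index(depth) >= DEPTH_ORDER.index("structure"):
        keys.append("structure")
    if depth == "preview":
        keys.append("preview")
    keys.append("provider")  # always present
    return keys
```

A caller could use this to decide whether `result["structure"]` is safe to read before touching it.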

docsray_map

Generate comprehensive document structure map.

Parameters

Parameter	Type	Required	Default	Description
`document_url`	string	Yes	-	Path or URL to document
`include_content`	boolean	No	false	Include content snippets
`analysis_depth`	string	No	"deep"	Depth: "basic", "deep", "comprehensive"
`provider`	string	No	"auto"	Provider: "auto", "pymupdf4llm", "llama-parse"

Response Schema

{
  "structure": {
    "outline": [
      {
        "title": "string",
        "level": "integer",
        "page": "integer",
        "section_id": "string",
        "children": ["object"] // Nested outline items
      }
    ],
    "page_map": [
      {
        "page": "integer",
        "sections": ["string"], // Section IDs
        "content_types": ["string"],
        "element_count": "integer"
      }
    ],
    "sections": [  // Only if include_content = true
      {
        "id": "string",
        "title": "string", 
        "page_start": "integer",
        "page_end": "integer",
        "content_preview": "string",
        "content_length": "integer",
        "subsections": ["string"]
      }
    ],
    "navigation": {
      "total_sections": "integer",
      "max_depth": "integer",
      "cross_references": [
        {
          "from": "string",
          "to": "string",
          "type": "string"
        }
      ]
    },
    "content_distribution": {  // Only if analysis_depth = "comprehensive"
      "text_pages": "integer",
      "table_pages": "integer", 
      "image_pages": "integer",
      "mixed_pages": "integer"
    }
  },
  "metadata": {
    "total_pages": "integer",
    "processing_time": "number",
    "analysis_depth": "string"
  },
  "provider": "string"
}

Examples

# Basic structure map
result = docsray.map("document.pdf")

# Comprehensive analysis with content
result = docsray.map("document.pdf", 
                    include_content=True,
                    analysis_depth="comprehensive")
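
Because `structure.outline` nests items through `children`, consumers usually need a recursive walk. The sketch below flattens an outline into an indented table of contents. It assumes `children` holds objects of the same shape as their parent; `walk_outline` and the sample data are illustrative, not Docsray output.

```python
from typing import Iterator, List, Tuple


def walk_outline(items: List[dict], depth: int = 0) -> Iterator[Tuple[int, dict]]:
    """Yield (depth, item) pairs for every outline item, parents before children."""
    for item in items:
        yield depth, item
        yield from walk_outline(item.get("children", []), depth + 1)


# Hand-written sample shaped like structure.outline in the schema above.
sample_outline = [
    {"title": "Introduction", "level": 1, "page": 1, "section_id": "s1", "children": [
        {"title": "Scope", "level": 2, "page": 2, "section_id": "s1.1", "children": []},
    ]},
    {"title": "Results", "level": 1, "page": 5, "section_id": "s2", "children": []},
]

# Two spaces of indentation per nesting level.
toc_lines = [
    f"{'  ' * depth}{item['title']} (p. {item['page']})"
    for depth, item in walk_outline(sample_outline)
]
```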

docsray_xray

AI-powered comprehensive document analysis.

Parameters

Parameter	Type	Required	Default	Description
`document_url`	string	Yes	-	Path or URL to document
`analysis_type`	array	No	["entities", "key-points"]	Analysis types to perform
`custom_instructions`	string	No	null	Detailed analysis instructions
`provider`	string	No	"llama-parse"	Provider (LlamaParse recommended)

Analysis Types

"entities" - Extract people, organizations, dates, amounts, etc.
"relationships" - Map connections between entities
"key-points" - Identify main ideas and findings
"sentiment" - Analyze document tone
"structure" - Deep structural analysis

Response Schema

{
  "analysis": {
    "extracted_content": {
      "entities": [
        {
          "type": "string", // PERSON, ORGANIZATION, DATE, MONETARY, etc.
          "value": "string",
          "context": "string",
          "page": "integer",
          "confidence": "number (0-1)"
        }
      ],
      "key_points": [
        {
          "point": "string",
          "importance": "string", // high, medium, low
          "page": "integer", 
          "supporting_evidence": ["string"]
        }
      ],
      "relationships": [  // Only if "relationships" in analysis_type
        {
          "entity1": "string",
          "entity2": "string", 
          "relationship": "string",
          "confidence": "number (0-1)"
        }
      ],
      "sentiment": {  // Only if "sentiment" in analysis_type
        "overall_tone": "string", // positive, negative, neutral
        "confidence_level": "string", // high, medium, low
        "key_indicators": ["string"]
      }
    },
    "full_extraction": {
      "documents": ["object"], // Complete document data
      "pages": ["object"],     // Page-by-page content
      "images": ["object"],    // Image extractions with descriptions
      "tables": ["object"]     // Structured table data
    },
    "summary": {
      "total_entities": "integer",
      "entity_types": ["string"],
      "key_points_count": "integer", 
      "confidence_score": "number (0-1)",
      "processing_time": "number"
    }
  },
  "provider": "string"
}

Examples

# Basic entity extraction
result = docsray.xray("document.pdf", analysis_type=["entities"])

# Comprehensive analysis with custom instructions
result = docsray.xray("contract.pdf",
                     analysis_type=["entities", "relationships", "key-points"],
                     custom_instructions="Extract all parties, dates, and obligations")

# Full analysis
result = docsray.xray("document.pdf",
                     analysis_type=["entities", "relationships", "key-points", "sentiment"])
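
Each entity in the response carries a `confidence` between 0 and 1, so a common post-processing step is to drop low-confidence hits and group the rest by `type`. The following is a sketch over a response dictionary shaped like the schema above; `entities_by_type`, the 0.8 threshold, and the sample values are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, List


def entities_by_type(response: dict, min_confidence: float = 0.8) -> Dict[str, List[str]]:
    """Group entity values by type, keeping only those at or above min_confidence."""
    entities = response["analysis"]["extracted_content"].get("entities", [])
    grouped = defaultdict(list)
    for entity in entities:
        if entity["confidence"] >= min_confidence:
            grouped[entity["type"]].append(entity["value"])
    return dict(grouped)


# Hand-written sample shaped like the xray response; not real output.
sample_response = {
    "analysis": {"extracted_content": {"entities": [
        {"type": "PERSON", "value": "Jane Doe", "context": "", "page": 1, "confidence": 0.95},
        {"type": "DATE", "value": "2024-01-15", "context": "", "page": 1, "confidence": 0.55},
        {"type": "PERSON", "value": "John Roe", "context": "", "page": 2, "confidence": 0.90},
    ]}},
    "provider": "llama-parse",
}

confident = entities_by_type(sample_response)
```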

docsray_extract

Extract document content in multiple formats.

Parameters

Parameter	Type	Required	Default	Description
`document_url`	string	Yes	-	Path or URL to document
`extraction_targets`	array	No	["text"]	Content types to extract
`output_format`	string	No	"markdown"	Output format
`pages`	array	No	null	Page numbers to extract, e.g. [1, 2, 3]
`provider`	string	No	"auto"	Provider: "auto", "pymupdf4llm", "llama-parse"

Extraction Targets

"text" - Plain text content
"tables" - Structured table data
"images" - Image extraction and descriptions
"metadata" - Document properties
"structure" - Document hierarchy

Output Formats

"markdown" - Formatted markdown with structure
"text" - Plain text without formatting
"json" - Structured JSON output
"html" - HTML with preserved formatting

Response Schema

{
  "extraction": {
    "text": "string",           // Plain text content
    "markdown": "string",       // Formatted markdown
    "html": "string",          // HTML format (if requested)
    "word_count": "integer",
    "character_count": "integer",
    "page_count": "integer",
    "tables": [  // Only if "tables" in extraction_targets
      {
        "page": "integer",
        "html": "string",
        "data": {
          "headers": ["string"],
          "rows": [["string"]]
        }
      }
    ],
    "images": [  // Only if "images" in extraction_targets
      {
        "page": "integer", 
        "description": "string",
        "metadata": {
          "width": "integer",
          "height": "integer",
          "format": "string"
        }
      }
    ],
    "structure": {  // Only if "structure" in extraction_targets
      "sections": [
        {
          "title": "string",
          "page": "integer",
          "level": "integer"
        }
      ]
    }
  },
  "metadata": {  // Only if "metadata" in extraction_targets
    "title": "string",
    "author": "string",
    "creation_date": "string",
    "file_size": "integer"
  },
  "provider": "string",
  "processing_time": "number"
}

Examples

# Simple text extraction
result = docsray.extract("document.pdf")

# Extract specific content types
result = docsray.extract("document.pdf", 
                        extraction_targets=["text", "tables", "images"])

# Extract specific pages as JSON
result = docsray.extract("document.pdf",
                        pages=[1, 2, 3],
                        output_format="json")
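
Tables come back as `headers` plus `rows`, which is compact but awkward to query. As a sketch, the helper below turns one entry of `extraction.tables` into a list of row dictionaries keyed by header. `table_to_records` and the sample table are hypothetical, and the code assumes every row has as many cells as there are headers.

```python
from typing import Dict, List


def table_to_records(table: dict) -> List[Dict[str, str]]:
    """Convert one extracted table into a list of {header: cell} dictionaries."""
    headers = table["data"]["headers"]
    return [dict(zip(headers, row)) for row in table["data"]["rows"]]


# Hand-written sample shaped like one entry of extraction.tables.
sample_table = {
    "page": 3,
    "html": "<table>...</table>",
    "data": {
        "headers": ["Quarter", "Revenue"],
        "rows": [["Q1", "100"], ["Q2", "120"]],
    },
}

records = table_to_records(sample_table)
```

Cell values stay strings, matching the schema; numeric conversion is left to the caller.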

docsray_seek

Navigate to specific document locations.

Parameters

Parameter	Type	Required	Default	Description
`document_url`	string	Yes	-	Path or URL to document
`target`	object	Yes	-	Navigation target specification
`extract_content`	boolean	No	true	Extract content at target location
`context_size`	string	No	"medium"	Context around target: "small", "medium", "large", "page"
`provider`	string	No	"auto"	Provider: "auto", "pymupdf4llm", "llama-parse"

Target Object

One of these target types:

// Page navigation
{"page": "integer"}

// Section navigation  
{"section": "string"}

// Search navigation
{"query": "string"}

// Exact position
{"position": {"page": "integer", "offset": "integer"}}
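
Since a target must be exactly one of the four shapes above, it can help to check it on the client before calling the tool. This is a minimal sketch of such a check; `target_type` is a hypothetical helper, and the server may validate differently or accept shapes not listed here.

```python
TARGET_KEYS = {"page", "section", "query", "position"}


def target_type(target: dict) -> str:
    """Return the kind of a seek target, or raise ValueError if it is malformed."""
    if len(target) != 1 or not set(target) <= TARGET_KEYS:
        raise ValueError(f"target must have exactly one of {sorted(TARGET_KEYS)}")
    kind, value = next(iter(target.items()))
    if kind == "page":
        ok = isinstance(value, int)
    elif kind in ("section", "query"):
        ok = isinstance(value, str)
    else:  # "position" needs both an integer page and an integer offset
        ok = (
            isinstance(value, dict)
            and isinstance(value.get("page"), int)
            and isinstance(value.get("offset"), int)
        )
    if not ok:
        raise ValueError(f"invalid value for {kind!r} target: {value!r}")
    return kind
```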

Context Sizes

"small" - 200 characters around target
"medium" - 500 characters around target
"large" - 1000 characters around target
"page" - Entire page content

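To make the context sizes concrete, here is a sketch of how a window of that many characters could be cut from a page. It assumes the stated count is the total window length, centered on the target offset. The reference does not say whether the count is total or per side, so treat `context_window` as an illustration rather than Docsray's actual behavior.

```python
# Character budgets from the list above; "page" is handled separately.
CONTEXT_CHARS = {"small": 200, "medium": 500, "large": 1000}


def context_window(page_text: str, offset: int, context_size: str = "medium") -> str:
    """Return the text around offset, sized according to context_size."""
    if context_size == "page":
        return page_text
    half = CONTEXT_CHARS[context_size] // 2
    start = max(0, offset - half)  # clamp at the start of the page
    return page_text[start:offset + half]
```

Near the start or end of a page the window is simply shorter, because the slice is clamped to the page text.
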
Response Schema

{
  "location": {
    "type": "string", // "page", "section", "query", "position"
    // For page navigation:
    "page": "integer",
    "total_pages": "integer",
    // For section navigation:
    "section_title": "string",
    "section_level": "integer", 
    "page_start": "integer",
    "page_end": "integer",
    // For query navigation:
    "query": "string",
    "total_matches": "integer",
    // Common:
    "position": {
      "page": "integer",
      "offset": "integer"
    }
  },
  "content": {  // Only if extract_content = true
    "text": "string",
    "markdown": "string",
    "word_count": "integer",
    "character_count": "integer"
  },
  "matches": [  // Only for query navigation
    {
      "page": "integer",
      "position": {"page": "integer", "offset": "integer"},
      "relevance_score": "number (0-1)",
      "context": "string",
      "section": "string"
    }
  ],
  "navigation": {
    "previous_page": "integer",
    "next_page": "integer",
    "section_title": "string",
    "parent_section": "string",
    "next_section": "string",
    "previous_section": "string"
  },
  "provider": "string"
}

Examples

# Navigate to specific page
result = docsray.seek("document.pdf", target={"page": 5})

# Find section
result = docsray.seek("document.pdf", target={"section": "Introduction"})

# Search for content
result = docsray.seek("document.pdf", 
                     target={"query": "financial performance"},
                     context_size="large")
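
For query navigation, each entry in `matches` has a `relevance_score`, so picking the strongest hit is a one-liner. The sketch below works on a response dictionary shaped like the schema above; `best_match` and the sample data are illustrative only.

```python
from typing import Optional


def best_match(response: dict) -> Optional[dict]:
    """Return the match with the highest relevance_score, or None if there are none."""
    matches = response.get("matches", [])
    return max(matches, key=lambda m: m["relevance_score"], default=None)


# Hand-written sample shaped like a query-navigation response; not real output.
sample_response = {
    "location": {"type": "query", "query": "financial performance", "total_matches": 2},
    "matches": [
        {"page": 4, "position": {"page": 4, "offset": 120}, "relevance_score": 0.62,
         "context": "...", "section": "Overview"},
        {"page": 9, "position": {"page": 9, "offset": 15}, "relevance_score": 0.91,
         "context": "...", "section": "Financial Results"},
    ],
    "provider": "auto",
}

top = best_match(sample_response)
```

The `position` of the chosen match can be passed back as a `{"position": ...}` target to read more context at that spot.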

Error Responses

All tools return error responses in this format when operations fail:

{
  "error": "string",      // Human-readable error message
  "type": "string",       // Error type (FileNotFoundError, etc.)
  "details": "object",    // Additional error details (optional)
  "suggestion": "string"  // Suggested solution (optional)
}
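
One way to consume this format is to turn an error response into an exception as soon as it comes back. The sketch below assumes a failed call is recognizable by a top-level `error` key; `DocsrayToolError` and `raise_for_error` are hypothetical names, not part of Docsray.

```python
class DocsrayToolError(Exception):
    """Raised when a tool response carries the error format described above."""

    def __init__(self, response: dict):
        self.error_type = response.get("type", "UnknownError")
        self.details = response.get("details")        # optional field
        self.suggestion = response.get("suggestion")  # optional field
        message = f"{self.error_type}: {response['error']}"
        if self.suggestion:
            message += f" (suggestion: {self.suggestion})"
        super().__init__(message)


def raise_for_error(response: dict) -> dict:
    """Return the response unchanged, or raise if it is an error response."""
    if "error" in response:
        raise DocsrayToolError(response)
    return response
```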

Common Parameters

Provider Options

"auto" - Automatic provider selection (default for every tool except docsray_xray, which defaults to "llama-parse")
"pymupdf4llm" - Fast PyMuPDF4LLM provider
"llama-parse" - AI-powered LlamaParse provider

Document URL Formats

Local files: "./document.pdf", "/absolute/path/document.pdf"
URLs: "https://example.com/document.pdf"
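
A client that needs to tell the two formats apart, for example to check that a local file exists before calling a tool, can inspect the URL scheme. This is a small sketch using the standard library; `is_remote` is a hypothetical helper, and it treats only `http` and `https` as remote.

```python
from urllib.parse import urlparse


def is_remote(document_url: str) -> bool:
    """Return True for http(s) URLs and False for local file paths."""
    return urlparse(document_url).scheme in ("http", "https")
```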

Rate Limits

PyMuPDF4LLM: No limits (local processing)
LlamaParse: Based on your API plan
Auto provider: Inherits limits from selected provider

Response Caching

All tool responses are automatically cached based on:

Document content hash (not filename)
Operation parameters (provider, depth, instructions, etc.)
Operation type (peek, map, xray, extract, seek)

Cache keys are generated as: {operation}_{doc_hash}_{params_hash}
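
The key format can be sketched as follows. The reference does not name the hash function or how parameters are serialized, so SHA-256, the 16-character truncation, and the sorted-JSON encoding here are all assumptions, and `cache_key` is a hypothetical helper rather than Docsray's implementation.

```python
import hashlib
import json


def cache_key(operation: str, document_bytes: bytes, params: dict) -> str:
    """Build a key of the form {operation}_{doc_hash}_{params_hash}."""
    # Hash the content, not the filename, so renamed copies share a key.
    doc_hash = hashlib.sha256(document_bytes).hexdigest()[:16]
    # Sort keys so parameter order does not change the hash.
    params_json = json.dumps(params, sort_keys=True, separators=(",", ":"))
    params_hash = hashlib.sha256(params_json.encode("utf-8")).hexdigest()[:16]
    return f"{operation}_{doc_hash}_{params_hash}"
```

With this scheme, changing any parameter (such as `depth` or `provider`) or the operation type produces a different key, which matches the three inputs listed above.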

Next Steps

See Providers Overview for provider-specific details
Check Configuration API for environment settings
Review Tools Documentation for usage examples
