Map Tool

Generate comprehensive document structure maps with detailed navigation information.

Overview

The docsray_map tool creates detailed structural maps of documents:

Complete document outline with hierarchical navigation
Page-by-page breakdown of content types
Section relationships and cross-references
Content distribution analysis across pages
Navigation metadata for efficient document traversal

Basic Usage

Generate Document Map

# Basic document structure map
result = docsray.map("document.pdf")

# Access the structural information
outline = result['structure']['outline']
page_map = result['structure']['page_map']

# 'sections' appears only in the comprehensive response example,
# so read it defensively
sections = result['structure'].get('sections', [])

Control Analysis Depth

# Basic structure only
result = docsray.map("document.pdf", analysis_depth="basic")

# Deep structural analysis
result = docsray.map("document.pdf", analysis_depth="deep")

# Comprehensive analysis with content
result = docsray.map("document.pdf", analysis_depth="comprehensive")

Parameters

document_url (required)

Path or URL to the document to map.

docsray.map("./reports/annual-report.pdf")
docsray.map("https://example.com/whitepaper.pdf")

include_content (optional)

Whether to include content snippets in the map. Default: false

# Structure only (faster)
docsray.map("doc.pdf", include_content=False)

# Structure with content samples
docsray.map("doc.pdf", include_content=True)

analysis_depth (optional)

Level of structural analysis. Default: "deep"

"basic" - Page structure and major headings
"deep" - Detailed hierarchy and section analysis
"comprehensive" - Full structure with relationships

provider (optional)

Provider for document processing. Default: "auto"
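The analysis_depth values listed above lend themselves to validation before a request is sent. The following is a minimal sketch, not part of docsray itself: normalize_depth, VALID_DEPTHS, and DEFAULT_DEPTH are hypothetical names, and the allowed values and default are taken from this section.

```python
# Allowed values and default as documented in this section.
# The helper and constant names are hypothetical, not part of docsray.
VALID_DEPTHS = ("basic", "deep", "comprehensive")
DEFAULT_DEPTH = "deep"


def normalize_depth(value=None):
    """Return a valid analysis_depth string, or raise ValueError."""
    if value is None:
        return DEFAULT_DEPTH
    depth = str(value).strip().lower()
    if depth not in VALID_DEPTHS:
        raise ValueError(
            f"analysis_depth must be one of {VALID_DEPTHS}, got {value!r}"
        )
    return depth
```

Normalizing early keeps a typo such as "comprehesive" from reaching the tool, where the resulting behavior is not described on this page.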

Response Structure

Basic Map Response

{
  "structure": {
    "outline": [
      {
        "title": "Executive Summary",
        "level": 1,
        "page": 1,
        "section_id": "exec_summary",
        "children": [
          {
            "title": "Key Highlights",
            "level": 2,
            "page": 1,
            "section_id": "highlights"
          }
        ]
      }
    ],
    "page_map": [
      {
        "page": 1,
        "sections": ["exec_summary", "highlights"],
        "content_types": ["text", "list"],
        "element_count": 12
      }
    ],
    "navigation": {
      "total_sections": 8,
      "max_depth": 3,
      "cross_references": [
        {"from": "page_1", "to": "page_5", "type": "see_also"}
      ]
    }
  },
  "metadata": {
    "total_pages": 25,
    "processing_time": 2.3,
    "analysis_depth": "deep"
  },
  "provider": "llama-parse"
}
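The page_map array in the response above can be inverted to answer "which pages does this section appear on?". This sketch assumes only the field names shown in the example response; index_pages_by_section is a hypothetical helper, and the second page in the sample data is made up for illustration.

```python
from collections import defaultdict


def index_pages_by_section(result):
    """Map each section_id to the sorted list of pages it appears on."""
    pages = defaultdict(set)
    for entry in result["structure"].get("page_map", []):
        for section_id in entry.get("sections", []):
            pages[section_id].add(entry["page"])
    return {section_id: sorted(nums) for section_id, nums in pages.items()}


# Sample data shaped like the documented response (page 2 is illustrative).
sample = {
    "structure": {
        "page_map": [
            {"page": 1, "sections": ["exec_summary", "highlights"]},
            {"page": 2, "sections": ["exec_summary"]},
        ]
    }
}
section_pages = index_pages_by_section(sample)
print(section_pages)
```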

Comprehensive Map with Content

{
  "structure": {
    "outline": [ /* ... outline structure ... */ ],
    "page_map": [ /* ... page mapping ... */ ],
    "sections": [
      {
        "id": "exec_summary",
        "title": "Executive Summary",
        "page_start": 1,
        "page_end": 2,
        "content_preview": "This quarter showed strong performance...",
        "content_length": 1250,
        "subsections": ["highlights", "challenges"]
      }
    ],
    "content_distribution": {
      "text_pages": 20,
      "table_pages": 8,
      "image_pages": 5,
      "mixed_pages": 12
    }
  }
}
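The sections array above carries enough information to estimate how dense each section is. The sketch below assumes the field names from the example response. The unit of content_length is not stated on this page, so the per-page figure is only a relative measure, and section_spans is a hypothetical helper.

```python
def section_spans(result):
    """Return page count and content length per page for each section."""
    rows = []
    for section in result["structure"].get("sections", []):
        # page_start and page_end are treated as inclusive.
        page_count = section["page_end"] - section["page_start"] + 1
        rows.append({
            "id": section["id"],
            "pages": page_count,
            "length_per_page": section["content_length"] / page_count,
        })
    return rows


# Sample data copied from the example response above.
sample = {
    "structure": {
        "sections": [
            {
                "id": "exec_summary",
                "page_start": 1,
                "page_end": 2,
                "content_length": 1250,
            }
        ]
    }
}
spans = section_spans(sample)
```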

Use Cases

Document Navigation

Create interactive navigation for large documents:

def create_navigation(document_path):
    result = docsray.map(document_path, analysis_depth="comprehensive")
    
    navigation = []
    for item in result['structure']['outline']:
        navigation.append({
            "title": item['title'],
            "page": item['page'], 
            "level": item['level'],
            "children": item.get('children', [])
        })
    
    return navigation

# Generate table of contents
toc = create_navigation("manual.pdf")
for item in toc:
    indent = "  " * (item['level'] - 1)
    print(f"{indent}- {item['title']} (p. {item['page']})")
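The loop above prints only top-level entries; nested children stay inside each item. One way to include them is a depth-first walk over the outline, sketched here against the outline shape from the example response. flatten_outline is a hypothetical helper name.

```python
def flatten_outline(items):
    """Yield (level, title, page) for every outline entry, depth-first."""
    for item in items:
        yield item["level"], item["title"], item["page"]
        yield from flatten_outline(item.get("children", []))


# Outline copied from the example response.
outline = [
    {
        "title": "Executive Summary",
        "level": 1,
        "page": 1,
        "children": [
            {"title": "Key Highlights", "level": 2, "page": 1},
        ],
    }
]

toc_lines = [
    f"{'  ' * (level - 1)}- {title} (p. {page})"
    for level, title, page in flatten_outline(outline)
]
print("\n".join(toc_lines))
```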

Content Analysis

Analyze document structure and content distribution:

def analyze_document_structure(document_path):
    result = docsray.map(document_path, include_content=True)
    structure = result['structure']

    # Analyze content distribution
    distribution = structure['content_distribution']
    total_pages = result['metadata']['total_pages']
    section_count = len(structure['sections'])

    # Guard against division by zero on empty or unstructured documents
    if total_pages == 0 or section_count == 0:
        raise ValueError("Document has no pages or no sections to analyze")

    analysis = {
        "text_ratio": distribution['text_pages'] / total_pages,
        "table_ratio": distribution['table_pages'] / total_pages,
        "image_ratio": distribution['image_pages'] / total_pages,
        "section_count": section_count,
        # Average number of pages per section
        "avg_section_length": total_pages / section_count
    }

    return analysis

Quality Assessment

Assess document organization and structure quality:

def assess_document_quality(document_path):
    result = docsray.map(document_path, analysis_depth="comprehensive")
    structure = result['structure']
    navigation = structure['navigation']

    quality_score = 0
    feedback = []

    # Check hierarchical structure
    if navigation['max_depth'] >= 2:
        quality_score += 20
        feedback.append("Good hierarchical structure")
    else:
        feedback.append("Consider adding subsections")

    # Check section balance; skip empty sections to avoid dividing by zero
    sections = structure['sections']
    lengths = [s['content_length'] for s in sections if s['content_length'] > 0]
    if lengths and max(lengths) / min(lengths) < 3:  # Not too imbalanced
        quality_score += 20
        feedback.append("Well-balanced section lengths")

    return {
        "quality_score": quality_score,
        "feedback": feedback,
        "structure_depth": navigation['max_depth'],
        "section_count": len(sections)
    }

Performance Characteristics

Analysis Depth Performance

Depth	Typical Time	Memory Usage	Use Case
Basic	1-3s	Low	Quick structure overview
Deep	3-10s	Medium	Detailed navigation
Comprehensive	10-30s	High	Full analysis

Document Size Impact

Document Size	Basic	Deep	Comprehensive
Small (1-10 pages)	1-2s	2-4s	5-10s
Medium (10-50 pages)	2-5s	5-15s	15-45s
Large (50+ pages)	5-15s	15-60s	60-180s
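The timings above can drive a simple depth choice when a caller has a time budget. This is a rough sketch only: it uses the upper bounds from the Document Size Impact table as worst cases, assigns the overlapping boundaries (10 and 50 pages) to the smaller bucket, and pick_depth and size_bucket are hypothetical helpers rather than docsray features.

```python
# Upper bounds in seconds, copied from the Document Size Impact table.
WORST_CASE_SECONDS = {
    "small": {"basic": 2, "deep": 4, "comprehensive": 10},
    "medium": {"basic": 5, "deep": 15, "comprehensive": 45},
    "large": {"basic": 15, "deep": 60, "comprehensive": 180},
}


def size_bucket(page_count):
    """Classify a document by page count (boundary pages go to the smaller bucket)."""
    if page_count <= 10:
        return "small"
    if page_count <= 50:
        return "medium"
    return "large"


def pick_depth(page_count, budget_seconds):
    """Return the richest depth whose worst-case time fits the budget."""
    limits = WORST_CASE_SECONDS[size_bucket(page_count)]
    for depth in ("comprehensive", "deep", "basic"):
        if limits[depth] <= budget_seconds:
            return depth
    # Nothing fits the budget; fall back to the cheapest option.
    return "basic"
```

Actual timings depend on the provider and the document, so the table values are best treated as rough guidance rather than guarantees.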

Advanced Usage

Selective Section Mapping

# Map specific sections only
def map_sections(document_path, target_sections):
    full_map = docsray.map(document_path)
    outline = full_map['structure']['outline']
    
    filtered_sections = []
    for section in outline:
        if any(target in section['title'].lower() for target in target_sections):
            filtered_sections.append(section)
    
    return filtered_sections

# Find financial sections
financial_sections = map_sections("annual-report.pdf", 
                                 ["financial", "revenue", "earnings"])
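The map_sections function above checks only top-level outline entries. To match subsections as well, the search can recurse into children. The sketch below assumes the outline shape from the example response; find_sections is a hypothetical helper and the sample titles are illustrative.

```python
def find_sections(outline, keywords):
    """Return every outline entry, at any depth, whose title contains a keyword."""
    wanted = [keyword.lower() for keyword in keywords]
    matches = []
    for item in outline:
        title = item["title"].lower()
        if any(keyword in title for keyword in wanted):
            matches.append(item)
        # Descend into subsections regardless of whether the parent matched.
        matches.extend(find_sections(item.get("children", []), keywords))
    return matches


# Illustrative outline; titles are made up for this example.
outline = [
    {
        "title": "Executive Summary",
        "level": 1,
        "page": 1,
        "children": [
            {"title": "Revenue Highlights", "level": 2, "page": 1},
        ],
    },
    {"title": "Financial Statements", "level": 1, "page": 5},
]
matched_titles = [m["title"] for m in find_sections(outline, ["financial", "revenue"])]
```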

Cross-Document Mapping

# Compare structure across multiple documents
def compare_document_structures(doc_paths):
    structures = {}
    
    for doc_path in doc_paths:
        result = docsray.map(doc_path, analysis_depth="basic")
        structures[doc_path] = {
            "sections": len(result['structure']['outline']),
            "depth": result['structure']['navigation']['max_depth'],
            "pages": result['metadata']['total_pages']
        }
    
    return structures

# Compare multiple reports
reports = ["q1-2023.pdf", "q2-2023.pdf", "q3-2023.pdf"]
comparison = compare_document_structures(reports)
for doc, stats in comparison.items():
    print(f"{doc}: {stats['sections']} sections, {stats['pages']} pages")
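Beyond counts, it can be useful to see which top-level sections were added or dropped between two versions of a report. This sketch compares outline titles from two map results and assumes only the documented outline shape; diff_sections and top_level_titles are hypothetical helpers and the sample titles are illustrative.

```python
def top_level_titles(result):
    """Return the titles of the top-level outline entries, in document order."""
    return [item["title"] for item in result["structure"]["outline"]]


def diff_sections(old_result, new_result):
    """Compare top-level section titles between two map results."""
    old = top_level_titles(old_result)
    new = top_level_titles(new_result)
    return {
        "added": [title for title in new if title not in old],
        "removed": [title for title in old if title not in new],
        "shared": [title for title in old if title in new],
    }


# Illustrative results for two quarterly reports.
q1 = {"structure": {"outline": [{"title": "Executive Summary"}, {"title": "Outlook"}]}}
q2 = {"structure": {"outline": [{"title": "Executive Summary"}, {"title": "Risks"}]}}
changes = diff_sections(q1, q2)
```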

Error Handling

def safe_map_document(document_path):
    try:
        result = docsray.map(document_path)
        
        if "error" in result:
            return None, result["error"]
        
        return result, None
        
    except Exception as e:
        return None, f"Mapping failed: {str(e)}"

# Usage with error handling
map_result, error = safe_map_document("document.pdf")
if error:
    print(f"Cannot map document: {error}")
else:
    outline = map_result['structure']['outline']
    print(f"Document has {len(outline)} main sections")
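When a richer analysis fails, retrying at a cheaper depth may still produce a usable map. The sketch below takes the mapping function as a parameter so it runs without docsray installed. map_with_fallback and fake_map are hypothetical, and the "error" key check mirrors the convention used in safe_map_document above.

```python
def map_with_fallback(map_fn, document_path,
                      depths=("comprehensive", "deep", "basic")):
    """Try each depth in order; return (result, depth_used, errors)."""
    errors = []
    for depth in depths:
        try:
            result = map_fn(document_path, analysis_depth=depth)
        except Exception as exc:
            errors.append((depth, str(exc)))
            continue
        if "error" in result:
            errors.append((depth, result["error"]))
            continue
        return result, depth, errors
    return None, None, errors


# Stand-in for the real mapping call: fails at the richest depth only.
def fake_map(path, analysis_depth="deep"):
    if analysis_depth == "comprehensive":
        raise TimeoutError("simulated timeout")
    return {
        "structure": {"outline": []},
        "metadata": {"analysis_depth": analysis_depth},
    }


result, depth_used, errors = map_with_fallback(fake_map, "document.pdf")
```

In real use, map_fn would be whatever callable wraps the map tool in your environment.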

Integration Patterns

Document Management System

from datetime import datetime

class DocumentManager:
    def __init__(self):
        self.documents = {}
    
    def add_document(self, path):
        # Map document structure
        result = docsray.map(path, include_content=False)
        
        self.documents[path] = {
            "structure": result['structure'],
            "metadata": result['metadata'],
            "indexed_at": datetime.now()
        }
    
    def find_section(self, query):
        matches = []
        for doc_path, doc_data in self.documents.items():
            for section in doc_data['structure']['sections']:
                if query.lower() in section['title'].lower():
                    matches.append({
                        "document": doc_path,
                        "section": section['title'],
                        "page": section['page_start']
                    })
        return matches

Web API

from flask import Flask, jsonify, request

app = Flask(__name__)

VALID_DEPTHS = {"basic", "deep", "comprehensive"}

@app.route('/api/document/<path:document_path>/map')
def get_document_map(document_path):
    # Read and validate query parameters
    depth = request.args.get('depth', 'deep')
    if depth not in VALID_DEPTHS:
        return jsonify({"error": f"Invalid depth: {depth}"}), 400
    include_content = request.args.get('content', 'false').lower() == 'true'

    result = docsray.map(document_path,
                         analysis_depth=depth,
                         include_content=include_content)

    if "error" in result:
        return jsonify({"error": result["error"]}), 400

    return jsonify(result)

Best Practices

Choose Appropriate Depth - Use basic for a quick overview, deep for detailed navigation, and comprehensive for full analysis
Cache Results - Map results are automatically cached for repeated access
Consider Document Size - Documents over 50 pages can take 60-180s at comprehensive depth, so budget time accordingly
Use Include Content Sparingly - Set include_content=True only when the analysis needs content snippets; structure-only maps are faster
Provider Selection - LlamaParse provides richer structure analysis than PyMuPDF4LLM

Next Steps

Learn about Xray Tool for AI-powered content analysis
See Seek Tool for navigation to specific sections
Check Extract Tool for content extraction based on structure
Review API Reference for complete parameter details
