DocNav MCP Server

DocNav is a Model Context Protocol (MCP) server that lets LLM agents read, analyze, and manage long documents by navigating their structure much as a human reader would.

Features

Document Navigation: Navigate through document sections, headings, and content structure
Content Extraction: Extract and summarize specific document sections
Search & Query: Find specific content within documents using intelligent search
Multi-format Support: Supports Markdown (.md) files; the roadmap below marks PDF support (via PyMuPDF) as complete, and additional formats such as DOCX and TXT are planned
MCP Integration: Seamless integration with MCP-compatible LLMs and applications

Architecture

DocNav follows a modular, extensible architecture:

Core MCP Server: Main server implementation, built on the Model Context Protocol
Document Processors: Pluggable processors for different file types
Navigation Engine: Handles document structure analysis and navigation
Content Extractors: Extract and format content from documents
Search Engine: Provides search and query capabilities across documents
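As a rough illustration of how pluggable processors might sit behind a single entry point, the sketch below hands a file to the first registered processor that accepts it. The names `ProcessorRegistry`, `MarkdownStub`, and `PdfStub` are hypothetical and are not taken from the DocNav source; only the `can_process` method name appears in this README.

```python
from pathlib import Path
from typing import Protocol


class Processor(Protocol):
    """Minimal interface a pluggable processor is assumed to expose."""

    def can_process(self, path: Path) -> bool: ...


class MarkdownStub:
    """Hypothetical stand-in for a Markdown processor."""

    def can_process(self, path: Path) -> bool:
        return path.suffix.lower() in {".md", ".markdown"}


class PdfStub:
    """Hypothetical stand-in for a PDF processor."""

    def can_process(self, path: Path) -> bool:
        return path.suffix.lower() == ".pdf"


class ProcessorRegistry:
    """Keeps processors in registration order and picks the first match."""

    def __init__(self) -> None:
        self._processors: list[Processor] = []

    def register(self, processor: Processor) -> None:
        self._processors.append(processor)

    def find(self, path: Path) -> Processor:
        for processor in self._processors:
            if processor.can_process(path):
                return processor
        raise ValueError(f"No processor registered for {path.suffix!r} files")


registry = ProcessorRegistry()
registry.register(MarkdownStub())
registry.register(PdfStub())
```

With this shape, supporting a new file type only requires registering one more object; the code that looks up a processor does not change.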

Installation

Prerequisites

Python 3.10+
uv package manager

Setup

1. Clone the repository:

bash

git clone https://github.com/shenyimings/DocNav-MCP.git
cd DocNav-MCP

2. Install dependencies:

bash

uv sync

Usage

Starting the MCP Server

bash

uv run server.py

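Connect to the MCP server

Add the following entry to your MCP client configuration. Replace {{PATH_TO_UV}} with the output of `which uv` and {{PATH_TO_SRC}} with the path to the cloned DocNav-MCP directory.

json

{
  "mcpServers": {
    "docnav": {
      "command": "{{PATH_TO_UV}}",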
      "args": [
        "--directory",
        "{{PATH_TO_SRC}}",
        "run",
        "server.py"
      ]
    }
  }
}

Available Tools

load_document: Load a document for navigation and analysis
Args: file_path (path to document file)
Returns: Success message with auto-generated document ID

get_outline: Get document outline/table of contents
Args: doc_id (document identifier), max_depth (max heading depth, default 3)
Returns: Formatted document outline
Tip: Call this first after loading a document to understand its structure

read_section: Read content of a specific document section
Args: doc_id (document identifier), section_id (e.g., 'h1_0', 'h2_1')
Returns: Section content with subsections

search_document: Search for specific content within a document
Args: doc_id (document identifier), query (search term or phrase)
Returns: Formatted search results with context

navigate_section: Get navigation context for a section
Args: doc_id (document identifier), section_id (section to navigate to)
Returns: Navigation context with parent, siblings, children

list_documents: List all currently loaded documents
Returns: List of loaded documents with metadata

get_document_stats: Get statistics about a loaded document
Args: doc_id (document identifier)
Returns: Document statistics and structure info

remove_document: Remove a document from the navigator
Args: doc_id (document identifier)
Returns: Success or error message
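The section IDs above ('h1_0', 'h2_1') suggest a heading level followed by a running index. The sketch below shows one way such IDs and a depth-limited outline could be derived from Markdown headings. The numbering rule (a zero-based counter per heading level, in document order) is an assumption inferred from the two example IDs, and `build_outline` is a hypothetical helper, not DocNav's implementation.

```python
import re
from collections import Counter

# ATX heading: 1-6 '#' characters, whitespace, title, optional closing '#'s.
HEADING_RE = re.compile(r"^(#{1,6})\s+(.*?)\s*#*\s*$")


def build_outline(markdown: str, max_depth: int = 3) -> list[tuple[str, int, str]]:
    """Return (section_id, level, title) for each heading up to max_depth."""
    counts: Counter = Counter()
    outline = []
    in_fence = False
    for line in markdown.splitlines():
        # Skip fenced code blocks so '# comment' lines are not read as headings.
        if line.lstrip().startswith("```"):
            in_fence = not in_fence
            continue
        if in_fence:
            continue
        match = HEADING_RE.match(line)
        if not match:
            continue
        level = len(match.group(1))
        # Assumed ID scheme: h<level>_<zero-based index among headings of that level>.
        section_id = f"h{level}_{counts[level]}"
        counts[level] += 1
        if level <= max_depth:
            outline.append((section_id, level, match.group(2)))
    return outline
```

Deeper headings still advance their counters even when `max_depth` hides them, so an ID stays the same whichever depth the outline was requested at.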

Example Usage

python

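# `tools` stands for your MCP client's tool interface.

# Load a document; the success message includes the auto-generated document ID
result = await tools.load_document("path/to/document.md")

# Get the document outline, using the ID reported in `result` as doc_id
outline = await tools.get_outline(doc_id)

# Read a specific section by its ID (e.g., 'h1_0')
section = await tools.read_section(doc_id, section_id)

# Search within the document
results = await tools.search_document(doc_id, "search query")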
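For a version that runs end to end, the tools can be called from a standalone client. The sketch below assumes the official MCP Python SDK (the `mcp` package) is installed and that `uv` is on your PATH. The tool names and the `file_path` argument come from the tool list above, while `server_args` and `main` are hypothetical names for this example.

```python
import asyncio


def server_args(repo_dir: str) -> list[str]:
    """Arguments for launching server.py through uv, mirroring the client config."""
    return ["--directory", repo_dir, "run", "server.py"]


async def main(repo_dir: str, document: str) -> None:
    # Imported here so that server_args() stays usable without the SDK installed.
    from mcp import ClientSession, StdioServerParameters
    from mcp.client.stdio import stdio_client

    params = StdioServerParameters(command="uv", args=server_args(repo_dir))
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()

            loaded = await session.call_tool("load_document", {"file_path": document})
            # The success message carries the auto-generated document ID.
            for item in loaded.content:
                print(getattr(item, "text", item))

            listing = await session.call_tool("list_documents", {})
            for item in listing.content:
                print(getattr(item, "text", item))


# To run it:
# asyncio.run(main("/path/to/DocNav-MCP", "/path/to/document.md"))
```

Once you have read the document ID from the printed message, the remaining tools follow the same pattern, for example `session.call_tool("get_outline", {"doc_id": ...})`.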

Development

Project Structure

code

docnav-mcp/
├── server.py                 # Main MCP server
├── docnav/
│   ├── __init__.py           # Package initialization
│   ├── models.py             # Data models
│   ├── navigator.py          # Document navigation engine
│   └── processors/
│       ├── __init__.py       # Processor package
│       ├── base.py           # Base processor interface
│       └── markdown.py       # Markdown processor
└── tests/
    └── ...                   # Test files

Development Guidelines

See CLAUDE.md for detailed development guidelines including:

Code quality standards
Testing requirements
Package management with uv
Formatting and linting rules

Adding New Document Processors

1. Create a new processor class inheriting from BaseProcessor

2. Implement the required methods: can_process, process, extract_section, search

3. Register the processor in the DocumentNavigator

4. Add comprehensive tests
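The steps above can be sketched as follows. The `BaseProcessor` class here is a stand-in with assumed method signatures; the real interface lives in docnav/processors/base.py and may differ, so check it before copying anything. `PlainTextProcessor` and its `p_<n>` section IDs are hypothetical.

```python
from abc import ABC, abstractmethod
from pathlib import Path


class BaseProcessor(ABC):
    """Stand-in for the real base class; these signatures are assumptions."""

    @abstractmethod
    def can_process(self, file_path: Path) -> bool: ...

    @abstractmethod
    def process(self, content: str) -> dict[str, str]: ...

    @abstractmethod
    def extract_section(self, sections: dict[str, str], section_id: str) -> str: ...

    @abstractmethod
    def search(self, sections: dict[str, str], query: str) -> list[str]: ...


class PlainTextProcessor(BaseProcessor):
    """Hypothetical .txt processor: each blank-line-separated paragraph is a section."""

    def can_process(self, file_path: Path) -> bool:
        return file_path.suffix.lower() == ".txt"

    def process(self, content: str) -> dict[str, str]:
        # Split on blank lines and drop empty chunks left by runs of newlines.
        paragraphs = [p.strip() for p in content.split("\n\n") if p.strip()]
        return {f"p_{i}": text for i, text in enumerate(paragraphs)}

    def extract_section(self, sections: dict[str, str], section_id: str) -> str:
        if section_id not in sections:
            raise KeyError(f"Unknown section: {section_id}")
        return sections[section_id]

    def search(self, sections: dict[str, str], query: str) -> list[str]:
        # Case-insensitive substring match; returns the IDs of matching sections.
        needle = query.lower()
        return [sid for sid, text in sections.items() if needle in text.lower()]
```

How the processor is registered with the DocumentNavigator depends on that class's API, which this README does not show, so that step is left out of the sketch.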

Running Tests

bash

# Run all tests
uv run tests/run_tests.py

Code Quality

bash

# Format code
uv run --frozen ruff format .

# Check linting
uv run --frozen ruff check .

# Type checking
uv run --frozen pyright

Roadmap

[x] Complete Markdown processor implementation
[x] Add PDF document support (PyMuPDF)
[x] Improve test coverage and quality
[ ] Implement advanced search capabilities
[ ] Add document summarization features
[ ] Support for additional document formats (DOCX, TXT, etc.)
[ ] Performance optimizations for large documents
[ ] Caching mechanisms for frequently accessed documents
[ ] Add persistent storage for loaded documents

Contributing

1. Fork the repository

2. Create a feature branch

3. Follow the development guidelines in CLAUDE.md

4. Add tests for new functionality

5. Submit a pull request

License

This project is licensed under the Apache-2.0 License - see the LICENSE file for details.

Support

For issues and questions:

Open an issue on GitHub
Check the documentation in CLAUDE.md
Review existing issues and discussions
