
    MCP Documentation Server

    Bridge the AI Knowledge Gap. ✨ Features: Document management • Gemini integration • AI-powered semantic search • File uploads • Smart chunking • Multilingua...

    226 stars · TypeScript · Updated Oct 18, 2025
    Tags: documents · gemini · knowledge-base · mcp-server · model-context-protocol

    Table of Contents

    • Core capabilities
    • 🌐 Web UI
    • 🔍 Search & Intelligence
    • ⚡ Performance & Architecture
    • 📁 File Management
    • Quick Start
    • Configure an MCP client
    • Web UI
    • Basic workflow
    • MCP Tools
    • 📄 Document Management
    • 📁 File Processing
    • 🔍 Search
    • Configuration
    • Storage layout
    • Embedding Models
    • Architecture
    • Development
    • Contributing
    • License
    • Support
    • Star History


    MCP Documentation Server

    A TypeScript-based Model Context Protocol (MCP) server that provides local-first document management and semantic search. Documents are stored in an embedded Orama vector database with hybrid search (full-text + vector), intelligent chunking, and local AI embeddings — no external database or cloud service required.

    Core capabilities

    🌐 Web UI

    • Built-in Web Interface: A full-featured web dashboard starts automatically alongside the MCP server — no additional setup required
    • Complete Tool Coverage: Every MCP tool is accessible from the browser: add/view/delete documents, semantic search, AI search, file uploads, and context window exploration
    • Drag & Drop Uploads: Upload .txt, .md, and .pdf files directly from the browser
    • Configurable: Disable with START_WEB_UI=false or change the port with WEB_PORT

    🔍 Search & Intelligence

    • Hybrid Search: Combined full-text and vector similarity powered by Orama, for both single-document and cross-document queries
    • AI-Powered Search 🤖: Advanced document analysis with Google Gemini AI for contextual understanding and intelligent insights (optional, requires API key)
    • Context Window Retrieval: Fetch surrounding chunks to provide LLMs with richer context

    ⚡ Performance & Architecture

    • Orama Vector DB: Embedded search engine with zero native dependencies — replaces manual JSON storage and cosine similarity
    • LRU Embedding Cache: Avoids recomputing embeddings for repeated content and queries
    • Parent-Child Chunking: Documents are split into large context-preserving parent chunks and small precise child chunks for vector search — search results include both the matched fragment and the full surrounding context
    • Streaming File Reader: Handles large files without high memory usage
    • Automatic Migration: Legacy JSON documents are migrated to Orama on first startup — no manual intervention needed
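
    The LRU idea can be sketched in a few lines of TypeScript. This is an illustrative sketch of a generic LRU cache, not the server's actual implementation (the class name and eviction details here are assumptions):

    ```typescript
    // Minimal LRU cache sketch: a Map preserves insertion order, so the
    // first key is always the least recently used entry.
    class LruCache<K, V> {
      private map = new Map<K, V>();
      constructor(private capacity: number) {}

      get(key: K): V | undefined {
        const value = this.map.get(key);
        if (value !== undefined) {
          // Re-insert to mark the entry as most recently used.
          this.map.delete(key);
          this.map.set(key, value);
        }
        return value;
      }

      set(key: K, value: V): void {
        if (this.map.has(key)) this.map.delete(key);
        else if (this.map.size >= this.capacity) {
          // Evict the least recently used entry (first key in the Map).
          this.map.delete(this.map.keys().next().value as K);
        }
        this.map.set(key, value);
      }
    }
    ```

    Keying such a cache by a hash of the chunk or query text is what lets repeated content skip the (comparatively expensive) embedding model entirely.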

    📁 File Management

    • Upload processing: Drop .txt, .md, or .pdf files into the uploads folder and process them with a single tool call
    • Copy-based storage: Original files are backed up alongside the database
    • Local-only storage: All data resides in ~/.mcp-documentation-server/

    Quick Start

    Configure an MCP client

    Example configuration for an MCP client (e.g., Claude Desktop, VS Code):


    json
    {
      "mcpServers": {
        "documentation": {
          "command": "npx",
          "args": [
            "-y",
            "@andrea9293/mcp-documentation-server"
          ]
        }
      }
    }

    Advanced with env vars (all vars are optional)

    json
    {
      "mcpServers": {
        "documentation": {
          "command": "npx",
          "args": [
            "-y",
            "@andrea9293/mcp-documentation-server"
          ],
          "env": {
            "MCP_BASE_DIR": "/path/to/workspace",
            "GEMINI_API_KEY": "your-api-key-here",
            "MCP_EMBEDDING_MODEL": "Xenova/all-MiniLM-L6-v2",
            "START_WEB_UI": "true",
            "WEB_PORT": "3080",
          }
        }
      }
    }

    All environment variables are optional. Without GEMINI_API_KEY, only the local embedding-based search tools are available.

    Web UI

    The web interface starts automatically on port 3080 when the MCP server launches. Open your browser at:

    code
    http://localhost:3080

    From the web UI you can:

    • 📊 Dashboard — overview of all documents and stats
    • 📄 Documents — browse, view, and delete documents
    • ➕ Add Document — create documents with title, content, and metadata
    • 🔍 Search All — semantic search across all documents
    • 🎯 Search in Doc — search within a specific document
    • 🤖 AI Search — Gemini-powered analysis (if GEMINI_API_KEY is set)
    • 📁 Upload Files — drag & drop files and process them into the knowledge base
    • 🪟 Context Window — explore chunks around a specific index

    To run the web UI standalone (without the MCP server):

    bash
    npm run web          # Development (tsx)
    npm run web:build    # Production (compiled)

    Basic workflow

    1. Add documents using add_document or place .txt / .md / .pdf files in the uploads folder and call process_uploads.

    2. Search across everything with search_all_documents, or within a single document with search_documents.

    3. Use get_context_window to fetch neighboring chunks and give the LLM broader context.
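
    Under the hood, an MCP client invokes these tools with standard JSON-RPC tools/call requests. A sketch of the wire format (the argument name query is an assumption based on the tool descriptions, not the server's published schema):

    ```json
    {
      "jsonrpc": "2.0",
      "id": 1,
      "method": "tools/call",
      "params": {
        "name": "search_all_documents",
        "arguments": { "query": "how does chunking work?" }
      }
    }
    ```

    Clients such as Claude Desktop construct these calls automatically; you normally never write them by hand.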

    MCP Tools

    The server registers the following tools (all validated with Zod schemas):

    📄 Document Management

    Tool | Description
    add_document | Add a document (title, content, optional metadata)
    list_documents | List all documents with metadata and content preview
    get_document | Retrieve the full content of a document by ID
    delete_document | Remove a document, its chunks, database entries, and associated files

    📁 File Processing

    Tool | Description
    process_uploads | Process all files in the uploads folder (chunking + embeddings)
    get_uploads_path | Returns the absolute path to the uploads folder
    list_uploads_files | Lists files in the uploads folder with size and format info
    get_ui_url | Returns the Web UI URL (e.g. http://localhost:3080); useful to open the dashboard or to locate the uploads folder from the browser

    🔍 Search

    Tool | Description
    search_documents | Semantic vector search within a specific document
    search_all_documents | Hybrid (full-text + vector) cross-document search
    get_context_window | Returns a window of chunks around a given chunk index
    search_documents_with_ai | 🤖 AI-powered search using Gemini (requires GEMINI_API_KEY)
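
    The context-window idea behind get_context_window is simple and can be sketched in TypeScript (illustrative only; the real tool's parameters and return shape may differ):

    ```typescript
    // Return the chunks surrounding a matched chunk, clamped to the
    // document bounds, so an LLM sees the fragment plus its neighbors.
    function contextWindow<T>(chunks: T[], index: number, radius = 2): T[] {
      const start = Math.max(0, index - radius);
      const end = Math.min(chunks.length, index + radius + 1);
      return chunks.slice(start, end);
    }
    ```

    Clamping at both ends matters: a match in the first or last chunk simply yields a smaller window instead of an out-of-bounds error.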

    Configuration

    Configure via environment variables or a .env file in the project root:

    Variable | Default | Description
    MCP_BASE_DIR | ~/.mcp-documentation-server | Base directory for data storage
    MCP_EMBEDDING_MODEL | Xenova/all-MiniLM-L6-v2 | Embedding model name
    GEMINI_API_KEY | (none) | Google Gemini API key (enables search_documents_with_ai)
    MCP_CACHE_ENABLED | true | Enable/disable LRU embedding cache
    START_WEB_UI | true | Set to false to disable the built-in web interface
    WEB_PORT | 3080 | Port for the web UI
    MCP_STREAMING_ENABLED | true | Enable streaming reads for large files
    MCP_STREAM_CHUNK_SIZE | 65536 | Streaming buffer size in bytes (64 KB)
    MCP_STREAM_FILE_SIZE_LIMIT | 10485760 | Threshold to switch to streaming (10 MB)
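
    For example, a .env that moves storage, enables AI search, and switches to the multilingual model (all values below are placeholders):

    ```bash
    # .env — every variable is optional; omit any line to keep the default
    MCP_BASE_DIR=/path/to/workspace
    GEMINI_API_KEY=your-api-key-here
    MCP_EMBEDDING_MODEL=Xenova/paraphrase-multilingual-mpnet-base-v2
    START_WEB_UI=true
    WEB_PORT=3080
    ```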

    Storage layout

    code
    ~/.mcp-documentation-server/     # Or custom path via MCP_BASE_DIR
    ├── data/
    │   ├── orama-chunks.msp         # Orama vector DB (child chunks + embeddings)
    │   ├── orama-docs.msp           # Orama document DB (full content + metadata)
    │   ├── orama-parents.msp        # Orama parent chunks DB (context sections)
    │   ├── migration-complete.flag   # Written after legacy JSON migration
    │   └── *.md                     # Markdown copies of documents
    └── uploads/                     # Drop .txt, .md, .pdf files here

    Embedding Models

    Set via MCP_EMBEDDING_MODEL:

    Model | Dimensions | Notes
    Xenova/all-MiniLM-L6-v2 | 384 | Default: fast, good quality
    Xenova/paraphrase-multilingual-mpnet-base-v2 | 768 | Recommended: best quality, multilingual

    Models are downloaded on first use (~80–420 MB). The vector dimension is determined automatically from the provider.

    ⚠️ Important: Changing the embedding model requires re-adding all documents — embeddings from different models are incompatible. The Orama database is recreated automatically when the dimension changes.

    Architecture

    code
    Server (FastMCP, stdio)
      ├─ Web UI (Express, port 3080)
      │    └─ REST API → DocumentManager
      └─ MCP Tools
           └─ DocumentManager
                ├─ OramaStore          — Orama vector DB (chunks DB + docs DB + parents DB), persistence, migration
                ├─ IntelligentChunker  — Parent-child chunking (code, markdown, text, PDF)
                ├─ EmbeddingProvider   — Local embeddings via @xenova/transformers
                │    └─ EmbeddingCache — LRU in-memory cache
                └─ GeminiSearchService — Optional AI search via Google Gemini
    • OramaStore manages three Orama instances: one for document metadata/content, one for child chunks with vector embeddings, and one for parent chunks (context sections). All are persisted to binary files on disk and restored on startup.
    • IntelligentChunker implements the Parent-Child Chunking pattern: documents are first split into large parent chunks that preserve full context (sections, paragraphs), then each parent is further split into small child chunks for precise vector search. At query time, results are deduplicated by parent so that the LLM receives both the matched fragment and the broader context.
    • EmbeddingProvider lazily loads a Transformers.js model for local inference — no API calls needed.
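
    The parent-child split can be illustrated with a toy TypeScript sketch, assuming paragraphs as parents and sentences as children (the real IntelligentChunker is format-aware and size-bounded; all names below are illustrative):

    ```typescript
    interface Chunk { id: number; parentId: number | null; text: string; }

    // Toy parent-child chunking: paragraphs become context-preserving
    // parent chunks; each parent is split into sentence-sized children,
    // which are what would actually be embedded for vector search.
    function chunkDocument(text: string): { parents: Chunk[]; children: Chunk[] } {
      const parents: Chunk[] = [];
      const children: Chunk[] = [];
      let nextId = 0;
      for (const para of text.split(/\n\s*\n/).map(p => p.trim()).filter(Boolean)) {
        const parent: Chunk = { id: nextId++, parentId: null, text: para };
        parents.push(parent);
        for (const sentence of para.split(/(?<=[.!?])\s+/).filter(Boolean)) {
          children.push({ id: nextId++, parentId: parent.id, text: sentence });
        }
      }
      return { parents, children };
    }
    ```

    At query time, matching a child and then returning its parent's text is what lets a small, precise embedding hit carry its full surrounding context back to the LLM.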

    Development

    bash
    git clone https://github.com/andrea9293/mcp-documentation-server.git
    cd mcp-documentation-server
    npm install
    bash
    npm run dev       # FastMCP dev mode with hot reload
    npm run build     # TypeScript compilation
    npm run inspect   # FastMCP web UI for interactive tool testing
    npm start         # Direct tsx execution (MCP server + web UI)
    npm run web       # Run only the web UI (development)
    npm run web:build # Run only the web UI (compiled)

    Contributing

    1. Fork the repository

    2. Create a feature branch: git checkout -b feature/name

    3. Follow Conventional Commits for messages

    4. Open a pull request

    License

    MIT — see LICENSE

    Support

    • 📖 Documentation
    • 🐛 Report Issues
    • 💬 MCP Community
    • 🤖 Google AI Studio — get a Gemini API key

    ---

    Star History

    Star History Chart

    Built with FastMCP, Orama, and TypeScript

