
qurio

Self-hosted RAG engine for AI coding assistants. Ingests technical docs & code repositories locally with structure-aware chunking. Serves grounded context via MCP to prevent hallucinations in software development workflows.



The Open Source Knowledge Engine for AI Agents
Built for localhost. Grounded in truth.


📖 About

Qurio is a self-hosted, open-source ingestion and retrieval engine that functions as a local Shared Library for AI coding assistants (like Gemini-CLI, Claude Code, Cursor, Windsurf, or custom scripts).

Unlike cloud-based RAG solutions that introduce latency and privacy risks, Qurio runs locally to ingest your handpicked heterogeneous documentation (web crawls, PDFs, Markdown) and serves it directly to your agents via the Model Context Protocol (MCP). This ensures your AI writes better code faster using only the context you trust.

Qurio features a custom structural chunker that respects code blocks, API definitions, and config files, so code is never split mid-block and its syntax stays intact.
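To illustrate the idea behind structure-aware chunking (this is a minimal sketch, not Qurio's actual chunker), a splitter can treat each fenced code block as an atomic unit so code is never cut mid-block:

```python
import re

FENCE = "`" * 3  # Markdown code fence delimiter

def chunk_markdown(text, max_chars=800):
    """Split Markdown into chunks, keeping each fenced code block whole.
    Illustrative only -- Qurio's real chunker is more sophisticated."""
    # Capturing group keeps the fenced blocks in the split result.
    segments = re.split(f"({FENCE}.*?{FENCE})", text, flags=re.DOTALL)
    units = []
    for seg in segments:
        if seg.startswith(FENCE):
            units.append(seg)  # atomic: never split a code block
        else:
            units.extend(p for p in seg.split("\n\n") if p.strip())
    # Greedily pack units into chunks of at most max_chars.
    chunks, current = [], ""
    for unit in units:
        if current and len(current) + len(unit) > max_chars:
            chunks.append(current)
            current = unit
        else:
            current = f"{current}\n\n{unit}" if current else unit
    if current:
        chunks.append(current)
    return chunks

doc = f"# API\n\nIntro paragraph.\n\n{FENCE}python\nprint('hi')\n{FENCE}\n\nMore text."
for chunk in chunk_markdown(doc, max_chars=40):
    print("--- chunk ---")
    print(chunk)
```

Note how the code block survives intact even though it exceeds the remaining space in the first chunk; a naive fixed-size splitter would have cut it in half.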

Why Qurio?

  • Privacy First: Your data stays on your machine (localhost).
  • Precision: Retrieves grounded "truth" to prevent AI hallucinations.
  • Speed: Deploys in minutes with docker-compose.
  • Open Standards: Built on MCP, Weaviate, and PostgreSQL.

✨ Key Features

  • 🌐 Universal Ingestion: Crawl documentation sites or upload files (PDF, DOCX, MD).
  • 🧠 Hybrid Search: Configurable blend of BM25 keyword search and vector embeddings for high-recall retrieval.
  • 🎯 Configurable Reranking: Integrate Jina AI or Cohere for precision tuning.
  • 🔌 Native MCP Support: Exposes a standard JSON-RPC 2.0 endpoint for seamless integration with AI coding assistants.
  • 🕸️ Smart Crawling: Recursive web crawling with depth control, regex exclusions, robots.txt compliance, and support for sitemaps, llms.txt, and llms-full.txt.
  • 📄 OCR Pipeline: Automatically extracts text from scanned PDFs and images via Docling.
  • 🖥️ Admin Dashboard: Manage sources, view ingestion status, and debug queries via a clean Vue.js interface.

🏗️ Architecture

Qurio is built as a set of microservices orchestrated by Docker Compose:

  • Backend (Go): Core orchestration, API, and MCP server.
  • Frontend (Vue.js): User interface for managing sources and settings.
  • Ingestion Worker (Python): Async ingestion engine handling crawling (crawl4ai) and parsing (docling).
  • Vector Store (Weaviate): Stores embeddings and handles hybrid search.
  • Database (PostgreSQL): Stores metadata, job status, and configuration.
  • Queue (NSQ): Manages asynchronous ingestion tasks.
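The service topology above maps naturally onto a Compose file. The sketch below is illustrative only — service names, build paths, image tags, and ports are assumptions; the repository ships its own docker-compose.yml:

```yaml
services:
  backend:        # Go: core API + MCP server
    build: ./backend
    ports: ["8081:8081"]
    depends_on: [postgres, weaviate, nsqd]
  frontend:       # Vue.js admin dashboard
    build: ./frontend
    ports: ["3000:3000"]
  worker:         # Python: crawling (crawl4ai) + parsing (docling)
    build: ./worker
    depends_on: [nsqd, weaviate]
  weaviate:       # vector store + hybrid search
    image: semitechnologies/weaviate
  postgres:       # metadata, job status, configuration
    image: postgres:16
  nsqd:           # async ingestion queue
    image: nsqio/nsq
    command: /nsqd
```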

🚀 Getting Started

Prerequisites

Docker and Docker Compose installed — the entire stack runs in containers.
Installation

  1. Clone the repository:

    git clone https://github.com/irahardianto/qurio.git
    cd qurio
    
  2. Configure Environment: Copy the example environment file and add your API key.

    cp .env.example .env
    
  3. Start the System:

    docker-compose up -d
    

    Wait a minute for all services (Weaviate, Postgres) to initialize.

  4. Access the Dashboard: Open http://localhost:3000 in your browser.

  5. Add API Keys: Open http://localhost:3000/settings in the dashboard and add your Gemini API key, plus a Jina AI or Cohere key if you want reranking (optional).

Configuration

Configuration is managed via the Settings page in the UI or environment variables.

Variable          Description                                          Default
GEMINI_API_KEY    Key for Google Gemini (embeddings)                   Required
RERANK_PROVIDER   Rerank provider: none, jina, or cohere               none
RERANK_API_KEY    API key for the selected rerank provider             -
SEARCH_ALPHA      Hybrid search balance (0.0 = keyword, 1.0 = vector)  0.5
SEARCH_TOP_K      Maximum number of results to return                  5
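To build intuition for SEARCH_ALPHA: hybrid search blends normalized keyword (BM25) and vector-similarity scores. The sketch below shows the concept as a simple weighted sum — Weaviate's actual fusion algorithm is more involved, so treat this as an approximation:

```python
def hybrid_score(keyword_score, vector_score, alpha=0.5):
    """Blend normalized BM25 and vector scores.
    alpha=0.0 -> pure keyword; alpha=1.0 -> pure vector.
    Conceptual simplification of hybrid-search fusion."""
    return (1 - alpha) * keyword_score + alpha * vector_score

# A document with a strong keyword match but weak semantic similarity:
print(hybrid_score(0.9, 0.2, alpha=0.0))            # keyword only -> 0.9
print(hybrid_score(0.9, 0.2, alpha=1.0))            # vector only  -> 0.2
print(round(hybrid_score(0.9, 0.2, alpha=0.5), 2))  # balanced     -> 0.55
```

Lowering alpha favors exact-term matches (useful for API names and error strings); raising it favors semantic similarity (useful for conceptual questions).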

💡 Usage

[!TIP] Unlock the full potential of your Agent
Check out the Agent Prompting Guide for best practices, workflow examples, and system prompt templates (CLAUDE.md, GEMINI.md) to paste into your project.

1. Add Data Sources

Navigate to the Admin Dashboard (http://localhost:3000) and click "Add Source".

  • Web Crawl: Enter a documentation URL (e.g., https://docs.docker.com). Configure depth and exclusion patterns.
  • File Upload: Drag and drop PDFs or Markdown files.

2. Connect Your AI Agent (MCP)

Configure your MCP-enabled editor (like Cursor/Gemini CLI) to connect to Qurio.

Add the following to your MCP settings:

{
  "mcpServers": {
    "qurio": {
      "httpUrl": "http://localhost:8081/mcp"
    }
  }
}

Note: Qurio uses a stateless, streamable HTTP transport at http://localhost:8081/mcp. Use a client that supports native HTTP MCP connections.
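Under the hood the endpoint speaks JSON-RPC 2.0. As a sketch, a tools/call request for qurio_search might look like the following — the payload shape follows the MCP specification, but the argument name ("query") is an assumption, and actually sending it requires a running Qurio instance:

```python
import json

def build_search_request(query, request_id=1):
    """Build an MCP tools/call request (JSON-RPC 2.0) for qurio_search.
    The "query" argument name is an assumption for illustration."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "qurio_search",
            "arguments": {"query": query},
        },
    }

payload = build_search_request("docker compose healthcheck")
print(json.dumps(payload, indent=2))

# With the stack running, you could POST it to the MCP endpoint:
# import urllib.request
# req = urllib.request.Request(
#     "http://localhost:8081/mcp",
#     data=json.dumps(payload).encode(),
#     headers={"Content-Type": "application/json"},
# )
# print(urllib.request.urlopen(req).read().decode())
```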

3. Query

Ask your AI agent a question. It will now have access to the documentation you indexed!

"How do I configure a healthcheck in Docker Compose?"

4. Available Tools

Once connected, your agent will have access to the following tools:

Tool                 Description
qurio_search         Search your knowledge base. Supports hybrid search (keywords + vectors). Use this to find relevant documentation or code examples.
qurio_list_sources   List all available data sources. Useful to see what documentation is currently indexed.
qurio_list_pages     List pages within a source. Helpful for exploring the structure of a documentation site.
qurio_read_page      Read a full page. Retrieves the complete content of a specific document or web page found via search or listing.
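A typical agent session chains these tools: discover what is indexed, drill into a source, search, then read the relevant page in full. Sketched as a sequence of tool calls — the argument names ("source_id", "page_id", "query") are illustrative assumptions, not the documented schema:

```python
# Illustrative tool-call sequence an agent might issue against Qurio.
# Argument names here are assumptions for the sake of the example.
workflow = [
    ("qurio_list_sources", {}),                        # what is indexed?
    ("qurio_list_pages",   {"source_id": "docs-1"}),   # explore a source
    ("qurio_search",       {"query": "healthcheck"}),  # targeted search
    ("qurio_read_page",    {"page_id": "page-42"}),    # full page content
]
for name, args in workflow:
    print(f"tools/call -> {name} {args}")
```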

🗺️ Roadmap

  • Rework crawler & embedder parallelization
  • Migrate to Streamable HTTP
  • Support multiple models beyond Gemini
  • Support more granular, section-by-section page retrieval

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.


Built with ❤️ for the Developer Community
