
MetaSearchMCP

Open-source metasearch backend, MCP server, and AI search API for LLM agents: a Python FastAPI search gateway with Google search via SerpBase and Serper, multi-engine aggregation, structured JSON output, provider fallback, deduplication, and a SearXNG-alternative architecture for agent workflows.

Stars: 25 · Forks: 1 · Updated: Apr 20, 2026 · Validated: Apr 22, 2026

MetaSearchMCP

Open-source metasearch backend for MCP, AI agents, and LLM workflows.

MetaSearchMCP aggregates results from multiple search providers, normalizes them into a stable JSON schema, and exposes both an HTTP API and an MCP server for agent tooling.

Positioning

  • MCP-first metasearch backend
  • Structured search API for AI pipelines
  • Multi-provider search orchestration with deduplication and fallback
  • Python FastAPI alternative to browser-first metasearch projects

Why It Exists

Most search aggregators are designed around browser UX: HTML pages, pagination, and interactive result cards. Agents and LLM workflows need a different contract: predictable JSON, stable field names, partial-failure tolerance, and provider-level execution metadata.

MetaSearchMCP is built for that machine-consumable workflow. It is not a SearXNG clone. The design is centered on search orchestration, normalized contracts, and MCP integration.
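
That contract can be sketched as a small typed model. This is an illustrative sketch only: the field names mirror the Response Schema section, but the class itself is hypothetical, not the project's actual code.

```python
from dataclasses import dataclass, field
from typing import Any, Optional

@dataclass
class SearchResult:
    """One normalized result item; field names mirror the Response Schema section."""
    title: str
    url: str
    snippet: str
    source: str
    rank: int
    provider: str
    published_date: Optional[str] = None
    extra: dict[str, Any] = field(default_factory=dict)

# Every provider's raw hit is mapped into this one shape before aggregation,
# so agents can rely on the same keys regardless of which engine answered.
hit = SearchResult(
    title="Tokio - An asynchronous Rust runtime",
    url="https://tokio.rs",
    snippet="Tokio is an event-driven, non-blocking I/O platform...",
    source="tokio.rs",
    rank=1,
    provider="duckduckgo",
)
```

A stable shape like this is what makes partial failure tolerable: a missing provider changes how many items appear, never what each item looks like.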

Core Features

  • Concurrent multi-provider aggregation
  • Unified result schema for web, academic, developer, and knowledge sources
  • Provider-level timeout isolation and partial-failure handling
  • Result deduplication across engines
  • Provider selection by explicit names or semantic tags such as web, academic, code, and google
  • Final result caps for agent-friendly payload sizing
  • HTTP API with OpenAPI docs
  • MCP server over stdio for Claude Desktop, Cline, Continue, and similar clients
  • Configurable provider allowlist via environment variables
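
The timeout-isolation idea above can be sketched with asyncio: run every provider concurrently and let a slow provider time out on its own without failing the whole aggregation. This is a minimal illustration with made-up provider stubs, not the project's actual orchestrator.

```python
import asyncio

async def call_provider(name: str, delay: float) -> list[str]:
    # Stand-in for a real provider request; `delay` simulates network latency.
    await asyncio.sleep(delay)
    return [f"{name}-result"]

async def aggregate(providers: dict[str, float], timeout: float) -> tuple[list[str], list[str]]:
    """Run all providers concurrently; a slow provider times out alone
    instead of failing the whole response (partial-failure tolerance)."""
    async def guarded(name: str, delay: float):
        try:
            return name, await asyncio.wait_for(call_provider(name, delay), timeout)
        except asyncio.TimeoutError:
            return name, None

    results, errors = [], []
    for name, hits in await asyncio.gather(*(guarded(n, d) for n, d in providers.items())):
        if hits is None:
            errors.append(name)   # reported in the response's errors/providers metadata
        else:
            results.extend(hits)
    return results, errors

# The fast provider answers; the slow one is dropped after the per-call timeout.
results, errors = asyncio.run(aggregate({"fast": 0.01, "slow": 1.0}, timeout=0.1))
```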

Google Support

Google is intentionally not scraped directly in this project.

In practice, Google's anti-bot and risk-control systems make self-hosted scraping brittle and expensive to maintain. For a backend intended for reliable MCP and AI workloads, hosted Google providers are the more practical option.

Currently supported Google providers:

  Provider      Env var           Notes
  serpbase.dev  SERPBASE_API_KEY  Pay-per-use; typically cheaper for low-volume usage
  serper.dev    SERPER_API_KEY    Includes a free tier, then pay-per-use

Both are inexpensive; for smaller or occasional workloads, serpbase.dev is usually the cheaper choice.

Supported Providers

Google

  Provider  Name             Method
  SerpBase  google_serpbase  Hosted Google SERP API
  Serper    google_serper    Hosted Google SERP API

Web Search

  Provider    Name        Method
  DuckDuckGo  duckduckgo  HTML scraping
  Bing        bing        RSS feed
  Yahoo       yahoo       HTML scraping, best effort
  Brave       brave       Official Search API
  Mwmbl       mwmbl       Public JSON API
  Ecosia      ecosia      HTML scraping
  Mojeek      mojeek      HTML scraping
  Startpage   startpage   HTML scraping, best effort
  Qwant       qwant       Internal JSON API, best effort
  Yandex      yandex      HTML scraping, best effort
  Baidu       baidu       JSON endpoint, best effort

Knowledge And Reference

  Provider          Name              Method
  Wikipedia         wikipedia         MediaWiki API
  Wikidata          wikidata          Wikidata API
  Internet Archive  internet_archive  Advanced Search API
  Open Library      openlibrary       Open Library search API

Developer Sources

  Provider        Name           Method
  GitHub          github         GitHub REST API
  GitLab          gitlab         GitLab REST API
  Stack Overflow  stackoverflow  Stack Exchange API
  Hacker News     hackernews     Algolia HN API
  Reddit          reddit         Reddit API
  npm             npm            npm registry API
  PyPI            pypi           HTML scraping
  RubyGems        rubygems       RubyGems search API
  crates.io       crates         crates.io API
  lib.rs          lib_rs         HTML scraping
  Docker Hub      dockerhub      Docker Hub search API
  pkg.go.dev      pkg_go_dev     HTML scraping
  MetaCPAN        metacpan       MetaCPAN REST API

Academic Sources

  Provider          Name             Method
  arXiv             arxiv            Atom API
  PubMed            pubmed           NCBI E-utilities
  Semantic Scholar  semanticscholar  Graph API
  CrossRef          crossref         REST API

Finance Sources

  Provider       Name           Key Required           Free Tier
  Yahoo Finance  yahoo_finance  No                     Unofficial endpoint, no key needed
  Alpha Vantage  alpha_vantage  ALPHA_VANTAGE_API_KEY  25 req/day
  Finnhub        finnhub        FINNHUB_API_KEY        60 req/min

Installation

One-command local install:

python scripts/install.py

Install, run tests, and start the HTTP API:

python scripts/install.py --dev --test --run

Deploy with Docker Compose:

python scripts/install.py --mode docker

The installer creates .env from .env.example when .env does not already exist. Existing .env files are kept unless --force-env is passed.

Manual install:

git clone https://github.com/gefsikatsinelou/MetaSearchMCP
cd MetaSearchMCP
pip install -e ".[dev]"

Or with uv:

uv pip install -e ".[dev]"

Configuration

Copy .env.example to .env and configure any providers you want to enable.

cp .env.example .env

Key settings:

HOST=0.0.0.0
PORT=8000
DEFAULT_TIMEOUT=10
AGGREGATOR_TIMEOUT=15

SERPBASE_API_KEY=
SERPER_API_KEY=
BRAVE_API_KEY=
GITHUB_TOKEN=
STACKEXCHANGE_API_KEY=
REDDIT_CLIENT_ID=
REDDIT_CLIENT_SECRET=
NCBI_API_KEY=
SEMANTIC_SCHOLAR_API_KEY=
ALPHA_VANTAGE_API_KEY=
FINNHUB_API_KEY=

ENABLED_PROVIDERS=
ALLOW_UNSTABLE_PROVIDERS=false
MAX_RESULTS_PER_PROVIDER=10

Running

HTTP API

python -m metasearchmcp.server
# or
metasearchmcp

The API starts on http://localhost:8000.

MCP Server

python -m metasearchmcp.broker
# or
metasearchmcp-mcp

The MCP server communicates over stdio.

Docker

docker build -t metasearchmcp .
docker run --rm -p 8000:8000 --env-file .env metasearchmcp

Or with Compose:

docker compose up --build

HTTP API

POST /search

Aggregate across all enabled providers or a selected provider subset.

curl -X POST http://localhost:8000/search \
  -H "Content-Type: application/json" \
  -d '{
    "query": "rust async runtime",
    "providers": ["duckduckgo", "wikipedia"],
    "params": {"num_results": 5, "max_total_results": 8, "language": "en"}
  }'

You can also narrow providers by tags:

curl -X POST http://localhost:8000/search \
  -H "Content-Type: application/json" \
  -d '{
    "query": "transformer attention",
    "tags": ["academic", "knowledge"],
    "params": {"num_results": 5, "max_total_results": 6}
  }'

num_results controls how many results each provider can contribute. max_total_results caps the final merged response after deduplication.
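
A minimal sketch of that merge step, assuming first-seen-wins deduplication on a normalized URL key (the normalize and merge helpers below are hypothetical; the project's own deduplication may normalize differently):

```python
from urllib.parse import urlsplit

def normalize(url: str) -> str:
    # Crude normalization: lowercase the host, drop scheme and trailing slash,
    # so http://tokio.rs and https://tokio.rs/ collapse to the same key.
    parts = urlsplit(url)
    return parts.netloc.lower() + parts.path.rstrip("/")

def merge(per_provider: dict[str, list[str]], max_total_results: int) -> list[str]:
    """Keep the first occurrence of each normalized URL across providers,
    then cap the merged payload at max_total_results."""
    seen, merged = set(), []
    for provider, urls in per_provider.items():
        for url in urls:
            key = normalize(url)
            if key not in seen:
                seen.add(key)
                merged.append(url)
    return merged[:max_total_results]

merged = merge(
    {
        "duckduckgo": ["https://tokio.rs/", "https://docs.rs/tokio"],
        "wikipedia": ["http://tokio.rs"],  # duplicate of the first hit after normalization
    },
    max_total_results=2,
)
```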

POST /search/google

Search Google through a configured hosted provider.

curl -X POST http://localhost:8000/search/google \
  -H "Content-Type: application/json" \
  -d '{"query": "site:github.com rust tokio"}'

GET /providers

Return the currently available provider catalog.

The response includes provider descriptions and a tag-to-provider index for quick discovery.
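
A tag-to-provider index like that can be built by inverting a provider-to-tags mapping; the function and sample data below are hypothetical, shown only to illustrate the shape of the index:

```python
def build_tag_index(providers: dict[str, list[str]]) -> dict[str, list[str]]:
    """Invert provider -> tags into tag -> providers for quick discovery."""
    index: dict[str, list[str]] = {}
    for name, tags in providers.items():
        for tag in tags:
            index.setdefault(tag, []).append(name)
    return index

index = build_tag_index({
    "duckduckgo": ["web"],
    "arxiv": ["academic"],
    "wikipedia": ["knowledge", "web"],
})
```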

You can filter the catalog by tag:

curl "http://localhost:8000/providers?tag=academic&tag=web"

GET /health

Simple health check endpoint. Returns service status, version, provider count, and the current provider name list.

Response Schema

Every aggregated response includes:

  • engine
  • query
  • results
  • related_searches
  • suggestions
  • answer_box
  • timing_ms
  • providers
  • errors

Every result item includes:

  • title
  • url
  • snippet
  • source
  • rank
  • provider
  • published_date
  • extra

Example response:

{
  "engine": "metasearchmcp",
  "query": "rust async runtime",
  "results": [
    {
      "title": "Tokio - An asynchronous Rust runtime",
      "url": "https://tokio.rs",
      "snippet": "Tokio is an event-driven, non-blocking I/O platform...",
      "source": "tokio.rs",
      "rank": 1,
      "provider": "duckduckgo",
      "published_date": null,
      "extra": {}
    }
  ],
  "related_searches": [],
  "suggestions": [],
  "answer_box": null,
  "timing_ms": 843.2,
  "providers": [
    {
      "name": "duckduckgo",
      "success": true,
      "result_count": 10,
      "latency_ms": 840.1,
      "error": null
    }
  ],
  "errors": []
}

MCP Tools

MetaSearchMCP exposes these MCP tools:

  • search_web
  • search_google
  • search_academic
  • search_github
  • compare_engines

search_web also accepts optional tags so agents can limit search to categories such as web, academic, code, or google. All search tools accept max_total_results to keep the final payload compact.

Example Claude Desktop config:

{
  "mcpServers": {
    "MetaSearchMCP": {
      "command": "metasearchmcp-mcp",
      "env": {
        "SERPBASE_API_KEY": "your_key",
        "SERPER_API_KEY": "your_key"
      }
    }
  }
}

Development

pip install -e ".[dev]"
pytest
uvicorn metasearchmcp.server:app --reload

Architecture

The public package is organized around these modules:

  • contracts.py: request and response models
  • catalog.py: provider discovery and selection
  • orchestrator.py: concurrent search execution and response assembly
  • merge.py: URL normalization and deduplication
  • server.py: FastAPI entrypoint
  • broker.py: MCP entrypoint

Legacy module names are kept as compatibility shims for earlier imports.

Roadmap

  • Caching and provider-aware query reuse
  • Better scoring and ranking signals across providers
  • Streaming aggregation responses
  • Provider health telemetry
  • More first-party API integrations where they improve reliability

License

MIT
