Overwatch MCP

An MCP server that enables querying logs and metrics from Graylog, Prometheus, and InfluxDB 2.x. It provides tools for executing Lucene log searches, PromQL queries, and Flux queries directly within MCP-compatible clients.

Updated
Jan 28, 2026

Python 3.11+ | License: MIT | Docker CI

MCP server for querying Graylog, Prometheus, and InfluxDB 2.x from Claude Desktop.

Tools

Tool                     What it does
graylog_search           Search logs (Lucene syntax)
graylog_fields           List log fields
prometheus_query         Instant PromQL query
prometheus_query_range   Range PromQL query
prometheus_metrics       List metrics
influxdb_query           Flux query (bucket allowlisted)

Quick Start

One-Line Setup (Docker)

curl -fsSL https://raw.githubusercontent.com/malindarathnayake/Overwatch-mcp/main/compose/setup.sh | bash
cd Overwatch_MCP
# Edit .env and config.yaml with your values
docker compose up -d

Manual Setup (Docker)

# Download compose files
mkdir -p Overwatch_MCP && cd Overwatch_MCP
curl -fsSLO https://raw.githubusercontent.com/malindarathnayake/Overwatch-mcp/main/compose/docker-compose.yml
curl -fsSLO https://raw.githubusercontent.com/malindarathnayake/Overwatch-mcp/main/compose/.env.example
curl -fsSLO https://raw.githubusercontent.com/malindarathnayake/Overwatch-mcp/main/compose/config.example.yaml

# Create config from templates
cp .env.example .env
cp config.example.yaml config.yaml

# Edit .env with your credentials
# Edit config.yaml if needed (adjust allowed_buckets, limits, etc.)

# Run
docker compose up -d

Local Install

pip install -e .
cp .env.example .env
cp config/config.example.yaml config/config.yaml
# Edit both files with your values
python -m overwatch_mcp

Claude Desktop Config

Docker

Add this to claude_desktop_config.json, found in ~/Library/Application Support/Claude/ (macOS) or %APPDATA%\Claude\ (Windows):

{
  "mcpServers": {
    "overwatch": {
      "command": "docker",
      "args": [
        "run", "--rm", "-i",
        "-v", "/path/to/config:/app/config:ro",
        "--env-file", "/path/to/.env",
        "ghcr.io/malindarathnayake/overwatch-mcp:latest"
      ]
    }
  }
}

Local Python

{
  "mcpServers": {
    "overwatch": {
      "command": "python",
      "args": ["-m", "overwatch_mcp"],
      "env": {
        "GRAYLOG_URL": "https://graylog.internal:9000/api",
        "GRAYLOG_TOKEN": "your-token",
        "PROMETHEUS_URL": "http://prometheus.internal:9090",
        "INFLUXDB_URL": "https://influxdb.internal:8086",
        "INFLUXDB_TOKEN": "your-token",
        "INFLUXDB_ORG": "your-org"
      }
    }
  }
}

Windows PowerShell Setup

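One-shot script to configure Claude Desktop on Windows. It overwrites any existing claude_desktop_config.json, so merge the entry by hand if you already have other MCP servers configured: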

# Stop Claude if running
Get-Process -Name "Claude*" -ErrorAction SilentlyContinue | Stop-Process -Force

$config = @'
{
  "mcpServers": {
    "overwatch": {
      "command": "C:/Users/<USERNAME>/AppData/Local/Microsoft/WindowsApps/python3.13.exe",
      "args": ["-m", "overwatch_mcp", "--config", "C:/path/to/Overwatch-mcp/compose/config.yaml"],
      "env": {
        "GRAYLOG_URL": "https://your-graylog-url",
        "GRAYLOG_TOKEN": "<YOUR_GRAYLOG_TOKEN>",
        "PROMETHEUS_URL": "http://your-prometheus-url:9090",
        "INFLUXDB_URL": "https://your-influxdb-url",
        "INFLUXDB_TOKEN": "<YOUR_INFLUXDB_TOKEN>",
        "INFLUXDB_ORG": "<YOUR_INFLUXDB_ORG>",
        "LOG_LEVEL": "debug",
        "LOG_FILE": "C:/path/to/Overwatch-mcp/overwatch.log"
      }
    }
  }
}
'@
[System.IO.File]::WriteAllText("$env:APPDATA\Claude\claude_desktop_config.json", $config)

# Install from source (run from repo root)
cd C:\path\to\Overwatch-mcp
pip install -e .

Note: Replace <USERNAME>, <YOUR_GRAYLOG_TOKEN>, <YOUR_INFLUXDB_TOKEN>, <YOUR_INFLUXDB_ORG>, and paths with your actual values.

Configuration

config.yaml

config.yaml uses ${ENV_VAR} substitution: values are read from the environment at runtime.

server:
  log_level: "info"

datasources:
  graylog:
    enabled: true
    url: "${GRAYLOG_URL}"
    token: "${GRAYLOG_TOKEN}"
    timeout_seconds: 30
    max_time_range_hours: 24
    max_results: 1000
    # Production environments to filter on (auto-builds from known_applications.json)
    production_environments:
      - "prod"
      - "production"
    # Known apps file - auto-builds env filter from discovered data
    known_applications_file: "${GRAYLOG_KNOWN_APPS_FILE:-}"

  prometheus:
    enabled: true
    url: "${PROMETHEUS_URL}"
    timeout_seconds: 30
    max_range_hours: 168

  influxdb:
    enabled: true
    url: "${INFLUXDB_URL}"
    token: "${INFLUXDB_TOKEN}"
    org: "${INFLUXDB_ORG}"
    timeout_seconds: 60
    allowed_buckets:
      - "telegraf"
      - "app_metrics"

cache:
  enabled: true
  default_ttl_seconds: 60

Disable a datasource by setting enabled: false. If some datasources fail their health checks, the server still starts and runs in degraded mode with the remaining ones.
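
To make the ${ENV_VAR} substitution concrete, here is a minimal sketch of how such placeholders could be expanded before the YAML is parsed. The function name expand_env, the regular expression, and the treatment of ${VAR:-default} as a shell-style fallback are all assumptions for illustration; the server's actual loader may behave differently.

```python
import os
import re

# Matches ${NAME} and ${NAME:-default}. This pattern is an assumption,
# not taken from the Overwatch MCP source.
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)(?::-([^}]*))?\}")


def expand_env(text, env=None):
    """Replace ${NAME} and ${NAME:-default} placeholders with environment values."""
    env = os.environ if env is None else env

    def _substitute(match):
        name, default = match.group(1), match.group(2)
        value = env.get(name)
        if default is not None:
            # Shell-style ":-": fall back when the variable is unset or empty.
            return value if value else default
        if value is None:
            raise KeyError(f"environment variable {name} is not set")
        return value

    return _PLACEHOLDER.sub(_substitute, text)


if __name__ == "__main__":
    sample_env = {"GRAYLOG_URL": "https://graylog.internal:9000/api"}
    print(expand_env('url: "${GRAYLOG_URL}"', sample_env))
    print(expand_env('known_applications_file: "${GRAYLOG_KNOWN_APPS_FILE:-}"', sample_env))
```

Failing loudly on a missing variable without a default is a design choice in this sketch: it surfaces a forgotten credential at startup rather than as a confusing upstream error later.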

Tool Parameters

graylog_search

{
  "query": "level:ERROR AND service:api",
  "from_time": "-2h",
  "to_time": "now",
  "limit": 100,
  "fields": ["timestamp", "message", "level"]
}

Time formats: ISO 8601 (2025-01-27T10:00:00Z), relative (-1h, -30m), or the literal now.
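
As an illustration of the three time formats, the sketch below resolves each one to an absolute UTC timestamp and checks a range against a limit such as max_time_range_hours. The names parse_time and within_limit are hypothetical, and the s and d units are assumptions (only h and m appear in the examples above); the server's own parser may accept a different set.

```python
import re
from datetime import datetime, timedelta, timezone
from typing import Optional

# Relative offsets such as -1h or -30m. Units "s" and "d" are assumed.
_RELATIVE = re.compile(r"^-(\d+)([smhd])$")
_UNITS = {"s": "seconds", "m": "minutes", "h": "hours", "d": "days"}


def parse_time(value: str, now: Optional[datetime] = None) -> datetime:
    """Resolve 'now', a relative offset, or an ISO 8601 string to a datetime."""
    now = now or datetime.now(timezone.utc)
    if value == "now":
        return now
    match = _RELATIVE.match(value)
    if match:
        amount, unit = int(match.group(1)), match.group(2)
        return now - timedelta(**{_UNITS[unit]: amount})
    # Normalise a trailing "Z" so fromisoformat also works before Python 3.11.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def within_limit(start: datetime, end: datetime, max_hours: int) -> bool:
    """True if the range fits the limit; a server would otherwise report TIME_RANGE_EXCEEDED."""
    return end - start <= timedelta(hours=max_hours)


if __name__ == "__main__":
    start, end = parse_time("-2h"), parse_time("now")
    print(start.isoformat(), end.isoformat(), within_limit(start, end, 24))
```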

graylog_fields

{
  "pattern": "http_.*",
  "limit": 100
}
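
The pattern parameter is a regular expression, and the error table lists INVALID_PATTERN for a bad one. A rough sketch of that filtering follows; the function name filter_names is hypothetical, and anchoring the match at the start of the name is an assumption about the server's behaviour.

```python
import re


def filter_names(names, pattern=None, limit=100):
    """Return at most `limit` names matching the regular expression `pattern`."""
    if pattern is None:
        return list(names)[:limit]
    try:
        regex = re.compile(pattern)
    except re.error as exc:
        # Mirrors the INVALID_PATTERN error code from the table below.
        raise ValueError(f"INVALID_PATTERN: {exc}") from exc
    # re.match anchors at the start of the string (assumed behaviour).
    return [name for name in names if regex.match(name)][:limit]


if __name__ == "__main__":
    fields = ["http_method", "http_status", "level", "message"]
    print(filter_names(fields, "http_.*"))
```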

prometheus_query

{
  "query": "rate(http_requests_total[5m])",
  "time": "-1h"
}

prometheus_query_range

{
  "query": "up",
  "start": "-6h",
  "end": "now",
  "step": "1m"
}

If step is omitted, it is calculated automatically.
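
The README does not say how the step is derived. One common approach, shown here purely as an assumption, is to aim for a fixed number of data points across the requested range. The function name auto_step_seconds and the target of 250 points are hypothetical.

```python
import math


def auto_step_seconds(start_ts: float, end_ts: float, target_points: int = 250) -> int:
    """Pick a step (in whole seconds) that yields roughly `target_points` samples."""
    span = max(end_ts - start_ts, 0)
    # Never go below one second, even for very short ranges.
    return max(1, math.ceil(span / target_points))


if __name__ == "__main__":
    six_hours = 6 * 3600
    print(auto_step_seconds(0, six_hours))  # step for a -6h to now range
```

Passing an explicit step, as in the example above, avoids depending on whatever heuristic the server applies.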

prometheus_metrics

{
  "pattern": "http_.*",
  "limit": 100
}

influxdb_query

{
  "query": "from(bucket: \"telegraf\") |> range(start: -1h) |> filter(fn: (r) => r._measurement == \"cpu\")",
  "bucket": "telegraf"
}

The bucket must be listed under allowed_buckets in config.yaml.
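
A sketch of what an allowlist check might look like: it validates the bucket parameter and, as an extra assumption, any bucket named in a from(bucket: "...") call inside the Flux query. The helper name check_buckets and the regex-based extraction are illustrative only; the server may validate differently.

```python
import re

# Buckets from the example config.yaml above.
ALLOWED_BUCKETS = frozenset({"telegraf", "app_metrics"})

# Finds bucket names in from(bucket: "name") calls. This is a simple heuristic,
# not a Flux parser.
_FROM_BUCKET = re.compile(r'from\s*\(\s*bucket\s*:\s*"([^"]+)"')


def check_buckets(query: str, bucket: str, allowed=ALLOWED_BUCKETS) -> None:
    """Raise if the bucket parameter or any bucket in the query is not allowlisted."""
    referenced = set(_FROM_BUCKET.findall(query)) | {bucket}
    denied = sorted(referenced - set(allowed))
    if denied:
        raise PermissionError(f"BUCKET_NOT_ALLOWED: {', '.join(denied)}")


if __name__ == "__main__":
    flux = 'from(bucket: "telegraf") |> range(start: -1h)'
    check_buckets(flux, "telegraf")  # passes silently
    print("query accepted")
```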

Error Codes

Code                      Meaning
DATASOURCE_DISABLED       Datasource disabled in config
DATASOURCE_UNAVAILABLE    Failed health check
INVALID_QUERY             Bad query syntax
INVALID_PATTERN           Bad regex
TIME_RANGE_EXCEEDED       Range exceeds max
BUCKET_NOT_ALLOWED        Bucket not in allowlist
UPSTREAM_TIMEOUT          Request timed out
UPSTREAM_CLIENT_ERROR     4xx from datasource
UPSTREAM_SERVER_ERROR     5xx from datasource
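
The two upstream HTTP codes follow directly from the status class of the datasource response. A small sketch of that mapping, with classify_status as a hypothetical helper name:

```python
from typing import Optional


def classify_status(status: int) -> Optional[str]:
    """Map an upstream HTTP status to an error code, or None if it is not an error."""
    if 400 <= status < 500:
        return "UPSTREAM_CLIENT_ERROR"
    if 500 <= status < 600:
        return "UPSTREAM_SERVER_ERROR"
    return None


if __name__ == "__main__":
    for status in (200, 401, 503):
        print(status, classify_status(status))
```

A client handling these codes might retry on UPSTREAM_SERVER_ERROR or UPSTREAM_TIMEOUT but not on UPSTREAM_CLIENT_ERROR, since a 4xx usually points at the query or the token rather than a transient fault.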

Application Discovery

Generate a known applications file to speed up lookups:

# Using environment variables
python scripts/discover_applications.py --env

# Or with explicit credentials
python scripts/discover_applications.py \
  --url https://graylog.example.com \
  --token YOUR_TOKEN \
  --hours 24 \
  --environment "environment:prod" \
  --output known_applications.json

Example output (known_applications.json):

{
  "_metadata": {
    "generated_at": "2025-01-28T10:00:00",
    "identifier_fields_used": ["application", "service", "container_name"]
  },
  "environments": ["prod", "staging", "dev"],
  "applications": [
    {
      "name": "api-gateway",
      "identifier_fields": ["service", "application"],
      "aliases": [],
      "description": "",
      "team": "",
      "enabled": true
    }
  ]
}

Edit the file to:

  • Disable entries you don't need (set enabled: false)
  • Add descriptions and team ownership
  • Add aliases for alternative names

Then set GRAYLOG_KNOWN_APPS_FILE=/path/to/known_applications.json in your environment.
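
After editing, a quick sanity check of the file can catch typos before the server reads it. The sketch below loads the JSON and lists the applications still enabled; the function names are hypothetical, and treating a missing enabled key as true is an assumption.

```python
import json


def load_known_apps(path):
    """Read a known_applications.json file into a dict."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def enabled_names(data):
    """Return the names of applications that are not disabled."""
    return [
        app["name"]
        for app in data.get("applications", [])
        if app.get("enabled", True)  # a missing key is treated as enabled (assumption)
    ]


if __name__ == "__main__":
    import sys

    if len(sys.argv) > 1:
        print(enabled_names(load_known_apps(sys.argv[1])))
```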

Development

# Install with dev deps
pip install -e ".[dev]"

# Tests
pytest tests/ -v

# Coverage
pytest tests/ -v --cov=overwatch_mcp

Project Structure

src/overwatch_mcp/
├── __main__.py        # Entry point
├── server.py          # MCP server
├── config.py          # Config loader
├── cache.py           # TTL cache
├── clients/           # HTTP clients (graylog, prometheus, influxdb)
├── tools/             # MCP tool implementations
└── models/            # Pydantic models

127 tests (89 unit, 38 integration).

Usage Guide

See Docs/usage-guide.md for examples of how to ask questions:

  • Finding errors and investigating issues
  • Searching logs with filters and time ranges
  • Querying metrics and trends
  • Investigation workflows and common patterns

Troubleshooting

Server won't start: Check that config/config.yaml exists and that the required env vars are set.

Datasource unavailable: Verify URL, check token permissions. Server continues with available datasources.

Query errors: Check syntax (Lucene/PromQL/Flux), verify time range within limits, ensure bucket is allowlisted for InfluxDB.

License

MIT
