
Updated Apr 3, 2026

Model Runner

Stop copy-pasting boilerplate every time you need to call a different AI model.

License: MIT

Model Runner is an MCP server that gives any AI assistant a unified interface to run completions, embeddings, image generation, and classification across all major providers. One tool call instead of per-provider API clients.

Quick Start

Add to your mcpServers config:

{
  "mcpServers": {
    "model-runner": {
      "url": "https://your-cloud-run-url/mcp"
    }
  }
}

Or run locally:

npm install
npm start

Before / After

Before: Your assistant wants to classify customer feedback into sentiment buckets. It cannot call the OpenAI API natively, does not know the exact endpoint shape, and cannot handle auth headers.

// 30 lines of fetch boilerplate. Per provider. Per project.
// Auth headers, message array format, error shapes, all different.

After: One tool call:

{
  "tool": "run_classification",
  "arguments": {
    "text": "Waited 40 minutes for support and got no answer",
    "labels": ["positive", "negative", "neutral"],
    "api_key": "sk-..."
  }
}

Output:

{
  "label": "negative",
  "confidence": 0.97,
  "reasoning": "The customer experienced a long wait with no resolution, indicating a clearly negative experience."
}
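Under the hood, an MCP client wraps a tool call like this in a JSON-RPC 2.0 envelope and POSTs it to the server's `/mcp` endpoint. The sketch below shows that shape; `buildToolCall` is a hypothetical helper, not part of Model Runner, and assumes the standard MCP `tools/call` method:

```typescript
// Hypothetical helper: wrap a Model Runner tool call in the JSON-RPC 2.0
// envelope that MCP transports use for the "tools/call" method.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

function buildToolCall(
  name: string,
  args: Record<string, unknown>,
  id = 1,
): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

const request = buildToolCall("run_classification", {
  text: "Waited 40 minutes for support and got no answer",
  labels: ["positive", "negative", "neutral"],
  api_key: "sk-...",
});
// POST `request` as JSON to https://your-cloud-run-url/mcp
```

In practice an MCP-aware assistant builds this envelope for you; the sketch only illustrates what travels over the wire.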

Tools

Tool                    What it does
list_supported_models   Browse the full catalog of models by provider and capability
run_completion          Text generation via OpenAI, Anthropic, Groq, or Mistral
run_embedding           Vector embeddings via OpenAI or Cohere
run_image_generation    Image generation via DALL-E 2 or DALL-E 3
estimate_tokens         Estimate token count before making expensive API calls
run_classification      Zero-shot text classification with confidence scores
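For a sense of what `estimate_tokens` saves you from doing by hand, a common client-side approximation is the ~4 characters per token rule of thumb for English text. This fallback is a sketch only; the server's estimator may use a real tokenizer and return different counts:

```typescript
// Rough client-side fallback: approximate token count with the common
// ~4 characters per token heuristic for English text.
function estimateTokensRough(text: string): number {
  return Math.ceil(text.length / 4);
}

estimateTokensRough("Waited 40 minutes for support and got no answer"); // → 12
```

Useful for quick budget checks before a batch run, but prefer the server's `estimate_tokens` tool when accuracy matters.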

Who is this for?

  • AI assistant builders who want their agent to invoke ML models without hardcoding provider-specific API logic into every project
  • Developers prototyping who need a quick way to compare outputs across OpenAI, Anthropic, Groq, and Mistral without writing multiple API clients
  • Data teams running pipelines who want a single MCP endpoint to classify, embed, or summarize records at scale without managing provider SDKs

Health Check

Both endpoints return the same response and require no authentication:

GET /
GET /health

Response:

{
  "status": "ok",
  "server": "model-runner",
  "version": "1.0.0",
  "tools": 6
}
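Before wiring the server into a pipeline, it can be worth validating that a `/health` response matches the documented shape. A minimal sketch, using only the fields shown above (`validate` and the guard function are hypothetical helpers, not part of Model Runner):

```typescript
// Validate the documented /health payload shape before trusting the server.
interface HealthResponse {
  status: string;
  server: string;
  version: string;
  tools: number;
}

function isHealthy(payload: unknown): payload is HealthResponse {
  const p = payload as Partial<HealthResponse> | null;
  return (
    !!p &&
    p.status === "ok" &&
    typeof p.server === "string" &&
    typeof p.version === "string" &&
    typeof p.tools === "number"
  );
}

isHealthy({ status: "ok", server: "model-runner", version: "1.0.0", tools: 6 }); // → true
isHealthy({ status: "down" }); // → false
```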

Built by

Mastermind HQ - AI tools built for builders.

License

MIT
