ComputeGauge MCP

Provides cost intelligence and a reputation scoring system to help AI agents optimize spending through smart model selection and local-to-cloud routing. It enables real-time cost tracking and rewards agents for making efficient, high-credibility decisions across various LLM providers.

Updated
Feb 17, 2026

@computegauge/mcp

The cost intelligence + agent credibility layer for AI agents. Install once, and every session becomes cost-aware and credibility-building.

npm install -g @computegauge/mcp

Why This Exists

AI agents are powerful but expensive. A single Claude Code session can cost $5-50+. A multi-agent workflow can burn through $100 in minutes. And nobody tells the agent — or the human — until the bill arrives.

ComputeGauge MCP makes agents cost-aware AND gives them a credibility score. Any MCP-compatible agent (Claude, Cursor, Windsurf, custom agents) gets:

  1. Cost Intelligence — Know which model is optimal for every task, track spend in real-time
  2. Agent Credibility — Build a reputation score (0-1000) by making smart decisions. Compete on a leaderboard.
  3. Local→Cloud Routing — Detect when local inference isn't good enough, route to cloud, earn credibility for smart routing

The result: agents that spend 40-70% less, build visible credibility, and know when to route to cloud.


Setup — 30 Seconds

Claude Desktop / Claude Code

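Add to ~/.config/claude/claude_desktop_config.json (on macOS, Claude Desktop reads ~/Library/Application Support/Claude/claude_desktop_config.json instead):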

{
  "mcpServers": {
    "computegauge": {
      "command": "npx",
      "args": ["-y", "@computegauge/mcp"]
    }
  }
}

That's it. Restart Claude. Every conversation now has cost intelligence + credibility tracking.
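If you would rather script the edit than paste JSON by hand, a minimal sketch might look like the following. The helper name add_computegauge_server is hypothetical and not part of the package; the caller supplies the config path, and the entry written is exactly the one shown above.

```python
import json
from pathlib import Path


def add_computegauge_server(config_path: Path) -> dict:
    """Merge the ComputeGauge entry into an MCP client config file.

    Existing servers and unrelated top-level keys are preserved.
    (Hypothetical helper for illustration; not shipped with the package.)
    """
    config = {}
    if config_path.exists():
        text = config_path.read_text(encoding="utf-8").strip()
        if text:
            config = json.loads(text)

    # Create the "mcpServers" object if the file does not have one yet.
    servers = config.setdefault("mcpServers", {})
    servers["computegauge"] = {
        "command": "npx",
        "args": ["-y", "@computegauge/mcp"],
    }

    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

Calling add_computegauge_server(Path.home() / ".config/claude/claude_desktop_config.json") would then leave any other configured servers untouched.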

With Provider API Keys (Enhanced)

{
  "mcpServers": {
    "computegauge": {
      "command": "npx",
      "args": ["-y", "@computegauge/mcp"],
      "env": {
        "ANTHROPIC_API_KEY": "sk-ant-...",
        "OPENAI_API_KEY": "sk-...",
        "COMPUTEGAUGE_BUDGET_TOTAL": "50"
      }
    }
  }
}

With Local Inference (Ollama, vLLM, etc.)

{
  "mcpServers": {
    "computegauge": {
      "command": "npx",
      "args": ["-y", "@computegauge/mcp"],
      "env": {
        "OLLAMA_HOST": "http://localhost:11434",
        "OLLAMA_MODELS": "llama3.3:70b,qwen2.5:7b,deepseek-r1:14b",
        "ANTHROPIC_API_KEY": "sk-ant-...",
        "COMPUTEGAUGE_BUDGET_TOTAL": "50"
      }
    }
  }
}

Cursor

Add this entry under the "mcpServers" key in Cursor's MCP settings (mcp.json):

{
  "computegauge": {
    "command": "npx",
    "args": ["-y", "@computegauge/mcp"]
  }
}

Tools Reference

Agent-Native Tools (use automatically every session)

| Tool | When to Call | What It Does | Credibility |
| --- | --- | --- | --- |
| pick_model | Before any API request | Returns the optimal model for a task | +8 Routing Intelligence |
| log_request | After any API request | Logs the request cost | +3 Honest Reporting |
| session_cost | Every 5-10 requests | Shows cumulative cost and budget | |
| rate_recommendation | After completing a task | Rate how well the model performed | +5 Quality Contribution |
| model_ratings | When curious about quality | View model quality leaderboard | |
| improvement_cycle | At session end | Run continuous improvement engine | +15 Quality Contribution |
| integrity_report | For transparency | View rating acceptance/rejection stats | |

Credibility Tools (the reputation protocol)

| Tool | When to Call | What It Does | Credibility |
| --- | --- | --- | --- |
| credibility_profile | Anytime | View your 0-1000 credibility score, tier, badges | |
| credibility_leaderboard | To compete | See how you rank vs other agents | |
| route_to_cloud | After local→cloud routing | Report smart routing decision | +70 Cloud Routing |
| assess_routing | Before choosing local vs cloud | Should this task stay local? | |
| cluster_status | To check local capabilities | View local endpoints, models, hardware | |

Intelligence Tools (for user questions)

| Tool | Description |
| --- | --- |
| get_spend_summary | User's total AI spend across all providers |
| get_budget_status | Budget utilization and alerts |
| get_model_pricing | Current pricing for any model |
| get_cost_comparison | Compare costs for specific workloads |
| suggest_savings | Actionable cost optimization recommendations |
| get_usage_trend | Spend trends and anomaly detection |
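MCP clients normally issue these calls for you, but it can help to see what goes over the wire. The sketch below builds the standard MCP tools/call request (a JSON-RPC 2.0 message). The helper name tool_call_request is hypothetical, and the argument names passed to pick_model are assumptions for illustration; the server's actual input schema is not documented here.

```python
import json
from itertools import count

# Monotonically increasing JSON-RPC request ids.
_request_ids = count(1)


def tool_call_request(name: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }
    return json.dumps(message)


# Argument names here are assumed, not taken from the server's schema.
request = tool_call_request(
    "pick_model", {"task_type": "code_generation", "priority": "balanced"}
)
```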

Resources

| Resource | URI | Description |
| --- | --- | --- |
| Config | computegauge://config | Current server configuration |
| Session | computegauge://session | Real-time session cost data |
| Ratings | computegauge://ratings | Model quality leaderboard |
| Credibility | computegauge://credibility | Agent credibility profile + leaderboard |
| Cluster | computegauge://cluster | Local inference cluster status |
| Quickstart | computegauge://quickstart | Agent onboarding guide |

Prompts

| Prompt | Description |
| --- | --- |
| cost_aware_system | System prompt that makes any agent cost-aware + credibility-building |
| daily_cost_report | Generate a quick daily cost report |
| optimize_workflow | Analyze and optimize a described AI workflow |

Agent Credibility System

Every smart decision earns credibility points on a 0-1000 scale:

| Category | How to Earn | Points |
| --- | --- | --- |
| 🧠 Routing Intelligence | Using pick_model wisely, avoiding overspec | +8 to +15 per event |
| 💰 Cost Efficiency | Staying under budget, significant savings | +5 to +30 per event |
| ✅ Task Success | Completing tasks successfully | +10 to +25 per event |
| 📊 Honest Reporting | Logging requests, reporting failures honestly | +3 to +10 per event |
| ☁️ Cloud Routing | Smart local→cloud routing via ComputeGauge | +25 to +70 per event |
| ⭐ Quality Contribution | Rating models, running improvement cycles | +5 to +15 per event |
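As a rough illustration of how such a ledger could behave, the sketch below applies one event to a running score. The point ranges come from the table above; the category keys, the range check, and the clamp at 1000 are assumptions, since the server's real scoring rules are not documented here.

```python
# Per-event point ranges from the table above. Keys are hypothetical identifiers.
POINT_RANGES = {
    "routing_intelligence": (8, 15),
    "cost_efficiency": (5, 30),
    "task_success": (10, 25),
    "honest_reporting": (3, 10),
    "cloud_routing": (25, 70),
    "quality_contribution": (5, 15),
}
MAX_SCORE = 1000  # top of the 0-1000 scale


def apply_event(score: int, category: str, points: int) -> int:
    """Return the new score after one credibility event.

    Assumption: awards outside the documented range are rejected, and the
    total is capped at MAX_SCORE.
    """
    low, high = POINT_RANGES[category]
    if not low <= points <= high:
        raise ValueError(f"{category} awards {low}-{high} points, got {points}")
    return min(MAX_SCORE, score + points)
```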

Credibility Tiers

| Tier | Score | What It Means |
| --- | --- | --- |
| ⚪ Unrated | 0-99 | Just getting started |
| 🥉 Bronze | 100-299 | Learning the ropes |
| 🥈 Silver | 300-499 | Competent and cost-aware |
| 🥇 Gold | 500-699 | Skilled optimizer |
| 💎 Platinum | 700-849 | Elite decision-maker |
| 👑 Diamond | 850-1000 | Best in class |
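The tier boundaries above translate directly into a lookup. This is a small sketch of that mapping; the function name tier_for is hypothetical, and only the score ranges are taken from the table.

```python
from bisect import bisect_right

# Lowest score of each tier, in ascending order, with matching names.
TIER_FLOORS = [0, 100, 300, 500, 700, 850]
TIER_NAMES = ["Unrated", "Bronze", "Silver", "Gold", "Platinum", "Diamond"]


def tier_for(score: int) -> str:
    """Map a 0-1000 credibility score to its tier name."""
    if not 0 <= score <= 1000:
        raise ValueError("score must be between 0 and 1000")
    # bisect_right counts how many floors are <= score; the last one is the tier.
    return TIER_NAMES[bisect_right(TIER_FLOORS, score) - 1]
```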

Earnable Badges

| Badge | How to Earn |
| --- | --- |
| 🌱 First Steps | Complete first session |
| 💰 Cost Optimizer | Save >$10 through smart model selection |
| 📊 Transparency Champion | Log 50+ requests accurately |
| ☁️ Smart Router | Successfully route 10+ tasks to cloud |
| ⭐ Quality Pioneer | Submit 25+ model ratings |
| 🔥 Streak Master | 20+ consecutive successful tasks |
| 🥇 Gold Agent | Reach Gold tier (500+ score) |
| 💎 Platinum Agent | Reach Platinum tier (700+ score) |
| 👑 Diamond Agent | Reach Diamond tier (850+ score) |
| 🌐 Hybrid Intelligence | Use both local and cloud models in one session |
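One way to read the badge list is as a set of predicates over an agent's running statistics. The sketch below encodes the thresholds from the table; the AgentStats fields are hypothetical, and qualifiers such as "accurately" and "successfully" are assumed to be checked before the counters are incremented.

```python
from dataclasses import dataclass


@dataclass
class AgentStats:
    """Hypothetical counters; the server's real data model is not documented."""
    sessions_completed: int = 0
    savings_usd: float = 0.0
    requests_logged: int = 0
    cloud_routes: int = 0
    ratings_submitted: int = 0
    success_streak: int = 0
    score: int = 0
    used_local: bool = False  # within the current session
    used_cloud: bool = False  # within the current session


# Thresholds taken from the badge table above.
BADGE_RULES = {
    "First Steps": lambda s: s.sessions_completed >= 1,
    "Cost Optimizer": lambda s: s.savings_usd > 10,
    "Transparency Champion": lambda s: s.requests_logged >= 50,
    "Smart Router": lambda s: s.cloud_routes >= 10,
    "Quality Pioneer": lambda s: s.ratings_submitted >= 25,
    "Streak Master": lambda s: s.success_streak >= 20,
    "Gold Agent": lambda s: s.score >= 500,
    "Platinum Agent": lambda s: s.score >= 700,
    "Diamond Agent": lambda s: s.score >= 850,
    "Hybrid Intelligence": lambda s: s.used_local and s.used_cloud,
}


def earned_badges(stats: AgentStats) -> list:
    """Return the names of all badges whose rule is satisfied, in table order."""
    return [name for name, rule in BADGE_RULES.items() if rule(stats)]
```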

Local Cluster Integration

ComputeGauge auto-detects local inference endpoints:

| Platform | Environment Variable | Default |
| --- | --- | --- |
| Ollama | OLLAMA_HOST | http://localhost:11434 |
| vLLM | VLLM_HOST | |
| llama.cpp | LLAMACPP_HOST | |
| TGI | TGI_HOST | |
| LocalAI | LOCALAI_HOST | |
| Custom | LOCAL_LLM_ENDPOINT | |

Set OLLAMA_MODELS="llama3.3:70b,qwen2.5:7b" (comma-separated) to declare available models.
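To make the variable handling concrete, here is a sketch of how the settings above could be read. It only inspects environment variables and does not probe the network, so it is not a description of the server's actual detection logic; the function names are hypothetical.

```python
import os
from typing import Mapping

# Platform -> environment variable, as listed in the table above.
ENDPOINT_VARS = {
    "Ollama": "OLLAMA_HOST",
    "vLLM": "VLLM_HOST",
    "llama.cpp": "LLAMACPP_HOST",
    "TGI": "TGI_HOST",
    "LocalAI": "LOCALAI_HOST",
    "Custom": "LOCAL_LLM_ENDPOINT",
}
# Only Ollama has a documented default.
DEFAULTS = {"Ollama": "http://localhost:11434"}


def configured_endpoints(env: Mapping[str, str] = os.environ) -> dict:
    """Return {platform: url} for every endpoint that is set or has a default."""
    found = {}
    for platform, var in ENDPOINT_VARS.items():
        url = env.get(var) or DEFAULTS.get(platform)
        if url:
            found[platform] = url
    return found


def declared_models(env: Mapping[str, str] = os.environ) -> list:
    """Split the comma-separated OLLAMA_MODELS value, ignoring blanks."""
    raw = env.get("OLLAMA_MODELS", "")
    return [name.strip() for name in raw.split(",") if name.strip()]
```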

The Local→Cloud Routing Flow

1. Agent calls assess_routing("code_generation", quality="good")
2. ComputeGauge checks: local llama3.3:70b quality for code_generation = 80/100
3. "Good" quality threshold = 78 → Local model is sufficient!
4. Agent uses local model → saves money → earns credibility for honest assessment

OR:

1. Agent calls assess_routing("complex_reasoning", quality="excellent")
2. ComputeGauge checks: local llama3.3:70b quality for complex_reasoning = 78/100
3. "Excellent" quality threshold = 88 → Quality gap of 10 points → Route to cloud!
4. Agent calls pick_model → gets Claude Sonnet 4 → executes → calls route_to_cloud
5. Agent earns +70 credibility points for smart routing decision
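The two walkthroughs reduce to a threshold comparison. The sketch below reproduces them using only the numbers quoted above (thresholds of 78 and 88, local scores of 80 and 78). The RoutingDecision type and the shape of the return value are assumptions, and this local function merely borrows the tool's name; it is not the server's implementation.

```python
from dataclasses import dataclass

# Values quoted in the walkthroughs above; other levels and tasks are unknown here.
QUALITY_THRESHOLDS = {"good": 78, "excellent": 88}
LOCAL_QUALITY = {
    ("llama3.3:70b", "code_generation"): 80,
    ("llama3.3:70b", "complex_reasoning"): 78,
}


@dataclass
class RoutingDecision:
    route: str  # "local" or "cloud"
    gap: int    # points by which the local model falls short (0 if sufficient)


def assess_routing(task_type: str, quality: str,
                   local_model: str = "llama3.3:70b") -> RoutingDecision:
    """Stay local when the local score meets the threshold, else go to cloud."""
    threshold = QUALITY_THRESHOLDS[quality]
    local_score = LOCAL_QUALITY[(local_model, task_type)]
    gap = max(0, threshold - local_score)
    return RoutingDecision("local" if gap == 0 else "cloud", gap)
```

With these values, a "good" code_generation request stays local, while an "excellent" complex_reasoning request is routed to cloud with a gap of 10.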

How pick_model Works

The decision engine scores every model across three dimensions:

  - Quality: per-task-type scores for 14 task types
  - Cost: real pricing from 8 providers and 20+ models, calculated per call (log-scale normalization)
  - Speed: relative inference speed scores

| Priority | Quality | Cost | Speed |
| --- | --- | --- | --- |
| cheapest | 20% | 70% | 10% |
| balanced | 45% | 35% | 20% |
| best_quality | 70% | 10% | 20% |
| fastest | 25% | 15% | 60% |
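A weighted sum is one plausible reading of the table above. In the sketch below the weights are taken from the table, but the exact log-scale formula, the 0-100 scales, and the Candidate fields are assumptions; treat it as an illustration of the idea rather than the engine's real scoring code.

```python
import math
from dataclasses import dataclass

# (quality, cost, speed) weights per priority, from the table above.
WEIGHTS = {
    "cheapest": (0.20, 0.70, 0.10),
    "balanced": (0.45, 0.35, 0.20),
    "best_quality": (0.70, 0.10, 0.20),
    "fastest": (0.25, 0.15, 0.60),
}


@dataclass(frozen=True)
class Candidate:
    name: str
    quality: float        # 0-100 score for the task type
    cost_per_call: float  # USD, must be > 0
    speed: float          # 0-100 relative speed score


def cost_scores(candidates: list) -> dict:
    """Map cost to 0-100 on a log scale: cheapest gets 100, priciest gets 0."""
    logs = [math.log(c.cost_per_call) for c in candidates]
    lo, hi = min(logs), max(logs)
    if hi == lo:  # all candidates cost the same
        return {c.name: 100.0 for c in candidates}
    return {c.name: 100.0 * (hi - value) / (hi - lo)
            for c, value in zip(candidates, logs)}


def pick_model(candidates: list, priority: str = "balanced") -> Candidate:
    """Return the candidate with the highest weighted score."""
    w_quality, w_cost, w_speed = WEIGHTS[priority]
    cost = cost_scores(candidates)
    return max(
        candidates,
        key=lambda c: w_quality * c.quality + w_cost * cost[c.name] + w_speed * c.speed,
    )
```

Normalizing on the logarithm of cost keeps a model that is 100x more expensive from dominating the score purely through scale, which is presumably why the engine uses a log scale.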

Model Coverage

| Provider | Models | Tier Range |
| --- | --- | --- |
| Anthropic | Claude Opus 4, Sonnet 4, Sonnet 3.5, Haiku 3.5 | Frontier → Budget |
| OpenAI | o1, GPT-4o, o3-mini, GPT-4o-mini | Frontier → Budget |
| Google | Gemini 2.0 Pro, 1.5 Pro, 2.0 Flash | Premium → Budget |
| DeepSeek | Reasoner, Chat | Value → Budget |
| Groq | Llama 3.3 70B, Llama 3.1 8B | Value → Budget |
| Together | Llama 3.3 70B Turbo, Qwen 2.5 72B | Value |
| Mistral | Large, Small | Premium → Budget |

Local Models Supported

| Model | Quality (general) | Best For |
| --- | --- | --- |
| llama3.3:70b | 79/100 | General tasks, code |
| qwen2.5:72b | 81/100 | Code, math, translation |
| deepseek-r1:70b | 80/100 | Reasoning, math, code |
| deepseek-r1:14b | 68/100 | Budget reasoning |
| phi3:14b | 60/100 | Simple tasks |
| llama3.1:8b | 58/100 | Classification, simple QA |
| mistral:7b | 58/100 | Simple tasks |

Environment Variables

| Variable | Required | Description |
| --- | --- | --- |
| COMPUTEGAUGE_DASHBOARD_URL | No | URL of ComputeGauge dashboard |
| COMPUTEGAUGE_API_KEY | No | API key for dashboard access |
| COMPUTEGAUGE_BUDGET_TOTAL | No | Session budget limit in USD |
| COMPUTEGAUGE_BUDGET_ANTHROPIC | No | Per-provider monthly budget |
| COMPUTEGAUGE_BUDGET_OPENAI | No | Per-provider monthly budget |
| ANTHROPIC_API_KEY | No | Enables Anthropic provider detection |
| OPENAI_API_KEY | No | Enables OpenAI provider detection |
| GOOGLE_API_KEY | No | Enables Google provider detection |
| OLLAMA_HOST | No | Ollama inference endpoint |
| OLLAMA_MODELS | No | Comma-separated local model names |
| VLLM_HOST | No | vLLM inference endpoint |
| COMPUTEGAUGE_GPU | No | GPU name for hardware detection |
| COMPUTEGAUGE_VRAM_GB | No | VRAM in GB |
| COMPUTEGAUGE_COST_PER_HOUR | No | Amortized hardware cost/hr |

For Agent Developers

If you're building AI agents (via Claude Agent SDK, LangChain, CrewAI, AutoGen, etc.), ComputeGauge MCP is the easiest way to add cost awareness AND agent credibility:

  1. Zero integration effort: just add the MCP server to your agent's config
  2. No code changes: the agent discovers all 18 tools automatically via the MCP protocol
  3. Immediate value: pick_model returns recommendations on the first call, and credibility tracking starts automatically
  4. Session tracking built in: full cost visibility per agent run
  5. Credibility system: your agent earns a reputation score that users can see
  6. Local cluster support: auto-detect and leverage on-prem inference
  7. Budget guardrails: warnings when approaching limits

Pattern: Cost-Aware + Credibility-Building Agent Loop

1. Agent receives task
2. Agent calls assess_routing(task_type) → local or cloud?
3. Agent calls pick_model(task_type, priority="balanced")
4. Agent uses recommended model for the task
5. Agent calls log_request(provider, model, tokens)
6. Agent calls rate_recommendation(model, rating, success)
7. If cloud-routed: agent calls route_to_cloud(task_type, reason, model)
8. Every 5 requests, agent calls session_cost()
9. If session cost > 80% of budget, switch to priority="cheapest"
10. At session end: check credibility_profile()

This pattern reduces costs by 40-70% while building a credibility score that makes users trust the agent more.
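The ten steps above can be sketched as a small driver class. Everything about the tool payloads here is assumed: the argument names follow the step list, but the response fields ("route", "reason", "provider", "model", "total_usd") and the shape of the execute callback are hypothetical. call_tool stands in for whatever function your MCP client exposes for invoking a tool.

```python
from typing import Callable

# (tool name, arguments) -> parsed tool result; supplied by your MCP client.
ToolCaller = Callable[[str, dict], dict]


class CostAwareAgent:
    def __init__(self, call_tool: ToolCaller, budget_usd: float) -> None:
        self.call_tool = call_tool
        self.budget_usd = budget_usd
        self.requests = 0
        self.spent_usd = 0.0

    def priority(self) -> str:
        # Step 9: past 80% of budget, fall back to the cheapest models.
        return "cheapest" if self.spent_usd > 0.8 * self.budget_usd else "balanced"

    def run_task(self, task_type: str, execute: Callable[[str], dict]) -> dict:
        # Steps 2-3: decide local vs cloud, then ask for a model.
        routing = self.call_tool("assess_routing", {"task_type": task_type})
        choice = self.call_tool(
            "pick_model", {"task_type": task_type, "priority": self.priority()}
        )
        # Step 4: the caller runs the model and reports tokens, rating, success.
        outcome = execute(choice["model"])
        # Steps 5-6: log the request and rate the recommendation.
        self.call_tool("log_request", {
            "provider": choice["provider"],
            "model": choice["model"],
            "tokens": outcome["tokens"],
        })
        self.call_tool("rate_recommendation", {
            "model": choice["model"],
            "rating": outcome["rating"],
            "success": outcome["success"],
        })
        # Step 7: report cloud routing when it happened.
        if routing.get("route") == "cloud":
            self.call_tool("route_to_cloud", {
                "task_type": task_type,
                "reason": routing.get("reason", ""),
                "model": choice["model"],
            })
        # Step 8: refresh the session cost every 5 requests.
        self.requests += 1
        if self.requests % 5 == 0:
            self.spent_usd = self.call_tool("session_cost", {})["total_usd"]
        return outcome

    def finish(self) -> dict:
        # Step 10: check the credibility profile at session end.
        return self.call_tool("credibility_profile", {})
```

Because the spend figure is only refreshed every fifth request, the switch to "cheapest" can lag by up to four requests; polling session_cost more often tightens that at the price of extra tool calls.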


License

Apache-2.0 — Free to use, modify, and distribute.
