
MolmoWeb MCP Server

Enables autonomous web automation through Playwright browser control integrated with MolmoWeb's vision model for pixel-level action prediction. Supports end-to-end task execution via an orchestrator LLM that decomposes natural language instructions into browser actions.

Updated Apr 5, 2026

molmoweb-mcp

MCP server that exposes MolmoWeb web automation as tools for Claude (or any MCP client). Uses Playwright for browser control.

Architecture

Claude / MCP Client
  ↓ stdio (MCP protocol)
molmoweb-mcp (this server)
  ↓                    ↓
Playwright browser   MolmoWeb API (localhost:8001)

Tools

Tool                      Description
molmoweb_check_status     Health check for the MolmoWeb backend
browser_navigate          Open a URL in the Playwright browser
browser_screenshot        Capture a JPEG screenshot (returns a base64 image)
browser_get_page_info     Get the current URL and page title
browser_execute_action    Execute click/type/scroll/press_key/hover/navigate/wait
molmoweb_predict          Ask the MolmoWeb vision model what action to perform
run_web_task              Full autonomous agent loop (orchestrator + MolmoWeb + execution)
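As an illustration of how the action tool might be driven, the sketch below builds input objects for browser_execute_action. The field names (type, x, y, text) are assumptions for illustration, not the server's published schema; only the action types come from the table above.

```javascript
// Hypothetical input builder for browser_execute_action.
// Field names are illustrative assumptions, not the server's actual schema;
// the action types are the ones listed in the tools table.
function makeAction(type, params = {}) {
  const allowed = ["click", "type", "scroll", "press_key", "hover", "navigate", "wait"];
  if (!allowed.includes(type)) {
    throw new Error(`Unsupported action type: ${type}`);
  }
  return { type, ...params };
}

// Example: click at pixel coordinates predicted by the vision model.
const click = makeAction("click", { x: 412, y: 318 });

// Example: type text into the currently focused element.
const typing = makeAction("type", { text: "AI news" });
```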

Setup

npm install
npx playwright install chromium

Start the MolmoWeb backend

The MolmoWeb vision model must be running at http://127.0.0.1:8001. On Windows with WSL:

# Using the provided script:
run_molmoweb.bat
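To verify the backend is reachable before wiring up the MCP client, a check along these lines can be used. The "/health" endpoint path is an assumption for illustration; adjust it to whatever route your MolmoWeb backend actually serves.

```javascript
// Minimal reachability check for the MolmoWeb backend.
// The "/health" path is an assumption; substitute your backend's real route.
const MOLMOWEB_BASE_URL = "http://127.0.0.1:8001";

function healthUrl(base = MOLMOWEB_BASE_URL) {
  return new URL("/health", base).toString();
}

async function checkMolmoWebStatus() {
  try {
    const res = await fetch(healthUrl());
    return res.ok; // true when the backend answers with a 2xx status
  } catch {
    return false; // backend not running or not reachable
  }
}
```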

Configure in Claude Code

Add to your ~/.mcp.json (global) or project .mcp.json:

{
  "mcpServers": {
    "molmoweb": {
      "command": "node",
      "args": ["/path/to/molmoweb-mcp/server.js"]
    }
  }
}

Run standalone

npm start

Orchestrator LLM Support

The run_web_task tool uses an LLM orchestrator to decompose tasks into step-by-step browser actions. Supported providers:

  • OpenAI: gpt-4o, gpt-4o-mini, gpt-4-turbo, gpt-4
  • Anthropic: claude-opus-4-6, claude-sonnet-4-6, claude-haiku-4-5-20251001
  • Custom: Any OpenAI-compatible endpoint (e.g., Ollama)
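Because all three providers can be reached through the OpenAI-compatible chat-completions format, an orchestrator request can be sketched as below. The system prompt wording is illustrative, not the server's actual prompt.

```javascript
// Sketch: build an OpenAI-compatible chat-completions payload for the
// orchestrator. The system prompt text is an illustrative assumption.
function buildOrchestratorRequest(task, model = "gpt-4o-mini") {
  return {
    model,
    messages: [
      {
        role: "system",
        content: "Decompose the user's web task into one atomic browser instruction per step.",
      },
      { role: "user", content: task },
    ],
  };
}
```

For a custom provider such as Ollama, the same payload is POSTed to that provider's OpenAI-compatible chat-completions endpoint instead of OpenAI's.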

How It Works

  1. User provides a high-level task (e.g., "Search Google for AI news")
  2. The orchestrator LLM decomposes it into atomic browser instructions
  3. MolmoWeb vision model translates each instruction into pixel-level actions
  4. Playwright executes the actions in a visible Chromium browser
  5. Loop repeats until the task is complete or max steps reached
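The loop above can be sketched as follows, with the orchestrator, vision model, and Playwright executor injected as functions so the control flow is clear. The function names and the convention that the planner returns null when the task is complete are assumptions, not the server's internal API.

```javascript
// Sketch of the run_web_task loop. plan/predict/execute stand in for the
// orchestrator LLM, the MolmoWeb vision model, and Playwright; their names
// and the null-means-done convention are illustrative assumptions.
async function runWebTask(task, { plan, predict, execute }, maxSteps = 10) {
  const history = [];
  for (let step = 0; step < maxSteps; step++) {
    // Steps 1-2: the orchestrator proposes the next atomic instruction,
    // or signals completion.
    const instruction = await plan(task, history);
    if (instruction === null) return { done: true, steps: history.length };
    // Step 3: the vision model grounds the instruction to a pixel-level action.
    const action = await predict(instruction);
    // Step 4: Playwright executes the action in the browser.
    await execute(action);
    history.push({ instruction, action });
  }
  // Step 5: max steps reached without the orchestrator declaring success.
  return { done: false, steps: history.length };
}
```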

License

MIT
