
Robot Resources Scraper

Web scraper and token compressor that converts HTML to clean markdown with 70-80% fewer tokens. Single-page compression and multi-page BFS crawling with auto-fallback fetch modes.



@robot-resources/scraper-mcp

MCP server for Scraper — context compression for AI agents.

What is Robot Resources?

Human Resources, but for your AI agents.

Robot Resources gives AI agents two superpowers:

  • Router — Routes each LLM call to the cheapest capable model. 60-90% cost savings across OpenAI, Anthropic, and Google.
  • Scraper — Compresses web pages to clean markdown. 70-80% fewer tokens per page.

Both run locally. Your API keys never leave your machine. Free, unlimited, no tiers.

Install the full suite

npx robot-resources

One command sets up everything. Learn more at robotresources.ai


About this MCP server

This package exposes two tools over the Model Context Protocol for compressing web content into token-efficient markdown: single-page compression and multi-page BFS crawling.

Installation

npx @robot-resources/scraper-mcp

Or install globally:

npm install -g @robot-resources/scraper-mcp

Claude Desktop Configuration

Add to your claude_desktop_config.json:

{
  "mcpServers": {
    "scraper": {
      "command": "npx",
      "args": ["-y", "@robot-resources/scraper-mcp"]
    }
  }
}

Tools

scraper_compress_url

Compress a single web page into markdown with 70-80% fewer tokens.

Parameters:

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| `url` | string | yes | | URL to compress |
| `mode` | string | no | `'auto'` | `'fast'`, `'stealth'`, `'render'`, or `'auto'` |
| `timeout` | number | no | `10000` | Fetch timeout in milliseconds |
| `maxRetries` | number | no | `3` | Max retry attempts (0-10) |

Example prompt: "Compress https://docs.example.com/getting-started"
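Over MCP, a call to this tool arrives as a JSON-RPC `tools/call` request. The argument values below are illustrative, not output from the server:

```json
{
  "method": "tools/call",
  "params": {
    "name": "scraper_compress_url",
    "arguments": {
      "url": "https://docs.example.com/getting-started",
      "mode": "auto",
      "timeout": 10000
    }
  }
}
```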

scraper_crawl_url

Crawl multiple pages from a starting URL using BFS link discovery.

Parameters:

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| `url` | string | yes | | Starting URL to crawl |
| `maxPages` | number | no | `10` | Max pages to crawl (1-100) |
| `maxDepth` | number | no | `2` | Max link depth (0-5) |
| `mode` | string | no | `'auto'` | `'fast'`, `'stealth'`, `'render'`, or `'auto'` |
| `include` | string[] | no | | URL patterns to include (glob) |
| `exclude` | string[] | no | | URL patterns to exclude (glob) |
| `timeout` | number | no | `10000` | Per-page timeout in milliseconds |

Example prompt: "Crawl the docs at https://docs.example.com with max 20 pages"
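A crawl request with include/exclude filtering might look like the following `tools/call` payload (the URL and glob patterns are illustrative):

```json
{
  "method": "tools/call",
  "params": {
    "name": "scraper_crawl_url",
    "arguments": {
      "url": "https://docs.example.com",
      "maxPages": 20,
      "maxDepth": 2,
      "include": ["https://docs.example.com/**"],
      "exclude": ["**/changelog/**"]
    }
  }
}
```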

Fetch Modes

| Mode | How | Use when |
|------|-----|----------|
| `'fast'` | Plain HTTP | Default sites, APIs, docs |
| `'stealth'` | TLS fingerprint impersonation | Anti-bot protected sites |
| `'render'` | Headless browser (Playwright) | JS-rendered SPAs |
| `'auto'` | Fast → stealth fallback on 403/challenge | Unknown sites (default) |

The `'stealth'` mode requires impit, and `'render'` requires playwright; both are peer dependencies of @robot-resources/scraper.
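The `'auto'` fallback can be sketched roughly as follows. This is an illustrative TypeScript sketch, not the package's actual code; `fetchFast`, `fetchStealth`, and the challenge heuristic are hypothetical stand-ins:

```typescript
// Illustrative sketch of 'auto' mode: try a plain HTTP fetch first,
// then retry with the stealth fetcher if the site responds with a
// 403 or an anti-bot challenge page.
type FetchResult = { status: number; body: string };

// Hypothetical fetcher signature standing in for the real
// 'fast' and 'stealth' implementations.
type Fetcher = (url: string) => Promise<FetchResult>;

function looksLikeChallenge(res: FetchResult): boolean {
  // 403s and common challenge-page markers trigger the fallback.
  return res.status === 403 || /just a moment|captcha/i.test(res.body);
}

async function fetchAuto(
  url: string,
  fetchFast: Fetcher,
  fetchStealth: Fetcher,
): Promise<FetchResult> {
  const fast = await fetchFast(url);
  if (!looksLikeChallenge(fast)) return fast;
  return fetchStealth(url); // fast → stealth fallback
}
```

The point of the heuristic is that anti-bot systems usually signal via status code or a recognizable interstitial page, so the fast path only pays the stealth cost when it actually fails.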

Requirements

  • Node.js 18+

License

MIT
