Scrappey MCP Server
A Model Context Protocol (MCP) server for interacting with Scrappey.com's web automation and scraping capabilities. Try it out directly at smithery.ai/server/@pim97/mcp-server-scrappey.
Overview
This MCP server provides a bridge between AI models and Scrappey's web automation platform, allowing you to:
- Create and manage browser sessions
- Send HTTP requests through Scrappey's infrastructure
- Execute browser actions (clicking, typing, scrolling, etc.)
- Handle various anti-bot protections automatically (Cloudflare, Datadome, Kasada, etc.)
- Solve captchas automatically (Turnstile, reCAPTCHA, hCaptcha, etc.)
- Take screenshots and record videos
- Intercept network requests
Setup
Installation
npm install
npm run build
Configuration
- Get your Scrappey API key from Scrappey.com
- Set up your environment variable:
SCRAPPEY_API_KEY=your_api_key_here
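A startup check for the key can catch a missing or placeholder value early. The sketch below is illustrative only: `readApiKey` and its error message are hypothetical names, not taken from the server's source, and the environment is passed in as a plain object so the function stays easy to test.

```typescript
// Hypothetical helper (not part of the server): validates the API key
// before any request is attempted.
type Env = Record<string, string | undefined>;

function readApiKey(env: Env): string {
  const key = env["SCRAPPEY_API_KEY"]?.trim();
  // Reject both a missing value and the placeholder from the example above.
  if (!key || key === "your_api_key_here") {
    throw new Error("SCRAPPEY_API_KEY is not set; add it to your environment");
  }
  return key;
}
```

In a Node process this would typically be called as `readApiKey(process.env)`.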
Claude Desktop Configuration
Add to your claude_desktop_config.json:
{
"mcpServers": {
"scrappey": {
"command": "node",
"args": ["path/to/dist/scrappey-mcp.js"],
"env": {
"SCRAPPEY_API_KEY": "your_api_key_here"
}
}
}
}
Available Tools
1. Create Session (scrappey_create_session)
Creates a new browser session that persists cookies and other state.
{
"proxy": "http://user:pass@ip:port",
"proxyCountry": "UnitedStates",
"premiumProxy": true,
"mobileProxy": false,
"browser": [{"name": "firefox", "minVersion": 120, "maxVersion": 130}],
"userAgent": "custom-user-agent"
}
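If you build these arguments in code, a typed shape helps catch mistakes such as an inverted browser version range. This is a minimal sketch: the field names mirror the JSON example above, while the interface names, `createSessionArgs`, and the range check are assumptions added for illustration.

```typescript
// Field names follow the example above; type and function names are hypothetical.
interface BrowserSpec {
  name: string;
  minVersion?: number;
  maxVersion?: number;
}

interface CreateSessionArgs {
  proxy?: string;
  proxyCountry?: string;
  premiumProxy?: boolean;
  mobileProxy?: boolean;
  browser?: BrowserSpec[];
  userAgent?: string;
}

function createSessionArgs(args: CreateSessionArgs = {}): CreateSessionArgs {
  for (const b of args.browser ?? []) {
    // Assumed sanity check: a range with min > max can never match a browser.
    if (b.minVersion !== undefined && b.maxVersion !== undefined && b.minVersion > b.maxVersion) {
      throw new Error(`browser ${b.name}: minVersion exceeds maxVersion`);
    }
  }
  return args;
}
```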
2. Destroy Session (scrappey_destroy_session)
Properly closes a browser session to free resources.
{
"session": "session_id_here"
}
3. List Sessions (scrappey_list_sessions)
List all active sessions for the current user.
{}
Response:
{
"sessions": [{"session": "abc123", "lastAccessed": 1234567890}],
"open": 1,
"limit": 100
}
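The `open` and `limit` fields make it straightforward to check headroom before creating another session. The following sketch assumes the response has exactly the shape shown above; the helper names are hypothetical.

```typescript
// Shape copied from the example response above.
interface SessionInfo {
  session: string;
  lastAccessed: number;
}

interface ListSessionsResponse {
  sessions: SessionInfo[];
  open: number;
  limit: number;
}

// How many more sessions can be opened before hitting the limit.
function remainingSessions(res: ListSessionsResponse): number {
  return Math.max(0, res.limit - res.open);
}

// ID of the session with the smallest lastAccessed value, a candidate for cleanup.
function leastRecentlyUsed(res: ListSessionsResponse): string | undefined {
  let oldest: SessionInfo | undefined;
  for (const s of res.sessions) {
    if (oldest === undefined || s.lastAccessed < oldest.lastAccessed) {
      oldest = s;
    }
  }
  return oldest?.session;
}
```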
4. Check Session Active (scrappey_session_active)
Check if a specific session is currently active.
{
"session": "session_id_here"
}
5. Send Request (scrappey_request)
Sends an HTTP request through Scrappey's infrastructure, with optional antibot bypass, captcha solving, screenshots, and content extraction.
{
"cmd": "request.get",
"url": "https://example.com",
"session": "session_id_here",
"postData": {"key": "value"},
"customHeaders": {"User-Agent": "custom-agent"},
"cookies": "session=abc123",
"proxyCountry": "Germany",
"premiumProxy": true,
"cloudflareBypass": true,
"datadomeBypass": true,
"automaticallySolveCaptchas": true,
"alwaysLoad": ["recaptcha", "hcaptcha"],
"screenshot": true,
"cssSelector": ".product-title",
"innerText": true,
"includeLinks": true,
"includeImages": true,
"interceptFetchRequest": "https://api.example.com/data",
"abortOnDetection": ["analytics.com", "tracking.js"],
"whitelistedDomains": ["example.com"],
"blockCookieBanners": true
}
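Most calls need only a few of these fields. As a rough sketch, a small builder can default `cmd` to `request.get` and reject URLs without a scheme; the option names below are a subset of the example above, and `buildGetRequest` itself is a hypothetical helper.

```typescript
// A subset of the options shown above; extend as needed.
interface RequestOptions {
  session?: string;
  cssSelector?: string;
  innerText?: boolean;
  premiumProxy?: boolean;
  cloudflareBypass?: boolean;
  automaticallySolveCaptchas?: boolean;
  screenshot?: boolean;
}

function buildGetRequest(url: string, options: RequestOptions = {}) {
  // Assumed check: the examples in this README always use absolute http(s) URLs.
  if (!/^https?:\/\//i.test(url)) {
    throw new Error(`expected an absolute http(s) URL, got: ${url}`);
  }
  return { cmd: "request.get", url, ...options };
}
```

For example, `buildGetRequest("https://example.com", { cssSelector: ".product-title", innerText: true })` yields arguments for extracting one element's text.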
6. Browser Actions (scrappey_browser_action)
Execute browser automation actions.
{
"session": "session_id_here",
"url": "https://example.com",
"cmd": "request.get",
"browserActions": [
{"type": "wait_for_selector", "cssSelector": "#login-form"},
{"type": "type", "cssSelector": "#username", "text": "myuser"},
{"type": "type", "cssSelector": "#password", "text": "mypassword"},
{"type": "solve_captcha", "captcha": "turnstile"},
{"type": "click", "cssSelector": "#submit", "waitForSelector": ".dashboard"},
{"type": "execute_js", "code": "document.querySelector('.user-data').innerText"}
],
"mouseMovements": true
}
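The action list above follows a common login pattern: wait for the form, fill it, optionally solve a captcha, then click and wait for the next page. One way to generate it is sketched below; the action shapes are taken from the example, while `LoginSelectors` and `loginActions` are hypothetical names.

```typescript
// Action shapes copied from the example above (only the ones used here).
type BrowserAction =
  | { type: "wait_for_selector"; cssSelector: string }
  | { type: "type"; cssSelector: string; text: string }
  | { type: "solve_captcha"; captcha: string }
  | { type: "click"; cssSelector: string; waitForSelector?: string };

interface LoginSelectors {
  form: string;
  username: string;
  password: string;
  submit: string;
  success: string; // element expected after a successful login
}

function loginActions(sel: LoginSelectors, user: string, pass: string, captcha?: string): BrowserAction[] {
  const actions: BrowserAction[] = [
    { type: "wait_for_selector", cssSelector: sel.form },
    { type: "type", cssSelector: sel.username, text: user },
    { type: "type", cssSelector: sel.password, text: pass },
  ];
  // The captcha step is inserted only when a captcha type is given.
  if (captcha !== undefined) {
    actions.push({ type: "solve_captcha", captcha });
  }
  actions.push({ type: "click", cssSelector: sel.submit, waitForSelector: sel.success });
  return actions;
}
```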
Supported Browser Action Types:
| Action | Description |
|---|---|
| click | Click on an element |
| type | Type text into an input field |
| goto | Navigate to a URL |
| wait | Wait for specified milliseconds |
| wait_for_selector | Wait for an element to appear |
| wait_for_function | Wait for JavaScript condition to be true |
| wait_for_load_state | Wait for page load state (domcontentloaded, networkidle, load) |
| wait_for_cookie | Wait for a cookie to be set |
| execute_js | Execute JavaScript code |
| scroll | Scroll to element or page bottom |
| hover | Hover over an element |
| keyboard | Press keyboard keys (enter, tab, etc.) |
| dropdown | Select option from dropdown |
| switch_iframe | Switch to an iframe |
| set_viewport | Change browser viewport size |
| if | Conditional action execution |
| while | Loop actions while condition is true |
| solve_captcha | Solve various captcha types |
| remove_iframes | Remove all iframes from page |
Supported Captcha Types:
- turnstile - Cloudflare Turnstile
- recaptcha / recaptchav2 / recaptchav3 - Google reCAPTCHA
- hcaptcha / hcaptcha_inside / hcaptcha_enterprise_inside - hCaptcha
- funcaptcha - FunCaptcha/Arkose Labs
- perimeterx - PerimeterX
- mtcaptcha - MTCaptcha
- custom - Custom image captcha
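Because the captcha type is a plain string, a typo would only surface at request time. A possible guard, assuming the identifiers listed above are the complete set, is sketched here; `isCaptchaType` and `solveCaptchaAction` are hypothetical helpers.

```typescript
// Identifiers copied from the list above.
const CAPTCHA_TYPES = [
  "turnstile",
  "recaptcha", "recaptchav2", "recaptchav3",
  "hcaptcha", "hcaptcha_inside", "hcaptcha_enterprise_inside",
  "funcaptcha",
  "perimeterx",
  "mtcaptcha",
  "custom",
] as const;

type CaptchaType = (typeof CAPTCHA_TYPES)[number];

function isCaptchaType(value: string): value is CaptchaType {
  return (CAPTCHA_TYPES as readonly string[]).indexOf(value) !== -1;
}

// Builds a solve_captcha browser action, rejecting unknown captcha names.
function solveCaptchaAction(captcha: string): { type: "solve_captcha"; captcha: CaptchaType } {
  if (!isCaptchaType(captcha)) {
    throw new Error(`unsupported captcha type: ${captcha}`);
  }
  return { type: "solve_captcha", captcha };
}
```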
7. Screenshot (scrappey_screenshot)
Take a screenshot of a webpage.
{
"url": "https://example.com",
"session": "optional_session_id",
"screenshotWidth": 1920,
"screenshotHeight": 1080,
"fullPage": true,
"browserActions": [
{"type": "wait", "wait": 2000}
],
"premiumProxy": true
}
Antibot Bypass
The server automatically handles various protection systems:
- Cloudflare - Bot Management, Turnstile, Challenge pages
- Datadome - Advanced bot detection
- PerimeterX - Behavioral analysis
- Kasada - Fingerprinting and challenges
- Akamai - Bot Manager
- Incapsula - Imperva security
Enable specific bypasses:
{
"cloudflareBypass": true,
"datadomeBypass": true,
"kasadaBypass": true
}
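When the protection in front of a site is known, the flags can be derived from a short list of vendor names. The sketch below maps only the three flags documented above; flag names for the other vendors are not given in this README, so they are deliberately left out. `bypassFlags` is a hypothetical helper.

```typescript
// Only the flags shown above are mapped.
const BYPASS_FLAGS = {
  cloudflare: "cloudflareBypass",
  datadome: "datadomeBypass",
  kasada: "kasadaBypass",
} as const;

type Protection = keyof typeof BYPASS_FLAGS;
type BypassFlags = Partial<Record<(typeof BYPASS_FLAGS)[Protection], true>>;

function bypassFlags(protections: Protection[]): BypassFlags {
  const flags: BypassFlags = {};
  for (const p of protections) {
    flags[BYPASS_FLAGS[p]] = true;
  }
  return flags;
}
```

The result can be spread into the arguments of a request, for example `{ url, cmd: "request.get", ...bypassFlags(["cloudflare"]) }`.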
Proxy Options
{
"proxy": "http://user:pass@ip:port",
"proxyCountry": "UnitedStates",
"premiumProxy": true,
"mobileProxy": true,
"noProxy": false
}
Supported Countries: UnitedStates, UnitedKingdom, Germany, France, and many more.
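A quick local check of proxy settings can save a round trip. The sketch below rests on two assumptions that this README does not state: that a custom `proxy` value follows the `scheme://user:pass@host:port` form from the example with an http or https scheme, and that `noProxy` is contradictory when combined with any other proxy setting. `proxyProblems` is a hypothetical helper.

```typescript
interface ProxyOptions {
  proxy?: string;
  proxyCountry?: string;
  premiumProxy?: boolean;
  mobileProxy?: boolean;
  noProxy?: boolean;
}

// Assumed format: http(s)://[user:pass@]host:port, based on the example above.
const PROXY_URL = /^https?:\/\/(?:[^:@\/\s]+:[^@\/\s]+@)?[^:@\/\s]+:\d{1,5}$/;

// Returns a list of human-readable problems; an empty list means no issue found.
function proxyProblems(o: ProxyOptions): string[] {
  const problems: string[] = [];
  if (o.proxy !== undefined && !PROXY_URL.test(o.proxy)) {
    problems.push("proxy should look like http://user:pass@host:port");
  }
  // Assumption: noProxy cannot be combined with other proxy settings.
  if (o.noProxy && (o.proxy !== undefined || o.premiumProxy || o.mobileProxy)) {
    problems.push("noProxy conflicts with the other proxy settings");
  }
  return problems;
}
```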
Error Codes
The server provides detailed error information:
| Code | Description |
|---|---|
| CODE-0001 | Server capacity full, try again |
| CODE-0002 | Cloudflare blocked |
| CODE-0007 | Turnstile/Proxy error |
| CODE-0010 | Datadome proxy blocked |
| CODE-0024 | Proxy timeout |
| CODE-0029 | Too many sessions open |
| CODE-0032 | Turnstile captcha failed |
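Client code can look up these codes to produce a clearer message or to decide whether a retry makes sense. The sketch below assumes the code appears somewhere in the error text as `CODE-` followed by four digits; the exact error format is not documented here. Only CODE-0001 is treated as retryable, since its description says to try again. Both function names are hypothetical.

```typescript
// Descriptions copied from the table above.
const ERROR_CODES: { [code: string]: string | undefined } = {
  "CODE-0001": "Server capacity full, try again",
  "CODE-0002": "Cloudflare blocked",
  "CODE-0007": "Turnstile/Proxy error",
  "CODE-0010": "Datadome proxy blocked",
  "CODE-0024": "Proxy timeout",
  "CODE-0029": "Too many sessions open",
  "CODE-0032": "Turnstile captcha failed",
};

// Finds the first known error code in a message, if any.
function describeError(message: string): { code: string; description: string } | undefined {
  const match = /CODE-\d{4}/.exec(message);
  if (match === null) return undefined;
  const code = match[0];
  const description = ERROR_CODES[code];
  return description === undefined ? undefined : { code, description };
}

// Assumption: only the capacity error is worth retrying unchanged.
function shouldRetry(message: string): boolean {
  const known = describeError(message);
  return known !== undefined && known.code === "CODE-0001";
}
```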
Typical Workflow
- Create a session:
{"name": "scrappey_create_session"}
- Navigate and interact:
{
"name": "scrappey_browser_action",
"session": "returned_session_id",
"url": "https://example.com/login",
"cmd": "request.get",
"browserActions": [
{"type": "type", "cssSelector": "#username", "text": "myuser"},
{"type": "type", "cssSelector": "#password", "text": "mypass"},
{"type": "click", "cssSelector": "#login-btn", "waitForSelector": ".dashboard"}
]
}
- Extract data:
{
"name": "scrappey_request",
"cmd": "request.get",
"url": "https://example.com/data",
"session": "returned_session_id",
"cssSelector": ".product-list"
}
- Clean up:
{
"name": "scrappey_destroy_session",
"session": "returned_session_id"
}
Best Practices
- Reuse sessions for related requests to maintain state
- Destroy sessions when done to free resources
- Use premium proxies for better success rates on protected sites
- Enable automatic captcha solving for sites with challenges
- Use appropriate wait times between actions for human-like behavior
- Monitor open sessions with scrappey_list_sessions to stay under the session limit and avoid CODE-0029 (too many sessions open)
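The advice about wait times can be applied mechanically by interleaving `wait` actions between the real ones. This is a minimal sketch: the `{"type": "wait", "wait": ms}` shape matches the screenshot example earlier in this README, while `withPauses`, its parameters, and the uniform random delay are assumptions chosen for illustration.

```typescript
type Action = { type: string; [key: string]: unknown };

// Inserts a wait of minMs..maxMs milliseconds between consecutive actions.
// `random` is injectable so the output can be made deterministic in tests.
function withPauses(
  actions: Action[],
  minMs: number,
  maxMs: number,
  random: () => number = Math.random,
): Action[] {
  if (minMs < 0 || maxMs < minMs) {
    throw new Error("expected 0 <= minMs <= maxMs");
  }
  const out: Action[] = [];
  actions.forEach((action, i) => {
    if (i > 0) {
      const wait = Math.round(minMs + random() * (maxMs - minMs));
      out.push({ type: "wait", wait });
    }
    out.push(action);
  });
  return out;
}
```

Passing the result as `browserActions` keeps the original order of actions and adds no wait before the first or after the last one.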
Deployment
Smithery Deployment
# Build
npm run build
# Deploy via Smithery CLI
npx @anthropic/smithery-cli deploy
Docker
docker build -t scrappey-mcp .
docker run -e SCRAPPEY_API_KEY=your_key scrappey-mcp
License
MIT License