ByteBot MCP Server
Production-grade Model Context Protocol (MCP) server for ByteBot's dual-API architecture, providing intelligent hybrid workflow orchestration for autonomous task execution and desktop computer control.
Overview
This MCP server integrates ByteBot's Agent API (task management) and Desktop API (computer control) into a unified interface for AI assistants like Claude. It enables:
- Autonomous Task Execution: Create and manage tasks for ByteBot to execute independently
- Direct Computer Control: Mouse, keyboard, screen capture, and file operations
- Hybrid Workflows: Intelligent orchestration with automatic monitoring and intervention handling
- Real-time Updates: Optional WebSocket support for live task status notifications
Features
Agent API Tools (Task Management)
- bytebot_create_task - Create new tasks with priority levels
- bytebot_list_tasks - List and filter tasks by status/priority
- bytebot_get_task - Get detailed task information with message history
- bytebot_get_in_progress_task - Check currently running task
- bytebot_update_task - Update task status or priority
- bytebot_delete_task - Delete tasks
Desktop API Tools (Computer Control)
Mouse Operations:
- bytebot_move_mouse - Move cursor to coordinates
- bytebot_click - Click with left/right/middle button
- bytebot_drag - Drag from one position to another
- bytebot_scroll - Scroll in any direction
Keyboard Operations:
- bytebot_type_text - Type text strings
- bytebot_paste_text - Paste text (for special characters)
- bytebot_press_keys - Keyboard shortcuts (Ctrl+C, Alt+Tab, etc.)
Screen Operations:
- bytebot_screenshot - Capture screen as base64 PNG
- bytebot_cursor_position - Get current cursor position
File I/O:
- bytebot_read_file - Read file content (base64)
- bytebot_write_file - Write file content (base64)
System:
- bytebot_switch_application - Switch to application
- bytebot_wait - Wait for specified duration
Hybrid Orchestration Tools (Priority 1)
- bytebot_create_and_monitor_task - Create task and wait for completion
- bytebot_monitor_task - Monitor existing task until terminal state
- bytebot_intervene_in_task - Provide help when task needs intervention
- bytebot_execute_workflow - Multi-step workflow with automatic error recovery
Prerequisites
- Node.js: 20.x or higher
- ByteBot Instance: Running and accessible at configured endpoints
  - Agent API (default: http://localhost:9991)
  - Desktop API (default: http://localhost:9990)
Installation
# Clone or download this repository
cd bytebot-mcp-server
# Install dependencies
npm install
# Build TypeScript code
npm run build
Configuration
1. Create Environment File
Copy the example environment file and customize:
cp .env.example .env
2. Edit .env File
# ByteBot Agent API (Task Management)
BYTEBOT_AGENT_URL=http://localhost:9991
# ByteBot Desktop API (Computer Control)
BYTEBOT_DESKTOP_URL=http://localhost:9990
# WebSocket Configuration (Optional)
BYTEBOT_WS_URL=ws://localhost:9991
ENABLE_WEBSOCKET=false
# Server Configuration
MCP_SERVER_NAME=bytebot-mcp
# Timeouts (milliseconds)
REQUEST_TIMEOUT=30000
DESKTOP_ACTION_TIMEOUT=10000
# Retry Configuration
MAX_RETRIES=3
RETRY_DELAY=1000
# Monitoring Configuration
TASK_POLL_INTERVAL=2000
TASK_MONITOR_TIMEOUT=300000
# File Configuration
MAX_FILE_SIZE=10485760
# Logging
LOG_LEVEL=info
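As an illustration of how these settings might be consumed, here is a minimal sketch of a typed loader that applies the defaults listed above. The variable names and default values come from this README; `ServerConfig`, `Env`, `str`, `int`, and `loadConfig` are hypothetical names, and the validation rules are assumptions rather than the server's actual behavior.

```typescript
// Sketch only: variable names and defaults are taken from the .env listing
// above; ServerConfig, Env, and loadConfig are hypothetical names.
interface ServerConfig {
  agentUrl: string;
  desktopUrl: string;
  wsUrl: string;
  enableWebSocket: boolean;
  serverName: string;
  requestTimeoutMs: number;
  desktopActionTimeoutMs: number;
  maxRetries: number;
  retryDelayMs: number;
  taskPollIntervalMs: number;
  taskMonitorTimeoutMs: number;
  maxFileSizeBytes: number;
  logLevel: string;
}

type Env = Record<string, string | undefined>;

// Return the variable's trimmed value, or the fallback when unset or blank.
function str(env: Env, key: string, fallback: string): string {
  const raw = env[key];
  return raw === undefined || raw.trim() === "" ? fallback : raw.trim();
}

// Parse a non-negative integer, rejecting values such as "abc" or "-5".
function int(env: Env, key: string, fallback: number): number {
  const raw = str(env, key, String(fallback));
  const value = Number(raw);
  if (!Number.isInteger(value) || value < 0) {
    throw new Error(`${key} must be a non-negative integer, got "${raw}"`);
  }
  return value;
}

function loadConfig(env: Env): ServerConfig {
  return {
    agentUrl: str(env, "BYTEBOT_AGENT_URL", "http://localhost:9991"),
    desktopUrl: str(env, "BYTEBOT_DESKTOP_URL", "http://localhost:9990"),
    wsUrl: str(env, "BYTEBOT_WS_URL", "ws://localhost:9991"),
    enableWebSocket: str(env, "ENABLE_WEBSOCKET", "false").toLowerCase() === "true",
    serverName: str(env, "MCP_SERVER_NAME", "bytebot-mcp"),
    requestTimeoutMs: int(env, "REQUEST_TIMEOUT", 30000),
    desktopActionTimeoutMs: int(env, "DESKTOP_ACTION_TIMEOUT", 10000),
    maxRetries: int(env, "MAX_RETRIES", 3),
    retryDelayMs: int(env, "RETRY_DELAY", 1000),
    taskPollIntervalMs: int(env, "TASK_POLL_INTERVAL", 2000),
    taskMonitorTimeoutMs: int(env, "TASK_MONITOR_TIMEOUT", 300000),
    maxFileSizeBytes: int(env, "MAX_FILE_SIZE", 10485760),
    logLevel: str(env, "LOG_LEVEL", "info"),
  };
}
```

In a Node.js process this would typically be called as loadConfig(process.env). Failing fast on a malformed number is a design choice of the sketch; silently falling back to the default would be an equally plausible policy.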
3. Remote ByteBot Configuration
If ByteBot is running on a remote server:
BYTEBOT_AGENT_URL=http://your-server.com:9991
BYTEBOT_DESKTOP_URL=http://your-server.com:9990
BYTEBOT_WS_URL=ws://your-server.com:9991
MCP Client Setup
Claude Desktop
Add to your Claude Desktop configuration file:
macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Windows: %APPDATA%\Claude\claude_desktop_config.json
{
"mcpServers": {
"bytebot": {
"command": "node",
"args": ["/absolute/path/to/bytebot-mcp-server/dist/index.js"],
"env": {
"BYTEBOT_AGENT_URL": "http://localhost:9991",
"BYTEBOT_DESKTOP_URL": "http://localhost:9990"
}
}
}
}
Zed Editor
Add to your Zed settings:
{
"context_servers": {
"bytebot": {
"command": {
"path": "node",
"args": ["/absolute/path/to/bytebot-mcp-server/dist/index.js"]
},
"env": {
"BYTEBOT_AGENT_URL": "http://localhost:9991",
"BYTEBOT_DESKTOP_URL": "http://localhost:9990"
}
}
}
}
Continue.dev
Add to .continue/config.json:
{
"mcpServers": [
{
"name": "bytebot",
"command": "node",
"args": ["/absolute/path/to/bytebot-mcp-server/dist/index.js"],
"env": {
"BYTEBOT_AGENT_URL": "http://localhost:9991",
"BYTEBOT_DESKTOP_URL": "http://localhost:9990"
}
}
]
}
Usage Examples
Example 1: Basic Task Creation
User: Create a task for ByteBot to search Wikipedia for "quantum computing"
Claude uses: bytebot_create_task
{
"description": "Go to wikipedia.org and search for 'quantum computing'",
"priority": "MEDIUM"
}
Response:
{
"id": "task-123",
"status": "PENDING",
"priority": "MEDIUM",
"createdAt": "2024-01-15T10:30:00Z"
}
Example 2: Hybrid Workflow (Create → Monitor → Complete)
User: Create a task to log into example.com and wait for it to complete
Claude uses: bytebot_create_and_monitor_task
{
"description": "Navigate to example.com and log in with credentials from keychain",
"timeout": 60000,
"pollInterval": 2000
}
Response:
{
"taskId": "task-456",
"finalStatus": "COMPLETED",
"completedAt": "2024-01-15T10:31:45Z",
"messagesCount": 12,
"task": { ... full task details ... }
}
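The create-then-monitor pattern in this example amounts to a polling loop with a deadline. The following is a minimal sketch of such a loop, not the server's actual implementation: `monitorTask`, `fetchStatus`, `MonitorResult`, and the "TIMEOUT" marker are hypothetical, and the set of statuses that stop polling is partly an assumption (see the comments).

```typescript
// Sketch only: monitorTask, fetchStatus, and MonitorResult are hypothetical
// names; the real server's internals are not shown in this README.

// Statuses at which polling stops. Including NEEDS_HELP follows Example 3,
// where monitoring returns with that status; NEEDS_REVIEW is an assumption.
const STOP_STATUSES = ["COMPLETED", "FAILED", "CANCELLED", "NEEDS_HELP", "NEEDS_REVIEW"];

function shouldStopPolling(status: string): boolean {
  return STOP_STATUSES.indexOf(status) !== -1;
}

interface MonitorResult {
  taskId: string;
  finalStatus: string; // last observed status, or "TIMEOUT" (hypothetical marker)
  polls: number;
}

// fetchStatus is injected so the loop can be exercised without a live ByteBot.
async function monitorTask(
  taskId: string,
  fetchStatus: (id: string) => Promise<string>,
  timeoutMs: number = 300000,
  pollIntervalMs: number = 2000
): Promise<MonitorResult> {
  const deadline = Date.now() + timeoutMs;
  let polls = 0;
  for (;;) {
    const status = await fetchStatus(taskId);
    polls += 1;
    if (shouldStopPolling(status)) {
      return { taskId, finalStatus: status, polls };
    }
    // Give up if the next poll would land past the deadline.
    if (Date.now() + pollIntervalMs > deadline) {
      return { taskId, finalStatus: "TIMEOUT", polls };
    }
    await new Promise<void>((resolve) => setTimeout(resolve, pollIntervalMs));
  }
}
```

The default arguments mirror TASK_MONITOR_TIMEOUT and TASK_POLL_INTERVAL from the configuration section. Injecting the status fetcher keeps the loop independent of any particular HTTP client.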
Example 3: Task Needs Intervention
User: Create a task to fill out a complex form
Claude uses: bytebot_create_and_monitor_task
{
"description": "Fill out the registration form at example.com/register"
}
Response (after monitoring):
{
"taskId": "task-789",
"finalStatus": "NEEDS_HELP",
"task": {
"id": "task-789",
"status": "NEEDS_HELP",
"messages": [
{
"role": "assistant",
"content": "I need the user's phone number to complete this form"
}
]
}
}
User: My phone number is 555-1234
Claude uses: bytebot_intervene_in_task
{
"taskId": "task-789",
"message": "User's phone number is 555-1234",
"action": "resume",
"continueMonitoring": true
}
Response:
{
"taskId": "task-789",
"status": "COMPLETED",
"intervention": "applied"
}
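Because bytebot_intervene_in_task only applies to tasks in the NEEDS_HELP state, a caller may want to check the status before building the arguments. A minimal sketch under that assumption follows; `TaskSnapshot`, `InterventionArgs`, and `buildIntervention` are hypothetical names, and "resume" is the only action value this README shows.

```typescript
// Sketch only: the argument shape mirrors the call above; the helper and
// interface names are hypothetical.
interface TaskSnapshot {
  id: string;
  status: string;
}

interface InterventionArgs {
  taskId: string;
  message: string;
  action: string; // "resume" is the only value shown in this README
  continueMonitoring: boolean;
}

function buildIntervention(
  task: TaskSnapshot,
  message: string,
  continueMonitoring: boolean = true
): InterventionArgs {
  // Refuse early instead of letting the server reject the call.
  if (task.status !== "NEEDS_HELP") {
    throw new Error(`Task ${task.id} is ${task.status}; it must be in NEEDS_HELP state`);
  }
  const trimmed = message.trim();
  if (trimmed.length === 0) {
    throw new Error("Intervention message must not be empty");
  }
  return { taskId: task.id, message: trimmed, action: "resume", continueMonitoring };
}
```

For the task above, buildIntervention({ id: "task-789", status: "NEEDS_HELP" }, "User's phone number is 555-1234") yields the same arguments as the call shown in this example.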
Example 4: Interactive Desktop Control
User: Take a screenshot and click at position (500, 300)
Claude uses: bytebot_screenshot
Response: { "screenshot": "iVBORw0KG..." }
Claude uses: bytebot_click
{
"x": 500,
"y": 300,
"button": "left"
}
Response: ✓ bytebot_click completed successfully
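Click arguments are simple, but validating them on the client side can catch mistakes before a request reaches the Desktop API. The sketch below is illustrative only: `buildClickArgs` and its validation rules (non-negative integer coordinates) are assumptions, while the button names and the "left" default come from the Mouse Buttons list in this README.

```typescript
// Sketch only: the argument shape mirrors the bytebot_click call above;
// buildClickArgs and its validation rules are hypothetical.
type MouseButton = "left" | "right" | "middle";

interface ClickArgs {
  x: number;
  y: number;
  button: MouseButton;
}

const MOUSE_BUTTONS: readonly string[] = ["left", "right", "middle"];

function buildClickArgs(x: number, y: number, button: string = "left"): ClickArgs {
  // Assumption: screen coordinates are non-negative whole pixels.
  if (!Number.isInteger(x) || !Number.isInteger(y) || x < 0 || y < 0) {
    throw new RangeError(`Invalid coordinates (${x}, ${y})`);
  }
  if (MOUSE_BUTTONS.indexOf(button) === -1) {
    throw new TypeError(`Unknown mouse button: ${button}`);
  }
  return { x, y, button: button as MouseButton };
}
```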
Example 5: Multi-Step Workflow
User: Execute a workflow to open Firefox, navigate to GitHub, and take a screenshot
Claude uses: bytebot_execute_workflow
{
"steps": [
{
"name": "Open Firefox",
"description": "Switch to Firefox browser application"
},
{
"name": "Navigate to GitHub",
"description": "Navigate to github.com in the browser"
},
{
"name": "Take Screenshot",
"description": "Capture a screenshot of the GitHub homepage"
}
],
"priority": "HIGH"
}
Response:
{
"steps": [
{ "name": "Open Firefox", "taskId": "task-001", "status": "COMPLETED" },
{ "name": "Navigate to GitHub", "taskId": "task-002", "status": "COMPLETED" },
{ "name": "Take Screenshot", "taskId": "task-003", "status": "COMPLETED" }
],
"overallStatus": "completed",
"totalInterventions": 0
}
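To make the response shape concrete, here is a minimal sketch of how per-step results could be folded into the summary fields shown above. It is not the server's implementation: `summarizeWorkflow`, the optional per-step `interventions` counter, and the "incomplete" value are hypothetical; only "completed" appears in this README.

```typescript
// Sketch only: field names mirror the response above; summarizeWorkflow, the
// per-step "interventions" counter, and "incomplete" are hypothetical.
interface StepResult {
  name: string;
  taskId: string;
  status: string;
  interventions?: number;
}

interface WorkflowSummary {
  steps: StepResult[];
  overallStatus: "completed" | "incomplete";
  totalInterventions: number;
}

function summarizeWorkflow(steps: StepResult[]): WorkflowSummary {
  // A workflow counts as completed only if it has steps and all of them finished.
  const allCompleted = steps.length > 0 && steps.every((s) => s.status === "COMPLETED");
  const totalInterventions = steps.reduce((sum, s) => sum + (s.interventions ?? 0), 0);
  return {
    steps,
    overallStatus: allCompleted ? "completed" : "incomplete",
    totalInterventions,
  };
}
```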
Example 6: File Operations
User: Read the contents of /home/user/data.txt
Claude uses: bytebot_read_file
{
"path": "/home/user/data.txt"
}
Response: { "content": "SGVsbG8gV29ybGQh..." } // Base64 encoded
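Since file content travels as base64, a client has to decode it after reading and encode it before writing. A minimal sketch using the Node.js Buffer global follows; `decodeFileContent` and `encodeFileContent` are hypothetical helper names, and treating the content as UTF-8 text is an assumption that only holds for text files.

```typescript
// Sketch only: helper names are hypothetical. Buffer is a Node.js global.

// Decode base64 file content into a UTF-8 string (text files only).
function decodeFileContent(base64: string): string {
  return Buffer.from(base64, "base64").toString("utf8");
}

// Encode UTF-8 text as base64, e.g. for the content argument of bytebot_write_file.
function encodeFileContent(text: string): string {
  return Buffer.from(text, "utf8").toString("base64");
}
```

For example, encodeFileContent("Hello World!") returns "SGVsbG8gV29ybGQh", and decodeFileContent reverses it. Binary files should stay as a Buffer instead of being converted to a string.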
Troubleshooting
Error: "Cannot connect to ByteBot server"
Cause: ByteBot is not running or endpoint URL is incorrect
Solution:
- Verify ByteBot is running: curl http://localhost:9991/tasks
- Check .env file has correct URLs
- Ensure no firewall blocking connections
Error: "Request to ByteBot timed out"
Cause: Task took longer than configured timeout
Solution:
- Increase REQUEST_TIMEOUT in .env for Agent API calls
- Increase DESKTOP_ACTION_TIMEOUT for Desktop API calls
- Use bytebot_create_and_monitor_task with custom timeout: { "description": "Long running task", "timeout": 600000 }
Error: "Task with ID xyz not found"
Cause: Task was deleted or ID is incorrect
Solution:
- List all tasks: bytebot_list_tasks
- Verify task ID from response
- Check if task was accidentally deleted
Warning: "Screenshot size is 8.5MB"
Cause: Screenshot is very large (high resolution display)
Solution:
- This is only a warning; the screenshot still works
- Consider reducing screen resolution if you capture screenshots frequently
- Screenshots larger than 5MB trigger this warning
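The size behind this warning can be estimated from the base64 string without decoding it, since every four base64 characters carry three bytes. The sketch below assumes standard padded base64 and interprets 5MB as 5 × 1024 × 1024 bytes; `decodedBase64Size` and `screenshotWarning` are hypothetical names, and the message format simply imitates the warning text above.

```typescript
// Sketch only: helper names are hypothetical, and reading "5MB" as
// 5 * 1024 * 1024 bytes is an assumption.
const SCREENSHOT_WARN_BYTES = 5 * 1024 * 1024;

// Size in bytes of the data encoded by a padded base64 string.
function decodedBase64Size(base64: string): number {
  const len = base64.length;
  let padding = 0;
  if (len >= 1 && base64.charAt(len - 1) === "=") padding += 1;
  if (len >= 2 && base64.charAt(len - 2) === "=") padding += 1;
  return Math.floor((len * 3) / 4) - padding;
}

// Return a warning message for oversized screenshots, or null otherwise.
function screenshotWarning(base64: string): string | null {
  const bytes = decodedBase64Size(base64);
  if (bytes <= SCREENSHOT_WARN_BYTES) return null;
  return `Screenshot size is ${(bytes / (1024 * 1024)).toFixed(1)}MB`;
}
```

As a sanity check, the 12-character string "iVBORw0KGgo=" (the 8-byte PNG signature) gives decodedBase64Size of 8.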
Error: "Task must be in NEEDS_HELP state"
Cause: Attempting to intervene in task that doesn't need help
Solution:
- Check task status first: bytebot_get_task
- Only use bytebot_intervene_in_task when status is NEEDS_HELP
- Use bytebot_update_task to manually change status if needed
WebSocket Connection Failed
Cause: WebSocket URL incorrect or ByteBot doesn't support WebSocket
Solution:
- Set ENABLE_WEBSOCKET=false in .env to disable WebSocket
- Server will automatically fall back to HTTP polling
- WebSocket is optional - all features work without it
Error: "File size exceeds maximum allowed size"
Cause: Trying to upload/read file larger than 10MB
Solution:
- Increase MAX_FILE_SIZE in .env (in bytes)
- Split large files into smaller chunks
- Compress files before uploading
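Splitting a large file into chunks, as suggested above, can be sketched as follows. `splitIntoChunks` is a hypothetical helper, and whether the limit applies to the raw bytes or to the base64-encoded payload is not stated in this README, so the chunk size here is measured in raw bytes as an assumption.

```typescript
// Sketch only: splitIntoChunks is a hypothetical helper. The default mirrors
// MAX_FILE_SIZE (10485760 bytes); the limit is assumed to apply to raw bytes.
const DEFAULT_MAX_FILE_SIZE = 10485760;

function splitIntoChunks(
  data: Uint8Array,
  maxChunkBytes: number = DEFAULT_MAX_FILE_SIZE
): Uint8Array[] {
  if (!Number.isInteger(maxChunkBytes) || maxChunkBytes <= 0) {
    throw new RangeError("maxChunkBytes must be a positive integer");
  }
  const chunks: Uint8Array[] = [];
  for (let offset = 0; offset < data.length; offset += maxChunkBytes) {
    // subarray creates a view without copying and clamps the end index.
    chunks.push(data.subarray(offset, offset + maxChunkBytes));
  }
  return chunks;
}
```

If the limit turns out to apply to the encoded payload, the chunk size should be reduced to roughly three quarters of the limit, because base64 expands data by a factor of 4/3.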
API Reference
Task Priority Levels
- LOW - Background tasks, non-urgent
- MEDIUM - Default priority (recommended)
- HIGH - Important tasks, process soon
- URGENT - Critical tasks, process immediately
Task Lifecycle States
- PENDING - Task created, waiting to start
- IN_PROGRESS - Task currently executing
- NEEDS_HELP - Task blocked, requires intervention
- NEEDS_REVIEW - Task complete but needs verification
- COMPLETED - Task finished successfully
- CANCELLED - Task cancelled by user
- FAILED - Task failed with error
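The priority levels and lifecycle states above map naturally onto TypeScript string-literal unions with small runtime checks. This is a minimal sketch, not the server's actual type definitions: the constant and function names are hypothetical, and defaulting a missing priority to MEDIUM follows the "Default priority" note above.

```typescript
// Sketch only: values come from the lists above; all names are hypothetical.
const TASK_PRIORITIES = ["LOW", "MEDIUM", "HIGH", "URGENT"] as const;
const TASK_STATUSES = [
  "PENDING",
  "IN_PROGRESS",
  "NEEDS_HELP",
  "NEEDS_REVIEW",
  "COMPLETED",
  "CANCELLED",
  "FAILED",
] as const;

type TaskPriority = (typeof TASK_PRIORITIES)[number];
type TaskStatus = (typeof TASK_STATUSES)[number];

// Type guard for status values arriving from an untyped API response.
function isTaskStatus(value: unknown): value is TaskStatus {
  return typeof value === "string" && (TASK_STATUSES as readonly string[]).indexOf(value) !== -1;
}

// Validate a priority, falling back to MEDIUM when none is given.
function parsePriority(value: unknown): TaskPriority {
  if (value === undefined) return "MEDIUM";
  if (typeof value === "string" && (TASK_PRIORITIES as readonly string[]).indexOf(value) !== -1) {
    return value as TaskPriority;
  }
  throw new TypeError(`Unknown priority: ${String(value)}`);
}
```

The check is case-sensitive, so "urgent" is rejected while "URGENT" is accepted.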
Mouse Buttons
- left - Primary button (default)
- right - Context menu button
- middle - Scroll wheel click
Scroll Directions
- up - Scroll up
- down - Scroll down
- left - Scroll left
- right - Scroll right
Common Applications
- firefox - Mozilla Firefox
- chrome - Google Chrome
- safari - Safari (macOS)
- terminal - Terminal/Command Prompt
- vscode - Visual Studio Code
Architecture
┌─────────────────────────────────────────────┐
│ MCP Client (Claude) │
└─────────────────┬───────────────────────────┘
│ stdio transport
┌─────────────────▼───────────────────────────┐
│ ByteBot MCP Server │
│ ┌────────────────────────────────────────┐ │
│ │ Agent Tools │ Desktop Tools │ │
│ │ Hybrid Orchestrator │ │
│ └────────────┬──────────────┬─────────────┘ │
└───────────────┼──────────────┼───────────────┘
│ │
┌──────────▼──┐ ┌──────▼──────┐
│ Agent API │ │ Desktop API │
│ (port 9991) │ │ (port 9990) │
└─────────────┘ └─────────────┘
│ │
┌──────▼───────────────────▼──────┐
│ ByteBot Instance │
└─────────────────────────────────┘
Development
Build
npm run build
Type Check
npm run type-check
Watch Mode
npm run dev
Environment Variables Reference
| Variable | Default | Description |
|---|---|---|
| BYTEBOT_AGENT_URL | http://localhost:9991 | ByteBot Agent API endpoint |
| BYTEBOT_DESKTOP_URL | http://localhost:9990 | ByteBot Desktop API endpoint |
| BYTEBOT_WS_URL | ws://localhost:9991 | WebSocket endpoint for real-time updates |
| ENABLE_WEBSOCKET | false | Enable WebSocket connections |
| MCP_SERVER_NAME | bytebot-mcp | Server identifier |
| REQUEST_TIMEOUT | 30000 | HTTP request timeout (ms) |
| DESKTOP_ACTION_TIMEOUT | 10000 | Desktop action timeout (ms) |
| MAX_RETRIES | 3 | Maximum retry attempts for failed requests |
| RETRY_DELAY | 1000 | Initial retry delay (ms) |
| TASK_POLL_INTERVAL | 2000 | Task status polling interval (ms) |
| TASK_MONITOR_TIMEOUT | 300000 | Maximum task monitoring duration (ms) |
| MAX_FILE_SIZE | 10485760 | Maximum file size in bytes (10MB) |
| LOG_LEVEL | info | Logging level (debug/info/warn/error) |
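RETRY_DELAY is described as the initial retry delay, which suggests that later retries wait longer. The sketch below shows one common interpretation, exponential backoff with a doubling factor. The doubling is an assumption, not something this README states, and `retryDelays` is a hypothetical helper.

```typescript
// Sketch only: the doubling factor is an assumption; this README only says
// that RETRY_DELAY is the *initial* delay. retryDelays is a hypothetical name.
function retryDelays(
  maxRetries: number = 3,
  initialDelayMs: number = 1000,
  factor: number = 2
): number[] {
  const delays: number[] = [];
  for (let attempt = 0; attempt < maxRetries; attempt += 1) {
    // Delay before retry number (attempt + 1): initial * factor^attempt.
    delays.push(initialDelayMs * Math.pow(factor, attempt));
  }
  return delays;
}
```

With the defaults MAX_RETRIES=3 and RETRY_DELAY=1000, this schedule is [1000, 2000, 4000] milliseconds, for a worst case of 7 seconds of waiting on top of the request timeouts.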
License
MIT
Support
For issues and questions:
- ByteBot Documentation: https://docs.bytebot.ai
- MCP Specification: https://modelcontextprotocol.io
- Report issues: Create an issue in this repository
Version History
1.0.0 (2024-01-15)
- Initial release
- Agent API integration (task management)
- Desktop API integration (computer control)
- Hybrid orchestration tools
- WebSocket support for real-time updates
- Comprehensive error handling and retry logic
- Full TypeScript implementation with strict typing