VOICEVOX TTS MCP

A text-to-speech MCP server using VOICEVOX

What You Can Do

- Make your AI assistant speak: text-to-speech from MCP clients such as Claude Desktop
- Multi-character conversations: switch speakers per segment in a single call
- Smooth playback: queue management, immediate playback, prefetching, and streaming
- Cross-platform: works on Windows, macOS, and Linux (including WSL)

Quick Start

Requirements

Node.js 18.0.0 or higher
VOICEVOX Engine (must be running)
ffplay (optional, recommended)

Installing FFplay

ffplay is a lightweight player included with FFmpeg that supports playback from stdin. When available, it automatically enables low-latency streaming playback.

💡 FFplay is optional. Without it, playback falls back to temp file-based playback (Windows: PowerShell, macOS: afplay, Linux: aplay, etc.).

Easy setup: each OS has a one-line install command (see the steps below)
PATH requirement: for ffplay to be detected, it must be on PATH (restart your terminal and apps after installing)

FFplay Installation and PATH Setup

Installation examples:

Windows (any of these)
- Winget: winget install --id=Gyan.FFmpeg -e
- Chocolatey: choco install ffmpeg
- Scoop: scoop install ffmpeg
- Official builds: Download from https://www.gyan.dev/ffmpeg/builds/ or https://github.com/BtbN/FFmpeg-Builds and add the bin folder to PATH
macOS
- Homebrew: brew install ffmpeg
Linux
- Debian/Ubuntu: sudo apt-get update && sudo apt-get install -y ffmpeg
- Fedora: sudo dnf install -y ffmpeg
- Arch: sudo pacman -S ffmpeg

PATH Setup:

Windows: Add ...\ffmpeg\bin to the PATH environment variable, then restart PowerShell/terminal and your editor (Claude, VS Code, etc.)
- Verify: in PowerShell, the output of $env:Path should include the ffmpeg bin path
macOS/Linux: Package managers usually put ffplay on PATH automatically. If it is not found, check echo $PATH and restart the shell.
MCP clients (Claude Desktop/Code): Restart the app so it picks up the updated PATH.

Verification:

ffplay -version

If version info is displayed, installation is complete. CLI/MCP will automatically detect ffplay and use stdin streaming playback.
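
To check from a script which playback path is likely to be used, the PATH lookup can be reproduced with Python's `shutil.which`. This is a minimal sketch, not the package's own detection code; the function name `playback_mode` and the two mode labels are hypothetical.

```python
import shutil


def playback_mode(which=shutil.which):
    """Return a label for the playback path implied by the PATH lookup.

    'stdin-streaming' if an ffplay executable is found on PATH,
    otherwise 'temp-file' (the fallback described above).
    The labels are illustrative only.
    """
    return "stdin-streaming" if which("ffplay") else "temp-file"


if __name__ == "__main__":
    print(playback_mode())
```

Run it in the same environment as the MCP client. A GUI app started before PATH was updated may see a different PATH than a freshly opened terminal.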

3 Steps to Get Started

1. Start VOICEVOX Engine

2. Add to Claude Desktop config file

Config file location:

Windows: %APPDATA%\Claude\claude_desktop_config.json
macOS: ~/Library/Application Support/Claude/claude_desktop_config.json

{
  "mcpServers": {
    "tts-mcp": {
      "command": "npx",
      "args": ["-y", "@kajidog/mcp-tts-voicevox"]
    }
  }
}

3. Restart Claude Desktop

That's it! Ask Claude to "say hello" and it will speak!
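
If the config file already lists other MCP servers, the entry from step 2 has to be merged into the existing `mcpServers` object rather than pasted over it. The sketch below shows one way to do that with Python's `json` module. The helper name `add_tts_server` is hypothetical; the entry itself is copied from step 2.

```python
import json

# Entry copied from step 2 above.
TTS_ENTRY = {"command": "npx", "args": ["-y", "@kajidog/mcp-tts-voicevox"]}


def add_tts_server(config_text: str) -> str:
    """Return config JSON with the tts-mcp entry added, keeping other servers."""
    # An empty or missing file is treated as an empty config.
    config = json.loads(config_text) if config_text.strip() else {}
    servers = config.setdefault("mcpServers", {})
    servers["tts-mcp"] = TTS_ENTRY
    return json.dumps(config, indent=2)


if __name__ == "__main__":
    existing = '{"mcpServers": {"other": {"command": "some-server"}}}'
    print(add_tts_server(existing))
```

Read the file at the location listed in step 2, pass its contents through the function, and write the result back before restarting Claude Desktop.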

MCP Tools

`speak` — Text-to-Speech

The main tool, callable from Claude or any other MCP client.

Parameter	Description	Default
`text`	Text to speak (separate multiple segments with newlines; prefix a segment with a speaker ID and a colon, as in `3:Hello`, to switch speakers)	Required
`speaker`	Speaker ID	1
`speedScale`	Playback speed	1.0
`immediate`	Immediate playback (clears queue)	true
`waitForEnd`	Wait for playback completion	false

Examples:

// Simple text
{ "text": "Hello" }

// Specify speaker
{ "text": "Hello", "speaker": 3 }

// Different speakers per segment
{ "text": "1:Hello\n3:Nice weather today" }

// Wait for completion (synchronous processing)
{ "text": "Wait for this to finish before continuing", "waitForEnd": true }
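
The per-segment form in the third example can be read as "one segment per line, with an optional numeric speaker prefix". The sketch below illustrates that reading. It is not the package's parser: the function name `split_segments` is hypothetical, and the handling of lines without a prefix (fall back to the default speaker) is an assumption.

```python
import re

# A segment may start with "<digits>:" to select a speaker.
_PREFIX = re.compile(r"^(\d+):(.*)$")


def split_segments(text, default_speaker=1):
    """Split speak-style text into (speaker_id, text) pairs, one per line."""
    segments = []
    for line in text.split("\n"):
        if not line.strip():
            continue  # skip blank lines
        match = _PREFIX.match(line)
        if match:
            segments.append((int(match.group(1)), match.group(2).strip()))
        else:
            segments.append((default_speaker, line.strip()))
    return segments


if __name__ == "__main__":
    print(split_segments("1:Hello\n3:Nice weather today"))
```

One consequence of this reading is that a line such as `12:30 is lunch time` would be treated as speaker 12, so text that begins with a number and a colon may need rephrasing.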

Other Tools

Tool	Description
`ping_voicevox`	Check VOICEVOX Engine connection
`get_speakers`	Get list of available speakers
`get_speaker_detail`	Get speaker details
`stop_speaker`	Stop playback and clear queue
`generate_query`	Generate speech synthesis query
`synthesize_file`	Generate audio file

Configuration

Environment Variables

VOICEVOX Settings

Variable	Description	Default
`VOICEVOX_URL`	Engine URL	`http://localhost:50021`
`VOICEVOX_DEFAULT_SPEAKER`	Default speaker ID	`1`
`VOICEVOX_DEFAULT_SPEED_SCALE`	Playback speed	`1.0`

Playback Options

Variable	Description	Default
`VOICEVOX_USE_STREAMING`	Streaming playback (requires `ffplay`)	`false`
`VOICEVOX_DEFAULT_IMMEDIATE`	Immediate playback	`true`
`VOICEVOX_DEFAULT_WAIT_FOR_START`	Wait for playback start	`false`
`VOICEVOX_DEFAULT_WAIT_FOR_END`	Wait for playback end	`false`

Restriction Settings

Prevent the AI from specifying certain options.

Variable	Description
`VOICEVOX_RESTRICT_IMMEDIATE`	Restrict `immediate` option
`VOICEVOX_RESTRICT_WAIT_FOR_START`	Restrict `waitForStart` option
`VOICEVOX_RESTRICT_WAIT_FOR_END`	Restrict `waitForEnd` option
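
One way to picture a restriction: the server keeps its configured default for a restricted option, whatever the AI sends. The sketch below assumes that restricted options are silently ignored. The source does not say whether the real server ignores them, hides them from the tool schema, or rejects the call, and `effective_options` is a hypothetical name.

```python
def effective_options(requested, defaults, restricted):
    """Merge requested options over defaults, skipping restricted names."""
    result = dict(defaults)
    for name, value in requested.items():
        if name in restricted:
            continue  # restricted: keep the configured default
        result[name] = value
    return result


if __name__ == "__main__":
    defaults = {"immediate": True, "waitForStart": False, "waitForEnd": False}
    requested = {"immediate": False, "waitForEnd": True}
    # With `immediate` restricted, only waitForEnd changes.
    print(effective_options(requested, defaults, restricted={"immediate"}))
```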

Disable Tools

# Disable unnecessary tools
export VOICEVOX_DISABLED_TOOLS=generate_query,synthesize_file
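
The comma-separated value can be read as a set of tool names to leave out. The sketch below is illustrative only, not the package's code. The function name `enabled_tools` is hypothetical, the tool names are the seven listed under MCP Tools, and ignoring whitespace around names is an assumption.

```python
# Tool names as listed in the MCP Tools section.
ALL_TOOLS = [
    "speak",
    "ping_voicevox",
    "get_speakers",
    "get_speaker_detail",
    "stop_speaker",
    "generate_query",
    "synthesize_file",
]


def enabled_tools(disabled_value):
    """Return the tools that remain after applying a comma-separated disable list."""
    disabled = {name.strip() for name in disabled_value.split(",") if name.strip()}
    return [tool for tool in ALL_TOOLS if tool not in disabled]


if __name__ == "__main__":
    print(enabled_tools("generate_query,synthesize_file"))
```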

Server Settings

Variable	Description	Default
`MCP_HTTP_MODE`	Enable HTTP mode	`false`
`MCP_HTTP_PORT`	HTTP port	`3000`
`MCP_HTTP_HOST`	HTTP host	`0.0.0.0`
`MCP_ALLOWED_HOSTS`	Allowed hosts (comma-separated)	`localhost,127.0.0.1,[::1]`
`MCP_ALLOWED_ORIGINS`	Allowed origins (comma-separated)	`http://localhost,http://127.0.0.1,...`
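
A host allowlist like `MCP_ALLOWED_HOSTS` is commonly checked against the `Host` header of each incoming request. The sketch below shows one plausible form of that check. It assumes the port is stripped before comparison and that matching is case-insensitive, neither of which is confirmed by the source, and `host_allowed` is a hypothetical name.

```python
DEFAULT_ALLOWED = "localhost,127.0.0.1,[::1]"  # default from the table above


def host_allowed(host_header, allowed=DEFAULT_ALLOWED):
    """Return True if the Host header's hostname is in the comma-separated allowlist."""
    allowed_set = {h.strip().lower() for h in allowed.split(",") if h.strip()}
    host = host_header.strip().lower()
    if host.startswith("["):
        # Bracketed IPv6 literal, e.g. "[::1]:3000" -> "[::1]"
        end = host.find("]")
        hostname = host[: end + 1] if end != -1 else host
    else:
        # "name:port" -> "name"
        hostname = host.rsplit(":", 1)[0] if ":" in host else host
    return hostname in allowed_set


if __name__ == "__main__":
    print(host_allowed("localhost:3000"))
    print(host_allowed("172.29.176.1:3000"))
```

Under this reading, a request addressed to `172.29.176.1:3000` fails the default list, which is why the WSL setup adds that IP explicitly.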

Command Line Arguments

Command line arguments take priority over environment variables.

# Basic settings
npx @kajidog/mcp-tts-voicevox --url http://192.168.1.100:50021 --speaker 3 --speed 1.2

# HTTP mode
npx @kajidog/mcp-tts-voicevox --http --port 8080

# With restrictions
npx @kajidog/mcp-tts-voicevox --restrict-immediate --restrict-wait-for-end

# Disable tools
npx @kajidog/mcp-tts-voicevox --disable-tools generate_query,synthesize_file

Argument	Description
`--help`, `-h`	Show help
`--version`, `-v`	Show version
`--url <value>`	VOICEVOX Engine URL
`--speaker <value>`	Default speaker ID
`--speed <value>`	Playback speed
`--use-streaming` / `--no-use-streaming`	Streaming playback
`--immediate` / `--no-immediate`	Immediate playback
`--wait-for-start` / `--no-wait-for-start`	Wait for start
`--wait-for-end` / `--no-wait-for-end`	Wait for end
`--restrict-immediate`	Restrict immediate
`--restrict-wait-for-start`	Restrict waitForStart
`--restrict-wait-for-end`	Restrict waitForEnd
`--disable-tools <tools>`	Disable tools
`--http`	HTTP mode
`--port <value>`	HTTP port
`--host <value>`	HTTP host
`--allowed-hosts <hosts>`	Allowed hosts (comma-separated)
`--allowed-origins <origins>`	Allowed origins (comma-separated)
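
As stated at the top of this section, command line arguments take priority over environment variables, which in turn override the built-in defaults. The sketch below illustrates that order for three settings. It is not the package's code: `resolve_settings` is a hypothetical name, and the defaults and variable names are taken from the tables above.

```python
DEFAULTS = {"url": "http://localhost:50021", "speaker": 1, "speed": 1.0}
ENV_NAMES = {
    "url": "VOICEVOX_URL",
    "speaker": "VOICEVOX_DEFAULT_SPEAKER",
    "speed": "VOICEVOX_DEFAULT_SPEED_SCALE",
}
CASTS = {"url": str, "speaker": int, "speed": float}


def resolve_settings(cli, env):
    """Pick each setting from CLI args first, then env vars, then defaults."""
    resolved = {}
    for key, default in DEFAULTS.items():
        if cli.get(key) is not None:
            raw = cli[key]  # e.g. --speaker 3
        elif env.get(ENV_NAMES[key]) not in (None, ""):
            raw = env[ENV_NAMES[key]]  # e.g. VOICEVOX_DEFAULT_SPEAKER=2
        else:
            raw = default
        resolved[key] = CASTS[key](raw)
    return resolved


if __name__ == "__main__":
    cli_args = {"speaker": "3"}
    environment = {"VOICEVOX_DEFAULT_SPEAKER": "2", "VOICEVOX_URL": "http://192.168.1.100:50021"}
    print(resolve_settings(cli_args, environment))
```

In the usage above, the speaker comes from the command line, the URL from the environment, and the speed from the default.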

HTTP Mode

For remote connections:

Start Server:

# Linux/macOS
MCP_HTTP_MODE=true MCP_HTTP_PORT=3000 npx @kajidog/mcp-tts-voicevox

# Windows PowerShell
$env:MCP_HTTP_MODE='true'; $env:MCP_HTTP_PORT='3000'; npx @kajidog/mcp-tts-voicevox

Claude Desktop Config (using mcp-remote):

{
  "mcpServers": {
    "tts-mcp-proxy": {
      "command": "npx",
      "args": ["-y", "mcp-remote", "http://localhost:3000/mcp"]
    }
  }
}

WSL to Windows Host Connection

Connecting from WSL to an MCP server running on Windows:

1. Get Windows Host IP from WSL

# Method 1: From default gateway
ip route show | grep -oP 'default via \K[\d.]+'
# Usually in the format 172.x.x.1

# Method 2: From /etc/resolv.conf (WSL2)
cat /etc/resolv.conf | grep nameserver | awk '{print $2}'

2. Start Server on Windows

Add the WSL gateway IP to MCP_ALLOWED_HOSTS to allow access from WSL:

$env:MCP_HTTP_MODE='true'
$env:MCP_ALLOWED_HOSTS='localhost,127.0.0.1,172.29.176.1'
npx @kajidog/mcp-tts-voicevox

Or with CLI arguments:

npx @kajidog/mcp-tts-voicevox --http --allowed-hosts "localhost,127.0.0.1,172.29.176.1"

3. WSL Configuration (.mcp.json)

{
  "mcpServers": {
    "tts": {
      "type": "http",
      "url": "http://172.29.176.1:3000/mcp"
    }
  }
}

⚠️ Within WSL, localhost refers to WSL itself. Use the WSL gateway IP to access the Windows host.
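
Steps 1 and 3 can be combined in a small script that reads the default gateway from `ip route show` output and builds the `.mcp.json` content. This is a sketch under assumptions: the helper names are hypothetical, the sample route output is made up for illustration, and port 3000 is the default from Server Settings.

```python
import json
import re


def gateway_from_route(route_output):
    """Return the IPv4 default gateway from `ip route show` output, or None."""
    match = re.search(r"^default via (\d+\.\d+\.\d+\.\d+)", route_output, re.MULTILINE)
    return match.group(1) if match else None


def mcp_config(host_ip, port=3000):
    """Build the .mcp.json structure shown in step 3 for the given host IP."""
    return {"mcpServers": {"tts": {"type": "http", "url": f"http://{host_ip}:{port}/mcp"}}}


if __name__ == "__main__":
    # Illustrative output; on a real system, capture it from `ip route show`.
    sample = "default via 172.29.176.1 dev eth0 proto kernel\n172.29.176.0/20 dev eth0 scope link\n"
    print(json.dumps(mcp_config(gateway_from_route(sample)), indent=2))
```

The same IP must also appear in `MCP_ALLOWED_HOSTS` on the Windows side, as in step 2.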

Troubleshooting

Audio is not playing

1. Check if VOICEVOX Engine is running

curl http://localhost:50021/speakers

2. Check platform-specific playback tools

OS	Required Tool
Linux	One of `aplay`, `paplay`, `play`, `ffplay`
macOS	`afplay` (pre-installed)
Windows	PowerShell (pre-installed)
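
The engine check from step 1 can also be scripted, which helps when `curl` is not installed. The sketch below requests the same `/speakers` endpoint with Python's standard library and reports how many speakers came back. The function name `speaker_count` is hypothetical, and the `opener` parameter exists only so the function can be exercised without a running engine.

```python
import json
import urllib.error
import urllib.request


def speaker_count(base_url="http://localhost:50021", opener=urllib.request.urlopen, timeout=3):
    """Return the number of speakers the engine reports, or None if unreachable."""
    try:
        with opener(f"{base_url}/speakers", timeout=timeout) as response:
            return len(json.load(response))
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, or a non-JSON response.
        return None


if __name__ == "__main__":
    count = speaker_count()
    if count is None:
        print("VOICEVOX Engine is not reachable")
    else:
        print(f"VOICEVOX Engine is up, {count} speakers available")
```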

Not recognized by MCP client

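Check package installation: if you installed the package globally, run npm list -g @kajidog/mcp-tts-voicevox (the npx -y setup from Quick Start does not need a global install)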
Verify JSON syntax in config file
Restart the client

Package Structure

Package	Description
`@kajidog/mcp-tts-voicevox`	MCP server
`@kajidog/voicevox-client`	General-purpose VOICEVOX client library (can be used independently)

Developer Information

Setup

git clone https://github.com/kajidog/mcp-tts-voicevox.git
cd mcp-tts-voicevox
pnpm install

Commands

Command	Description
`pnpm build`	Build all packages
`pnpm test`	Run tests
`pnpm lint`	Run lint
`pnpm dev`	Start dev server
`pnpm dev:stdio`	Dev with stdio mode

License

ISC
