
@kajidog/mcp-tts-voicevox

Requires Setup

A VOICEVOX engine integration for MCP that enables AI assistants to perform high-quality Japanese text-to-speech with multi-character support and low-latency streaming.

Stars: 11
Forks: 6
Tools: 7
Updated: Dec 30, 2025
Validated: Jan 9, 2026

Quick Install

npx -y @kajidog/mcp-tts-voicevox

VOICEVOX TTS MCP

A text-to-speech MCP server using VOICEVOX

What You Can Do

  • Make your AI assistant speak — Text-to-speech from MCP clients like Claude Desktop
  • Multi-character conversations — Switch speakers per segment in a single call
  • Smooth playback — Queue management, immediate playback, prefetching, streaming
  • Cross-platform — Works on Windows, macOS, Linux (including WSL)

Quick Start

Requirements

  • Node.js 18.0.0 or higher
  • VOICEVOX Engine (must be running)
  • ffplay (optional, recommended)

Installing FFplay

ffplay is a lightweight player included with FFmpeg that supports playback from stdin. When available, it automatically enables low-latency streaming playback.

💡 ffplay is optional. Without it, audio is written to a temporary file and played with the platform's own tool (Windows: PowerShell, macOS: afplay, Linux: aplay, etc.).

  • Easy setup: One-liner installation for each OS (see steps below)
  • Required: ffplay must be in PATH (restart terminal/apps after installation)

FFplay Installation and PATH Setup

Installation examples:

  • Windows (any of these)

  • macOS

    • Homebrew: brew install ffmpeg
  • Linux

    • Debian/Ubuntu: sudo apt-get update && sudo apt-get install -y ffmpeg
    • Fedora: sudo dnf install -y ffmpeg
    • Arch: sudo pacman -S ffmpeg

PATH Setup:

  • Windows: Add ...\ffmpeg\bin to environment variables, then restart PowerShell/terminal and editor (Claude/VS Code, etc.)
    • Verify: powershell -c "$env:Path" should include the ffmpeg path
  • macOS/Linux: Package managers normally install ffplay into a directory that is already on PATH. If it is not found, check echo $PATH and restart the shell.
  • MCP clients (Claude Desktop/Code): Restart the app to reload PATH.

Verification:

ffplay -version

If version info is displayed, installation is complete. CLI/MCP will automatically detect ffplay and use stdin streaming playback.

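To make the "must be in PATH" requirement concrete, here is a minimal sketch of how a PATH lookup for ffplay could work. It is not the project's actual detection logic; the function name `findOnPath` and the injected `exists` callback are hypothetical, chosen so the example runs without touching the real file system.

```typescript
// Hypothetical helper: walks each PATH directory and returns the first
// candidate for which `exists` reports true, or undefined if none match.
function findOnPath(
  name: string,
  pathVar: string,
  isWindows: boolean,
  exists: (candidate: string) => boolean,
): string | undefined {
  const listSep = isWindows ? ";" : ":";
  const dirSep = isWindows ? "\\" : "/";
  // On Windows, executables are usually found through an extension.
  const extensions = isWindows ? [".exe", ".cmd", ".bat", ""] : [""];
  for (const dir of pathVar.split(listSep)) {
    if (dir === "") continue;
    for (const ext of extensions) {
      const candidate = dir.replace(/[\\/]+$/, "") + dirSep + name + ext;
      if (exists(candidate)) return candidate;
    }
  }
  return undefined;
}

// Simulated file system for the demo (an assumption, not real paths).
const fakeFiles = new Set(["/usr/bin/ffplay"]);
const found = findOnPath("ffplay", "/usr/local/bin:/usr/bin", false, (p) => fakeFiles.has(p));
console.log(found ?? "ffplay not found; temp-file playback would be used");
```

In a Node.js script, `exists` could be `fs.existsSync` and `pathVar` could come from `process.env.PATH`. An app that was started before PATH was changed still sees the old value, which is why the restart advice above matters.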
3 Steps to Get Started

1. Start VOICEVOX Engine

2. Add to Claude Desktop config file

Config file location:

  • Windows: %APPDATA%\Claude\claude_desktop_config.json
  • macOS: ~/Library/Application Support/Claude/claude_desktop_config.json

{
  "mcpServers": {
    "tts-mcp": {
      "command": "npx",
      "args": ["-y", "@kajidog/mcp-tts-voicevox"]
    }
  }
}

3. Restart Claude Desktop

That's it! Ask Claude to "say hello" and it will speak!

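The config entry from step 2 can also be produced programmatically, which helps when the file already lists other servers. The sketch below merges the entry into an existing config object without touching other keys. The helper name `addTtsServer` is hypothetical, and passing settings through an `env` block is an assumption based on the per-server `env` field that Claude Desktop accepts; the variable shown comes from the Environment Variables section of this README.

```typescript
// Shape of one stdio server entry in claude_desktop_config.json.
interface McpServerEntry {
  command: string;
  args: string[];
  env?: Record<string, string>;
}

interface ClaudeDesktopConfig {
  mcpServers?: Record<string, McpServerEntry>;
  [key: string]: unknown;
}

// Returns a copy of `config` with the TTS server added under `name`,
// leaving other servers and top-level keys as they were.
function addTtsServer(
  config: ClaudeDesktopConfig,
  name: string,
  env: Record<string, string> = {},
): ClaudeDesktopConfig {
  const entry: McpServerEntry = {
    command: "npx",
    args: ["-y", "@kajidog/mcp-tts-voicevox"],
  };
  if (Object.keys(env).length > 0) {
    entry.env = { ...env };
  }
  return {
    ...config,
    mcpServers: { ...(config.mcpServers ?? {}), [name]: entry },
  };
}

// Example: an existing config with one unrelated (hypothetical) server.
const updated = addTtsServer(
  { mcpServers: { other: { command: "node", args: ["server.js"] } } },
  "tts-mcp",
  { VOICEVOX_DEFAULT_SPEAKER: "3" },
);
console.log(JSON.stringify(updated, null, 2));
```

Writing the result back to the config file and restarting Claude Desktop are still manual steps in this sketch.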

MCP Tools

speak — Text-to-Speech

The main feature callable from Claude.

| Parameter | Description | Default |
| --- | --- | --- |
| text | Text to speak (multiple segments separated by newlines) | Required |
| speaker | Speaker ID | 1 |
| speedScale | Playback speed | 1.0 |
| immediate | Immediate playback (clears queue) | true |
| waitForEnd | Wait for playback completion | false |

Examples:

// Simple text
{ "text": "Hello" }

// Specify speaker
{ "text": "Hello", "speaker": 3 }

// Different speakers per segment
{ "text": "1:Hello\n3:Nice weather today" }

// Wait for completion (synchronous processing)
{ "text": "Wait for this to finish before continuing", "waitForEnd": true }
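
When a client assembles multi-speaker input itself, the `speaker:text` lines shown in the examples above can be built from structured data. This is a minimal sketch: the `Segment` type and `buildSpeakText` helper are hypothetical, and treating a segment without a prefix as "use the default speaker" is an assumption inferred from the simple-text example, not confirmed server behavior.

```typescript
interface Segment {
  speaker?: number; // omitted: no prefix is written for this segment
  text: string;
}

// Joins segments into the newline-separated format used by the `text` parameter.
function buildSpeakText(segments: Segment[]): string {
  return segments
    .map((s) => ({
      speaker: s.speaker,
      // Newlines separate segments, so collapse any that occur inside one.
      line: s.text.replace(/\s*\n\s*/g, " ").trim(),
    }))
    .filter((s) => s.line.length > 0) // drop empty segments
    .map((s) => (s.speaker === undefined ? s.line : `${s.speaker}:${s.line}`))
    .join("\n");
}

const speakArgs = {
  text: buildSpeakText([
    { speaker: 1, text: "Hello" },
    { speaker: 3, text: "Nice weather today" },
  ]),
};
console.log(JSON.stringify(speakArgs));
// prints {"text":"1:Hello\n3:Nice weather today"}
```

Collapsing embedded newlines keeps one logical segment from being split into several by accident.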

Other Tools

| Tool | Description |
| --- | --- |
| ping_voicevox | Check VOICEVOX Engine connection |
| get_speakers | Get list of available speakers |
| get_speaker_detail | Get speaker details |
| stop_speaker | Stop playback and clear queue |
| generate_query | Generate speech synthesis query |
| synthesize_file | Generate audio file |

Configuration

Environment Variables

VOICEVOX Settings

| Variable | Description | Default |
| --- | --- | --- |
| VOICEVOX_URL | Engine URL | http://localhost:50021 |
| VOICEVOX_DEFAULT_SPEAKER | Default speaker ID | 1 |
| VOICEVOX_DEFAULT_SPEED_SCALE | Playback speed | 1.0 |

Playback Options

| Variable | Description | Default |
| --- | --- | --- |
| VOICEVOX_USE_STREAMING | Streaming playback (requires ffplay) | false |
| VOICEVOX_DEFAULT_IMMEDIATE | Immediate playback | true |
| VOICEVOX_DEFAULT_WAIT_FOR_START | Wait for playback start | false |
| VOICEVOX_DEFAULT_WAIT_FOR_END | Wait for playback end | false |

Restriction Settings

Prevent the AI from specifying certain options.

| Variable | Description |
| --- | --- |
| VOICEVOX_RESTRICT_IMMEDIATE | Restrict immediate option |
| VOICEVOX_RESTRICT_WAIT_FOR_START | Restrict waitForStart option |
| VOICEVOX_RESTRICT_WAIT_FOR_END | Restrict waitForEnd option |

Disable Tools

# Disable unnecessary tools
export VOICEVOX_DISABLED_TOOLS=generate_query,synthesize_file

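A typo in the comma-separated list is easy to miss, so it can help to check the value before exporting it. The sketch below splits the list and flags names that do not match the tools documented in this README. The helper `parseDisabledTools` is hypothetical, and how the server itself treats unknown names is not specified here.

```typescript
// Tool names as listed under "MCP Tools" in this README.
const KNOWN_TOOLS = [
  "speak",
  "ping_voicevox",
  "get_speakers",
  "get_speaker_detail",
  "stop_speaker",
  "generate_query",
  "synthesize_file",
];

// Splits a VOICEVOX_DISABLED_TOOLS-style value into recognized and
// unrecognized tool names, ignoring blanks and duplicates.
function parseDisabledTools(raw: string | undefined): {
  disabled: string[];
  unrecognized: string[];
} {
  const names = (raw ?? "")
    .split(",")
    .map((n) => n.trim())
    .filter((n) => n.length > 0);
  const unique = Array.from(new Set(names));
  const known = new Set(KNOWN_TOOLS);
  return {
    disabled: unique.filter((n) => known.has(n)),
    unrecognized: unique.filter((n) => !known.has(n)),
  };
}

// "speek" is a deliberate typo to show the unrecognized bucket.
const parsed = parseDisabledTools("generate_query, synthesize_file,speek");
console.log(parsed.disabled.join(","));
console.log(parsed.unrecognized.join(","));
```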
Server Settings

| Variable | Description | Default |
| --- | --- | --- |
| MCP_HTTP_MODE | Enable HTTP mode | false |
| MCP_HTTP_PORT | HTTP port | 3000 |
| MCP_HTTP_HOST | HTTP host | 0.0.0.0 |
| MCP_ALLOWED_HOSTS | Allowed hosts (comma-separated) | localhost,127.0.0.1,[::1] |
| MCP_ALLOWED_ORIGINS | Allowed origins (comma-separated) | http://localhost,http://127.0.0.1,... |

Command Line Arguments

Command line arguments take priority over environment variables.

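The precedence rule can be pictured as "first defined value wins" in the order CLI argument, environment variable, built-in default. The following sketch illustrates that rule only. The `resolveSetting` helper and the parsed input objects are hypothetical and do not reflect the project's internal code; the defaults are the ones listed in the tables above.

```typescript
// Returns the first defined value in priority order: CLI, env, default.
function resolveSetting<T>(cli: T | undefined, env: T | undefined, fallback: T): T {
  if (cli !== undefined) return cli;
  if (env !== undefined) return env;
  return fallback;
}

// Converts an env string to a number, treating missing or invalid input as unset.
function toNumber(value: string | undefined): number | undefined {
  if (value === undefined || value.trim() === "") return undefined;
  const n = Number(value);
  return isNaN(n) ? undefined : n;
}

// Hypothetical inputs: `--speaker 3` on the command line, two env vars set.
const cliArgs: { url?: string; speaker?: number; speed?: number } = { speaker: 3 };
const envVars: Record<string, string | undefined> = {
  VOICEVOX_URL: "http://192.168.1.100:50021",
  VOICEVOX_DEFAULT_SPEAKER: "1",
};

const settings = {
  url: resolveSetting(cliArgs.url, envVars["VOICEVOX_URL"], "http://localhost:50021"),
  speaker: resolveSetting(cliArgs.speaker, toNumber(envVars["VOICEVOX_DEFAULT_SPEAKER"]), 1),
  speedScale: resolveSetting(cliArgs.speed, toNumber(envVars["VOICEVOX_DEFAULT_SPEED_SCALE"]), 1.0),
};
console.log(settings);
```

Here the URL comes from the environment, the speaker from the command line (overriding the environment's 1), and the speed from the default.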
# Basic settings
npx @kajidog/mcp-tts-voicevox --url http://192.168.1.100:50021 --speaker 3 --speed 1.2

# HTTP mode
npx @kajidog/mcp-tts-voicevox --http --port 8080

# With restrictions
npx @kajidog/mcp-tts-voicevox --restrict-immediate --restrict-wait-for-end

# Disable tools
npx @kajidog/mcp-tts-voicevox --disable-tools generate_query,synthesize_file

| Argument | Description |
| --- | --- |
| --help, -h | Show help |
| --version, -v | Show version |
| --url <value> | VOICEVOX Engine URL |
| --speaker <value> | Default speaker ID |
| --speed <value> | Playback speed |
| --use-streaming / --no-use-streaming | Streaming playback |
| --immediate / --no-immediate | Immediate playback |
| --wait-for-start / --no-wait-for-start | Wait for start |
| --wait-for-end / --no-wait-for-end | Wait for end |
| --restrict-immediate | Restrict immediate |
| --restrict-wait-for-start | Restrict waitForStart |
| --restrict-wait-for-end | Restrict waitForEnd |
| --disable-tools <tools> | Disable tools |
| --http | HTTP mode |
| --port <value> | HTTP port |
| --host <value> | HTTP host |
| --allowed-hosts <hosts> | Allowed hosts (comma-separated) |
| --allowed-origins <origins> | Allowed origins (comma-separated) |

HTTP Mode

For remote connections:

Start Server:

# Linux/macOS
MCP_HTTP_MODE=true MCP_HTTP_PORT=3000 npx @kajidog/mcp-tts-voicevox

# Windows PowerShell
$env:MCP_HTTP_MODE='true'; $env:MCP_HTTP_PORT='3000'; npx @kajidog/mcp-tts-voicevox

Claude Desktop Config (using mcp-remote):

{
  "mcpServers": {
    "tts-mcp-proxy": {
      "command": "npx",
      "args": ["-y", "mcp-remote", "http://localhost:3000/mcp"]
    }
  }
}

WSL to Windows Host Connection

Connecting from WSL to an MCP server running on Windows:

1. Get Windows Host IP from WSL

# Method 1: From default gateway
ip route show | grep -oP 'default via \K[\d.]+'
# Usually in the format 172.x.x.1

# Method 2: From /etc/resolv.conf (WSL2)
awk '/^nameserver/ {print $2}' /etc/resolv.conf

2. Start Server on Windows

Add the WSL gateway IP to MCP_ALLOWED_HOSTS to allow access from WSL:

$env:MCP_HTTP_MODE='true'
$env:MCP_ALLOWED_HOSTS='localhost,127.0.0.1,172.29.176.1'
npx @kajidog/mcp-tts-voicevox

Or with CLI arguments:

npx @kajidog/mcp-tts-voicevox --http --allowed-hosts "localhost,127.0.0.1,172.29.176.1"

3. WSL Configuration (.mcp.json)

{
  "mcpServers": {
    "tts": {
      "type": "http",
      "url": "http://172.29.176.1:3000/mcp"
    }
  }
}

⚠️ Within WSL, localhost refers to WSL itself. Use the WSL gateway IP to access the Windows host.

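The three WSL steps above can be tied together in a small script: extract the gateway from `ip route show` output, then derive both the allowed-hosts value for the Windows side and the .mcp.json content for the WSL side. This is a sketch under assumptions: the helper names are hypothetical, and the sample `ip route` output is illustrative rather than captured from a real machine.

```typescript
// Extracts the default gateway from `ip route show` output,
// e.g. a line such as "default via 172.29.176.1 dev eth0".
function gatewayFromIpRoute(output: string): string | undefined {
  const match = output.match(/^default via (\d{1,3}(?:\.\d{1,3}){3})\b/m);
  return match ? match[1] : undefined;
}

// Derives the Windows-side allowed-hosts value and the WSL-side .mcp.json.
function wslSetup(gateway: string, port: number = 3000) {
  return {
    allowedHosts: ["localhost", "127.0.0.1", gateway].join(","),
    mcpJson: {
      mcpServers: {
        tts: { type: "http", url: `http://${gateway}:${port}/mcp` },
      },
    },
  };
}

// Illustrative sample output (an assumption, not real data).
const sampleRoute = [
  "default via 172.29.176.1 dev eth0 proto kernel",
  "172.29.176.0/20 dev eth0 proto kernel scope link src 172.29.180.5",
].join("\n");

const gateway = gatewayFromIpRoute(sampleRoute);
if (gateway !== undefined) {
  const setup = wslSetup(gateway);
  console.log(setup.allowedHosts);
  console.log(JSON.stringify(setup.mcpJson, null, 2));
}
```

The gateway address can change between WSL restarts, so regenerating both values from the current `ip route show` output is safer than hard-coding the address.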

Troubleshooting

Audio is not playing

1. Check if VOICEVOX Engine is running

curl http://localhost:50021/speakers

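If the curl call succeeds, the JSON it returns is also the quickest way to see which numeric IDs are available. The sketch below flattens that response into one row per voice style. It assumes the response shape documented by VOICEVOX Engine (an array of speakers, each with a `styles` array of `name` and `id`), and that the `speaker` parameter of this server corresponds to those style IDs. The sample data and the `listStyles` helper are illustrative only.

```typescript
// Subset of the fields assumed to be present in the /speakers response.
interface VoicevoxStyle {
  name: string;
  id: number;
}
interface VoicevoxSpeaker {
  name: string;
  styles: VoicevoxStyle[];
}

// Flattens speakers into one row per style, sorted by numeric ID.
function listStyles(
  speakers: VoicevoxSpeaker[],
): { character: string; style: string; id: number }[] {
  const rows: { character: string; style: string; id: number }[] = [];
  for (const speaker of speakers) {
    for (const style of speaker.styles) {
      rows.push({ character: speaker.name, style: style.name, id: style.id });
    }
  }
  return rows.sort((a, b) => a.id - b.id);
}

// Illustrative sample (hypothetical names and IDs, not engine output).
const sampleSpeakers: VoicevoxSpeaker[] = [
  { name: "Character A", styles: [{ name: "Normal", id: 2 }, { name: "Sweet", id: 0 }] },
  { name: "Character B", styles: [{ name: "Normal", id: 3 }, { name: "Sweet", id: 1 }] },
];

for (const row of listStyles(sampleSpeakers)) {
  console.log(`${row.id}\t${row.character} (${row.style})`);
}
```

With Node.js 18 or later, the real data could be loaded with `await (await fetch("http://localhost:50021/speakers")).json()` before passing it to the helper.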
2. Check platform-specific playback tools

| OS | Required Tool |
| --- | --- |
| Linux | One of aplay, paplay, play, ffplay |
| macOS | afplay (pre-installed) |
| Windows | PowerShell (pre-installed) |

Not recognized by MCP client

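  • Check package installation: npm list -g @kajidog/mcp-tts-voicevox (only meaningful for a global install; with npx -y the package is fetched on demand and does not appear in this list)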
  • Verify JSON syntax in config file
  • Restart the client

Package Structure

| Package | Description |
| --- | --- |
| @kajidog/mcp-tts-voicevox | MCP server |
| @kajidog/voicevox-client | General-purpose VOICEVOX client library (can be used independently) |

Developer Information

Setup

git clone https://github.com/kajidog/mcp-tts-voicevox.git
cd mcp-tts-voicevox
pnpm install

Commands

| Command | Description |
| --- | --- |
| pnpm build | Build all packages |
| pnpm test | Run tests |
| pnpm lint | Run lint |
| pnpm dev | Start dev server |
| pnpm dev:stdio | Dev with stdio mode |

License

ISC
