Text-to-speech MCP server for AI agents.

12 voices • 5 presets • 8 emotions • SSML-lite • SFX tags • multi-speaker dialogue
Swappable TTS backends. Guardrails built in. Ships as a single npx command.

Highlights

MCP native — stdio transport, works with Claude Desktop, Cursor, and any MCP client
5 tools — voice_speak, voice_dialogue, voice_status, voice_interrupt, voice_inner_monologue
12 approved voices — curated set with presets (narrator, announcer, whisper, storyteller, assistant)
Emotion spans — 8 emotions via [happy]...[/happy] inline markup
SSML-lite — <break>, <emphasis>, <prosody> without full SSML complexity
SFX tags — [ding], [chime], [whoosh], [tada], [error], [click] inline sound effects
Multi-speaker dialogue — Speaker: line format with auto-cast and pause directives
Guardrails — rate limiting, concurrency semaphore, request timeouts, path traversal protection, secret redaction
Swappable backends — Mock (built-in), HTTP proxy, Python bridge, or bring your own

Quick Start

npx @mcptoolshop/voice-soundboard-mcp

Or install globally:

npm install -g @mcptoolshop/voice-soundboard-mcp
voice-soundboard-mcp

Claude Desktop / MCP Client Config

Add to your MCP client configuration (e.g. claude_desktop_config.json):

{
  "mcpServers": {
    "voice-soundboard": {
      "command": "npx",
      "args": ["-y", "@mcptoolshop/voice-soundboard-mcp"]
    }
  }
}

With options:

{
  "mcpServers": {
    "voice-soundboard": {
      "command": "npx",
      "args": [
        "-y", "@mcptoolshop/voice-soundboard-mcp",
        "--artifact=path",
        "--output-dir=/tmp/voice-output",
        "--timeout=30000",
        "--max-concurrent=2"
      ]
    }
  }
}

MCP Tools

`voice_speak`

Synthesize speech from text.

text:         "Hello world!"
voice?:       "am_fenrir"          # Voice ID or preset name
speed?:       1.0                  # 0.5 - 2.0
format?:      "wav"                # wav | mp3 | ogg | raw
artifactMode?: "path"             # path | base64
sfx?:         true                # Enable [ding], [chime] etc.

`voice_dialogue`

Multi-speaker dialogue synthesis.

script:       "Alice: Hello!\nBob: Hey there!"
cast?:        { "Alice": "af_sky", "Bob": "am_fenrir" }
speed?:       1.0
concat?:      true                 # Combine into single file
debug?:       true                 # Include cue_sheet

`voice_status`

Returns engine health, available voices, presets, and backend info. No arguments.

`voice_interrupt`

Stop or rollback active synthesis.

streamId?:    "stream-123"
reason?:      "user_spoke"         # user_spoke | context_change | timeout | manual

`voice_inner_monologue`

Ephemeral micro-utterances for ambient narration. Requires --ambient flag or VOICE_SOUNDBOARD_AMBIENT_ENABLED=1.

text:         "Interesting..."     # Max 500 chars, auto-redacted
category?:    "thinking"           # general | thinking | observation | debug

Voices

ID	Name	Accent	Gender
`af_aoede`	Aoede	American	Female
`af_jessica`	Jessica	American	Female
`af_sky`	Sky	American	Female
`am_eric`	Eric	American	Male
`am_fenrir`	Fenrir	American	Male
`am_liam`	Liam	American	Male
`am_onyx`	Onyx	American	Male
`bf_alice`	Alice	British	Female
`bf_emma`	Emma	British	Female
`bf_isabella`	Isabella	British	Female
`bm_george`	George	British	Male
`bm_lewis`	Lewis	British	Male

Presets

Preset	Voice	Speed	Description
`narrator`	`bm_george`	0.95	Calm documentary style
`announcer`	`am_onyx`	1.05	News anchor energy
`whisper`	`af_aoede`	0.85	Soft, intimate
`storyteller`	`bf_emma`	0.90	Warm bedtime-story feel
`assistant`	`af_jessica`	1.0	Neutral, helpful

Emotion Spans

Wrap text in emotion tags to control prosody:

[happy]Great news![/happy] But [sad]I have to go.[/sad]

Supported: happy, sad, angry, fearful, surprised, disgusted, calm, excited

CLI Flags

Flag	Default	Description
`--artifact=path\|base64`	`path`	Audio delivery mode
`--output-dir=<path>`	`<tmpdir>/voice-soundboard/`	Output directory
`--backend=mock\|http`	`mock`	Backend selection
`--backend-url=<url>`	—	HTTP backend URL
`--ambient`	off	Enable inner-monologue system
`--max-concurrent=<n>`	`1`	Max concurrent synthesis requests
`--timeout=<ms>`	`20000`	Per-request timeout
`--retention-minutes=<n>`	`240`	Auto-cleanup age (0 to disable)

Packages

This is a pnpm monorepo with two publishable packages:

Package	Description	npm
`@mcptoolshop/voice-soundboard-core`	Backend-agnostic core library (validation, SSML, chunking, schemas)
`@mcptoolshop/voice-soundboard-mcp`	MCP server with CLI, guardrails, and transport

Development

# Install
pnpm install

# Build
pnpm build

# Test (342 tests)
pnpm test

# Lint
pnpm --filter @mcptoolshop/voice-soundboard-core exec ruff check .  # Python legacy
pnpm exec ruff check voice_soundboard/ tests/ --ignore=E501             # Python legacy

Project Structure

mcp-voice-soundboard/
  packages/
    core/               @mcptoolshop/voice-soundboard-core
      src/
        limits.ts         SHIP_LIMITS, text/chunk limits
        schemas.ts        VoiceRequest, VoiceResponse, error codes
        artifact.ts       resolveOutputDir, path sandbox
        voices.ts         Approved voice registry + presets
        emotion.ts        Emotion span parser
        ssml/             SSML-lite parser + limits
        chunking/         Text chunker
        sfx/              SFX tag parser + registry
        sandbox.ts        Safe filenames, symlink checks
        ambient.ts        AmbientEmitter for inner monologue
        redact.ts         PII/secret redaction
    mcp-server/         @mcptoolshop/voice-soundboard-mcp
      src/
        server.ts         MCP tool registration + guardrail wiring
        cli.ts            CLI entrypoint (stdio transport)
        backend.ts        Backend abstraction + mock/HTTP
        concurrency.ts    SynthesisSemaphore
        rateLimit.ts      ToolRateLimiter (sliding window)
        timeout.ts        withTimeout utility
        retention.ts      Output file cleanup timer
        redact.ts         Server-level redaction
        validation.ts     Synthesis result validation
        tools/            Individual tool handlers
  assets/               Logo, audio event manifests
  docs/                 Architecture docs

Security

See SECURITY.md for vulnerability reporting.

See THREAT_MODEL.md for the full threat surface analysis.

Project	Description
soundboard-plugin	Claude Code plugin — slash commands, emotion-aware narration

Support

Questions / help: Discussions
Bug reports: Issues
Security: SECURITY.md

License

MIT

Built by mcp-tool-shop

@mcptoolshop/voice-soundboard-mcp

Quick Install