Speech AI - Pronunciation, TTS & STT

Pronunciation scoring, text-to-speech (12 voices), and speech-to-text with timestamps.

Updated: Feb 20, 2026

Speech AI Examples

Production-ready examples for integrating Brainiall Speech AI APIs into your applications and AI agents.

APIs

| API | Model Size | What It Does |
| --- | --- | --- |
| Pronunciation Assessment | 17 MB | Scores pronunciation accuracy at word and phoneme level |
| Speech-to-Text (STT) | 17 MB (shared) | Transcribes audio with word-level timestamps and confidence |
| Text-to-Speech (TTS) | 115 MB | Generates natural speech from text, 12 English voices (#1 TTS Arena) |

Together the models total about 132 MB (under 150 MB) and run on CPU; no GPU is required. STT and Pronunciation Assessment share the same compact 17 MB model.

Quick Start

1. Get an API Key

Subscribe on the Azure Marketplace or contact us at fasuizu@brainiall.com.

2. Set Your Key

export SPEECH_AI_API_KEY="your-subscription-key"

3. Run an Example

Python:

pip install httpx
python python/basic_usage.py

JavaScript (Node.js 18+):

node javascript/basic_usage.js

curl:

bash curl/examples.sh

Examples

| File | Description |
| --- | --- |
| python/basic_usage.py | All 3 APIs in one script: assess, transcribe, synthesize |
| python/pronunciation_tutor.py | Interactive pronunciation tutor using all 3 APIs together |
| javascript/basic_usage.js | Node.js examples for all 3 APIs |
| curl/examples.sh | curl commands for every endpoint |
| mcp/claude-desktop-config.json | MCP config for Claude Desktop |
| mcp/cursor-config.json | MCP config for Cursor IDE |

MCP Integration

These APIs are available as MCP servers for AI agents and IDE integrations:

| Platform | URL | Pricing |
| --- | --- | --- |
| Smithery | pronunciation-assessment | Free (discovery) |
| MCPize | pronunciation-assessment | $9.99/mo |
| Apify | pronunciation-assessment-mcp | $0.02/call |

See the mcp/ directory for configuration examples.

Marketplaces

| Marketplace | Status | Link |
| --- | --- | --- |
| Azure Marketplace | Live | View Listing |
| AWS Marketplace | Coming Soon | |

API Reference

Base URL

https://apim-ai-apis.azure-api.net

Authentication

All requests require the Ocp-Apim-Subscription-Key header:

Ocp-Apim-Subscription-Key: your-key-here
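
As a minimal sketch of the authentication step, the helper below builds the request headers from the `SPEECH_AI_API_KEY` environment variable set in Quick Start. The function name `auth_headers` is hypothetical and not part of the repository examples; only the header name and base URL come from this README.

```python
import os

# Base URL as given in this README.
BASE_URL = "https://apim-ai-apis.azure-api.net"


def auth_headers(api_key=None):
    """Return headers carrying the subscription key (hypothetical helper).

    Falls back to the SPEECH_AI_API_KEY environment variable when no key
    is passed explicitly.
    """
    key = api_key or os.environ.get("SPEECH_AI_API_KEY")
    if not key:
        raise RuntimeError("Set SPEECH_AI_API_KEY or pass api_key explicitly")
    return {
        "Ocp-Apim-Subscription-Key": key,
        "Content-Type": "application/json",
    }
```

Failing early when the key is missing is a design choice of this sketch: it keeps a misconfigured environment from surfacing later as an authentication error from the gateway.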

Pronunciation Assessment

POST /pronunciation/assess/base64
Content-Type: application/json

{
  "audio": "<base64-encoded-wav>",
  "text": "hello world",
  "format": "wav"
}

Response:

{
  "overallScore": 85.5,
  "words": [
    {
      "word": "hello",
      "score": 90.0,
      "phonemes": [
        {"phoneme": "HH", "score": 95.0},
        {"phoneme": "AH", "score": 85.0},
        {"phoneme": "L", "score": 92.0},
        {"phoneme": "OW", "score": 88.0}
      ]
    }
  ]
}
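
The request and response above could be handled along these lines. This is a sketch that uses only the Python standard library (the repository examples use httpx instead); `build_assess_request` and `weakest_phonemes` are hypothetical helper names, and the 90-point threshold is an arbitrary illustration, not a value defined by the API.

```python
import base64
import json
import urllib.request

BASE_URL = "https://apim-ai-apis.azure-api.net"  # from this README


def build_assess_request(wav_bytes, text, api_key):
    """Build (but do not send) a POST to /pronunciation/assess/base64."""
    body = {
        "audio": base64.b64encode(wav_bytes).decode("ascii"),
        "text": text,
        "format": "wav",
    }
    return urllib.request.Request(
        BASE_URL + "/pronunciation/assess/base64",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Ocp-Apim-Subscription-Key": api_key,
            "Content-Type": "application/json",
        },
        method="POST",
    )


def weakest_phonemes(result, threshold=90.0):
    """Return (word, phoneme, score) tuples below threshold, lowest first.

    Assumes the response shape shown in this README; the threshold is
    illustrative only.
    """
    weak = [
        (w["word"], p["phoneme"], p["score"])
        for w in result.get("words", [])
        for p in w.get("phonemes", [])
        if p["score"] < threshold
    ]
    return sorted(weak, key=lambda item: item[2])
```

To send the request, pass the object to `urllib.request.urlopen(...)` and decode the JSON body; applied to the sample response above, `weakest_phonemes` would report the `AH` and `OW` phonemes of "hello".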

Speech-to-Text

POST /stt/transcribe/base64
Content-Type: application/json

{
  "audio": "<base64-encoded-wav>",
  "include_timestamps": true
}

Response:

{
  "text": "hello world",
  "language": "en",
  "words": [
    {"word": "hello", "start": 0.0, "end": 0.45},
    {"word": "world", "start": 0.50, "end": 0.95}
  ]
}
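
One way to work with the word-level timestamps is sketched below. It assumes the response shape shown above; `build_transcribe_body` and `word_timings` are hypothetical names introduced for illustration.

```python
import base64


def build_transcribe_body(wav_bytes, include_timestamps=True):
    """Build the JSON body for POST /stt/transcribe/base64."""
    return {
        "audio": base64.b64encode(wav_bytes).decode("ascii"),
        "include_timestamps": include_timestamps,
    }


def word_timings(result):
    """Return (word, start, duration) tuples from a transcription result.

    Durations are rounded to milliseconds to hide floating-point noise.
    """
    return [
        (w["word"], w["start"], round(w["end"] - w["start"], 3))
        for w in result.get("words", [])
    ]
```

The body can be serialized with `json.dumps` and posted with the same headers as any other request; `word_timings` is a pure function, so it can be tested against saved responses without calling the API.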

Text-to-Speech

POST /tts/synthesize
Content-Type: application/json

{
  "text": "Hello, welcome to Speech AI.",
  "voice": "af_heart",
  "speed": 1.0,
  "format": "wav"
}

Response: Binary WAV audio data.
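
Because the response is raw audio rather than JSON, a client may want to check the bytes before writing them to disk. The sketch below uses only the standard library; `build_tts_body`, `looks_like_wav`, and `synthesize` are hypothetical helpers, and the assumption that a failed call returns a non-WAV body is not confirmed by this README.

```python
import json
import urllib.request

BASE_URL = "https://apim-ai-apis.azure-api.net"  # from this README


def build_tts_body(text, voice="af_heart", speed=1.0):
    """Build the JSON body for POST /tts/synthesize."""
    return {"text": text, "voice": voice, "speed": speed, "format": "wav"}


def looks_like_wav(data):
    """Check the RIFF/WAVE magic bytes that begin every WAV file."""
    return len(data) >= 12 and data[:4] == b"RIFF" and data[8:12] == b"WAVE"


def synthesize(text, api_key, out_path, voice="af_heart"):
    """Request speech and save it to out_path (performs a network call)."""
    req = urllib.request.Request(
        BASE_URL + "/tts/synthesize",
        data=json.dumps(build_tts_body(text, voice)).encode("utf-8"),
        headers={
            "Ocp-Apim-Subscription-Key": api_key,
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        audio = resp.read()
    # Assumption: anything without a WAV header is treated as a failure.
    if not looks_like_wav(audio):
        raise ValueError("Response does not look like WAV audio")
    with open(out_path, "wb") as fh:
        fh.write(audio)
    return out_path
```

A call such as `synthesize("Hello, welcome to Speech AI.", key, "hello.wav")` would write the audio to `hello.wav`, given a valid subscription key.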

Available TTS Voices

GET /tts/voices

Health Checks

GET /pronunciation/health
GET /stt/health
GET /tts/health
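
A small readiness check over the three endpoints could look like the sketch below. The response body of the health endpoints is not documented here, so the code treats an HTTP 200 status as healthy, which is an assumption; `health_urls` and `check_health` are hypothetical names.

```python
import urllib.request

BASE_URL = "https://apim-ai-apis.azure-api.net"  # from this README

# Health paths as listed in this README.
HEALTH_PATHS = {
    "pronunciation": "/pronunciation/health",
    "stt": "/stt/health",
    "tts": "/tts/health",
}


def health_urls(base_url=BASE_URL):
    """Map each service name to its full health-check URL."""
    return {name: base_url + path for name, path in HEALTH_PATHS.items()}


def check_health(api_key, timeout=5.0):
    """Return {service: bool}; True means the endpoint answered HTTP 200.

    Performs network calls. Treating 200 as healthy is an assumption.
    """
    status = {}
    for name, url in health_urls().items():
        req = urllib.request.Request(
            url, headers={"Ocp-Apim-Subscription-Key": api_key}
        )
        try:
            with urllib.request.urlopen(req, timeout=timeout) as resp:
                status[name] = resp.status == 200
        except OSError:
            # Covers HTTP errors, connection failures, and timeouts.
            status[name] = False
    return status
```

Catching `OSError` keeps one unreachable service from aborting the whole check, since `urllib.error.URLError` and timeout errors both derive from it.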

Try It Live

The Hugging Face demo lets you test pronunciation assessment directly in your browser, with no API key needed.

License

MIT — Brainiall
