Voice Emotion
Re-render speech with a specified emotional tone — from text + voice_id, or from existing source audio.
POST
Apply an emotional tone to speech. Two input shapes:
- From text: pass text plus a voice_id; the engine synthesizes directly with the requested emotion (fastest path).
- From source audio: pass audio_url; the engine transcribes, re-renders with the requested emotion in the same voice, and returns the new audio (preserves the original speaker).
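The two input shapes above can be sketched as two small payload builders. This is a minimal illustration, not the service's client library: text, voice_id, and audio_url are the field names used on this page, while the "emotion" key and the helper function names are assumptions made for the example.

```python
import json


def from_text_payload(text, voice_id, emotion):
    """Input shape 1: synthesize text directly in the chosen voice."""
    # "emotion" is an assumed field name; check the live request schema.
    return {"text": text, "voice_id": voice_id, "emotion": emotion}


def from_audio_payload(audio_url, emotion):
    """Input shape 2: re-render existing audio, keeping the source speaker."""
    # No voice_id here: the page says it is ignored when audio_url is set.
    return {"audio_url": audio_url, "emotion": emotion}


if __name__ == "__main__":
    # Placeholder values for illustration only.
    print(json.dumps(from_text_payload("Welcome back!", "voice_123", "happy"), indent=2))
    print(json.dumps(from_audio_payload("https://example.com/clip.wav", "calm"), indent=2))
```

Keeping the two shapes in separate builders makes it harder to send text and audio_url together by accident, since the page describes them as mutually exclusive.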
Authorization
Bearer token, in the form Bearer API_key.
Request Body
Target emotional tone. Options: neutral, happy, sad, angry, fearful, surprised, calm, excited.
Text to synthesize. Mutually exclusive with audio_url. When set, voice_id should also be supplied to pick the target voice.
Voice to render the synthesized text in. Required when text is set; ignored when audio_url is set (the source speaker is preserved).
Source audio URL to re-render. Mutually exclusive with text.
How strongly to apply the emotion. Range: 0 to 1. Default: 0.7. Values above 0.85 can over-stylize; sweep at lower values first.
Output audio format. Options: wav, mp3. Default: wav.
When to use what
- Designing a new voice with an inherent style: use Voice Design with emotion baked into the profile (the voice is created with that tone).
- One-off emotional re-rendering of existing speech: use this endpoint with audio_url.
- Single utterance in an existing voice + emotion: use this endpoint with text + voice_id.
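The request-body rules on this page (exactly one of text or audio_url, voice_id required with text, a 0 to 1 strength range, and the listed emotion and format options) could be checked client-side before sending a request. The sketch below is only an illustration: the "emotion", "strength", and "format" keys are hypothetical names, since the field names for those parameters are not shown here, and validate_request is not part of any official SDK.

```python
# Options and defaults as listed in the request body description.
EMOTIONS = ("neutral", "happy", "sad", "angry",
            "fearful", "surprised", "calm", "excited")
FORMATS = ("wav", "mp3")


def validate_request(body):
    """Return a list of problems in a request body (empty list if none)."""
    problems = []
    has_text = bool(body.get("text"))
    has_audio = bool(body.get("audio_url"))

    # text and audio_url are mutually exclusive; one of them is needed.
    if has_text == has_audio:
        problems.append("supply exactly one of text or audio_url")
    # voice_id picks the target voice when synthesizing from text.
    if has_text and not body.get("voice_id"):
        problems.append("voice_id is required when text is set")
    # "emotion" is an assumed key name.
    if body.get("emotion") not in EMOTIONS:
        problems.append("emotion must be one of the documented options")
    # "strength" is an assumed key name; the documented default is 0.7.
    strength = body.get("strength", 0.7)
    is_number = isinstance(strength, (int, float)) and not isinstance(strength, bool)
    if not is_number or not 0 <= strength <= 1:
        problems.append("strength must be between 0 and 1")
    # "format" is an assumed key name; the documented default is wav.
    if body.get("format", "wav") not in FORMATS:
        problems.append("format must be wav or mp3")
    return problems
```

For example, validate_request({"text": "Hello", "emotion": "happy"}) reports the missing voice_id, while a body with audio_url and a valid emotion passes with the defaults applied.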