POST /v1/training/voice
curl --request POST \
  --url https://geoff.ai/api/v1/training/voice \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '{
    "reference_urls": [
      "https://files.geoff.ai/voices/sample1.wav",
      "https://files.geoff.ai/voices/sample2.wav",
      "https://files.geoff.ai/voices/sample3.wav"
    ],
    "name": "brand_voice_v1"
  }'
{
  "data": {
    "task_id": "trn_vox_abc123",
    "name": "brand_voice_v1",
    "voice_id": "brand_voice_v1",
    "status": "queued"
  },
  "trace_id": "04ede0ab069fb1ba8be5156a24b1e081"
}
Train a custom voice by averaging across multiple reference clips. Training is heavier than single-clip Voice Clone but more robust for distribution use, for example a brand voice that needs to sound consistent across hundreds of utterances. The trained voice is persisted to the catalog and appears in List Voices under scope=custom. Use the returned name, which the response also echoes as voice_id, as the voice_id on subsequent T2A calls.
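As a minimal sketch of the same call from Python, the snippet below builds the request shown in the curl example and reads the voice_id out of a response shaped like the sample. It uses only the standard library. The helper names build_training_request and voice_id_from_response are hypothetical, and the response handling assumes the body always matches the sample above.

```python
import json
import urllib.request

# Endpoint taken from the curl example on this page.
API_URL = "https://geoff.ai/api/v1/training/voice"


def build_training_request(token, reference_urls, name):
    """Build (but do not send) the POST request for a voice training task."""
    body = json.dumps({"reference_urls": list(reference_urls), "name": name})
    return urllib.request.Request(
        API_URL,
        data=body.encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def voice_id_from_response(payload):
    """Return the voice_id from a parsed response body.

    Assumes the payload has the same shape as the sample response.
    """
    return payload["data"]["voice_id"]
```

To send the request, pass the result to urllib.request.urlopen and parse the body with json.load. The task starts with status "queued" in the sample response, so the voice may not be usable immediately.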

Authorization

Authorization
string
required
Bearer token. Send the header value as Bearer followed by a space and your API key.

Request Body

reference_urls
array
required
Array of URLs pointing to reference audio clips. 3–10 clips of 5–30 seconds each is the sweet spot. Each clip should be clean speech with consistent room tone. Pre-process noisy field recordings with voice denoise first.
name
string
required
Stable snake_case identifier for the trained voice. It becomes the voice’s permanent voice_id for T2A calls. Allowed characters are lowercase letters, digits, and underscores.
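The parameter rules above can be checked client-side before submitting a training task. The following is a sketch under stated assumptions: the function name check_training_input is hypothetical, the name pattern is derived from the "lowercase letters, digits, and underscores" rule, and the 3 to 10 clip range is treated as a recommendation rather than a hard limit, since this page does not say what the server enforces.

```python
import re

# Lowercase letters, digits, and underscores only (per the name description).
NAME_PATTERN = re.compile(r"[a-z0-9_]+")


def check_training_input(reference_urls, name):
    """Return a list of problems found; an empty list means nothing was flagged."""
    problems = []
    if not NAME_PATTERN.fullmatch(name):
        problems.append(
            "name must use only lowercase letters, digits, and underscores"
        )
    if not reference_urls:
        problems.append("reference_urls must contain at least one URL")
    elif not 3 <= len(reference_urls) <= 10:
        # Assumption: the documented range is guidance, not a server limit.
        problems.append("3 to 10 reference clips are recommended")
    return problems
```

Clip duration (5 to 30 seconds) and audio cleanliness cannot be verified from the URLs alone, so this check leaves them out.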

Single-clip alternative

For a quick clone from a single reference clip, use Voice Clone instead. It skips the multi-sample averaging step and returns a usable voice_id in seconds. For parameter-driven voices with no reference audio at all, use Voice Design.