Add Voiceover Track

Overview

Add custom Text-to-Speech (TTS) voices from external providers such as Google Cloud, AWS Polly, and ElevenLabs to your Pictory voiceover library. Pictory ships with a pre-configured list of voices from these providers, but each provider may offer additional voices that are not in Pictory’s default library. This endpoint adds such a voice to your library from the voice details and configuration you supply.

You need a valid API key to use this endpoint. Get your API key from the API Access page in your Pictory dashboard.

API Endpoint

POST https://api.pictory.ai/pictoryapis/v1/voiceovers/tracks

Request Parameters

Headers

Authorization

string

required

API key for authentication (starts with pictai_)

Authorization: YOUR_API_KEY
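As a minimal sketch of the header described above, the helper below builds the request headers and rejects keys that do not start with pictai_. The function name build_headers is hypothetical, and the prefix check is only a local sanity check; the API itself decides whether a key is valid.

```python
def build_headers(api_key: str) -> dict:
    """Return headers for a Pictory API request (hypothetical helper)."""
    key = api_key.strip()
    # The docs state that keys start with "pictai_"; this catches obvious paste errors.
    if not key.startswith("pictai_"):
        raise ValueError("API key is expected to start with 'pictai_'")
    return {
        "Authorization": key,  # the raw key, as shown in the header example above
        "Content-Type": "application/json",
    }
```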

Body Parameters

name

string

required

Unique name for the voiceover track. This name is displayed in your Pictory voiceover library.

Example: "Arnold"

description

string

Optional description of the voiceover to help identify its characteristics.

Example: "A deep male voice"

engine

string

required

The TTS engine to use. Valid values depend on the service:

AWS (service: "aws"): neural, standard
Google (service: "google"): standard, WaveNet, Neural2
ElevenLabs (service: "elevenlabs"): eleven_v3 (latest, most expressive), eleven_multilingual_v2 (best for long-form), eleven_flash_v2_5 (ultra-fast, multilingual), eleven_turbo_v2_5 (balanced quality/speed), eleven_flash_v2 (ultra-fast, English only), eleven_turbo_v2 (English only), eleven_monolingual_v1 (deprecated), eleven_multilingual_v1 (deprecated)

Example: "WaveNet"

service

string

required

The TTS provider service.

Allowed values: aws, google, elevenlabs

Example: "google"

language

string

required

The language code in IETF format (e.g., en-US, en-GB, fr-FR).

Example: "en-US"

voiceId

string

required

Unique identifier for the voice from the external TTS provider.

Google: Voice name (e.g., "en-US-Journey-D")
AWS: Voice name (e.g., "Danielle")
ElevenLabs: Unique voice ID from the voice library

Example: "en-US-Journey-D"

accent

string

required

The accent or regional variant of the voice.

Example: "American"

age

string

required

The age group of the voice.

Typical values: Child, Teen, Adult, Senior

Example: "Adult"

gender

string

required

The gender of the voice.

Typical values: Male, Female, Neutral

Example: "Male"

sample

string (uri)

Optional URL to an audio sample (MP3) for previewing this voice.

Example: "https://example.com/sample.mp3"

publicUserId

string

Public user ID used to identify ElevenLabs users. Required when sharing a voice from your ElevenLabs account to Pictory.

Valid only for: service: "elevenlabs"

elevenlabsVoiceSettings

object

Settings specific to ElevenLabs voices. Valid only when service is elevenlabs.


elevenlabsVoiceSettings.similarityBoost

number

required

The similarity boost setting for voice quality (0.0 to 1.0)

elevenlabsVoiceSettings.stability

number

required

The stability setting for consistent voice output (0.0 to 1.0)

elevenlabsVoiceSettings.useSpeakerBoost

boolean

Whether to use speaker boost for enhanced voice clarity
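The elevenlabsVoiceSettings ranges above can be checked before a request is sent. The following is a sketch only: build_elevenlabs_settings is a hypothetical helper, and treating 0.0 and 1.0 as inclusive bounds is an assumption based on the "(0.0 to 1.0)" wording.

```python
def build_elevenlabs_settings(similarity_boost, stability, use_speaker_boost=None):
    """Return an elevenlabsVoiceSettings dict after local range checks (hypothetical helper)."""
    for label, value in (("similarityBoost", similarity_boost), ("stability", stability)):
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise TypeError(f"{label} must be a number")
        # Assumption: both bounds are inclusive.
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{label} must be between 0.0 and 1.0, got {value}")
    settings = {
        "similarityBoost": float(similarity_boost),
        "stability": float(stability),
    }
    # useSpeakerBoost is not marked required, so it is included only when given.
    if use_speaker_boost is not None:
        settings["useSpeakerBoost"] = bool(use_speaker_boost)
    return settings
```

The returned dictionary can be placed under the elevenlabsVoiceSettings key of the request body.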

Response

Returns the created voiceover track object with its assigned ID and configuration details.

id

integer

Unique identifier for the voiceover track in your Pictory voiceover library

name

string

The name of the voiceover track

accent

string

The accent of the voice

gender

string

The gender of the voice

language

string

The language code of the voice

sample

string (uri)

The sample URL of the voice track

service

string

The TTS service provider (aws, google, or elevenlabs)

engine

string

The voice synthesis engine used

Response Examples

{
  "accent": "American",
  "category": "standard",
  "engine": "WaveNet",
  "gender": "Male",
  "id": 12345,
  "language": "en-US",
  "name": "Arnold",
  "sample": "https://example.com/sample.mp3",
  "service": "google",
  "ssmlHelp": "https://docs.pictory.ai/docs/supported-ssml-tags#category-c",
  "ssmlSupportCategory": "C"
}
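One way to consume this response is to map it onto a small typed record. The sketch below assumes the field names shown in the example above; VoiceoverTrack and parse_track are hypothetical names. The example also contains fields that the field list does not describe (category, ssmlHelp, ssmlSupportCategory), so the parser simply ignores anything it does not recognize.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class VoiceoverTrack:
    """Documented response fields only (hypothetical container)."""
    id: int
    name: str
    service: str
    engine: str
    language: str
    accent: str
    gender: str
    sample: Optional[str] = None


def parse_track(payload: dict) -> VoiceoverTrack:
    """Build a VoiceoverTrack, ignoring undocumented keys such as 'category'."""
    return VoiceoverTrack(
        id=payload["id"],
        name=payload["name"],
        service=payload["service"],
        engine=payload["engine"],
        language=payload["language"],
        accent=payload["accent"],
        gender=payload["gender"],
        sample=payload.get("sample"),  # optional in the request, so read it defensively
    )


# The response example from above, parsed as JSON text.
example = json.loads("""{
  "accent": "American", "category": "standard", "engine": "WaveNet",
  "gender": "Male", "id": 12345, "language": "en-US", "name": "Arnold",
  "sample": "https://example.com/sample.mp3", "service": "google",
  "ssmlHelp": "https://docs.pictory.ai/docs/supported-ssml-tags#category-c",
  "ssmlSupportCategory": "C"
}""")
track = parse_track(example)
```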

Code Examples

Replace YOUR_API_KEY with your actual API key, which starts with pictai_.

# Add a Google WaveNet voice
curl --request POST \
  --url https://api.pictory.ai/pictoryapis/v1/voiceovers/tracks \
  --header 'Authorization: YOUR_API_KEY' \
  --header 'Content-Type: application/json' \
  --data '{
    "engine": "WaveNet",
    "service": "google",
    "name": "Arnold",
    "description": "A deep male voice",
    "language": "en-US",
    "voiceId": "en-US-Journey-D",
    "accent": "American",
    "age": "Adult",
    "gender": "Male",
    "sample": "https://example.com/sample.mp3"
  }' | python -m json.tool

Usage Notes

Voice Availability: The voice must exist in the external TTS provider’s catalog before you can add it to Pictory. Verify the voice ID is correct for the selected service.

Unique Names: Each voice name must be unique within your Pictory library. If a voice with the same name exists, the request will fail with a 409 error.

Engine Compatibility: Ensure you use the correct engine value for your selected service. Using an incompatible engine will result in a validation error.

ElevenLabs Voices: When adding ElevenLabs voices, you must provide publicUserId and elevenlabsVoiceSettings for proper voice configuration and sharing.

Service-Specific Examples

Add Google Cloud Voice

Add a Google WaveNet voice to your library:

import requests

def add_google_voice(api_key, voice_id, name, language, accent, gender, age):
    """
    Add a Google Cloud TTS voice to Pictory
    """
    url = "https://api.pictory.ai/pictoryapis/v1/voiceovers/tracks"
    headers = {
        "Authorization": api_key,
        "Content-Type": "application/json"
    }

    voice_data = {
        "engine": "WaveNet",  # or "Neural2" or "standard"
        "service": "google",
        "name": name,
        "language": language,
        "voiceId": voice_id,
        "accent": accent,
        "age": age,
        "gender": gender
    }

    response = requests.post(url, headers=headers, json=voice_data)

    if response.status_code == 201:
        result = response.json()
        print(f"✓ Added Google voice: {result['name']} (ID: {result['id']})")
        return result
    else:
        print(f"✗ Failed to add voice: {response.json().get('message')}")
        return None

# Example usage
add_google_voice(
    api_key="YOUR_API_KEY",
    voice_id="en-US-Journey-D",
    name="Journey Male",
    language="en-US",
    accent="American",
    gender="Male",
    age="Adult"
)

Add AWS Polly Voice

Add an AWS Polly neural voice:

async function addAwsVoice(apiKey, voiceName, language, accent, gender, age) {
  const voiceData = {
    engine: 'neural',  // or 'standard'
    service: 'aws',
    name: voiceName,
    language: language,
    voiceId: voiceName,  // AWS uses voice name as ID
    accent: accent,
    age: age,
    gender: gender
  };

  const response = await fetch(
    'https://api.pictory.ai/pictoryapis/v1/voiceovers/tracks',
    {
      method: 'POST',
      headers: {
        'Authorization': `${apiKey}`,
        'Content-Type': 'application/json'
      },
      body: JSON.stringify(voiceData)
    }
  );

  if (response.status === 201) {
    const result = await response.json();
    console.log(`✓ Added AWS voice: ${result.name} (ID: ${result.id})`);
    return result;
  } else {
    const error = await response.json();
    console.log(`✗ Failed: ${error.message}`);
    return null;
  }
}

// Example usage (top-level await requires an ES module; otherwise wrap the call in an async function)
await addAwsVoice(
  'YOUR_API_KEY',
  'Danielle',
  'en-US',
  'American',
  'Female',
  'Adult'
);

Add ElevenLabs Voice

Add an ElevenLabs voice with custom settings:

import requests

def add_elevenlabs_voice(api_key, voice_id, public_user_id, name, language, accent, gender, age):
    """
    Add an ElevenLabs voice to Pictory
    """
    url = "https://api.pictory.ai/pictoryapis/v1/voiceovers/tracks"
    headers = {
        "Authorization": api_key,
        "Content-Type": "application/json"
    }

    voice_data = {
        "engine": "eleven_multilingual_v2",
        "service": "elevenlabs",
        "name": name,
        "language": language,
        "voiceId": voice_id,
        "publicUserId": public_user_id,
        "accent": accent,
        "age": age,
        "gender": gender,
        "elevenlabsVoiceSettings": {
            "similarityBoost": 0.75,
            "stability": 0.50,
            "useSpeakerBoost": True
        }
    }

    response = requests.post(url, headers=headers, json=voice_data)

    if response.status_code == 201:
        result = response.json()
        print(f"✓ Added ElevenLabs voice: {result['name']} (ID: {result['id']})")
        print(f"  Category: {result['category']}")
        return result
    else:
        print(f"✗ Failed: {response.json().get('message')}")
        return None

# Example usage
add_elevenlabs_voice(
    api_key="YOUR_API_KEY",
    voice_id="21m00Tcm4TlvDq8ikWAM",
    public_user_id="your_elevenlabs_public_id",
    name="Rachel Premium",
    language="en-US",
    accent="American",
    gender="Female",
    age="Adult"
)

Batch Add Multiple Voices

Add multiple voices from different providers:

import requests

def batch_add_voices(api_key, voices):
    """
    Add multiple voices to Pictory library
    """
    url = "https://api.pictory.ai/pictoryapis/v1/voiceovers/tracks"
    headers = {
        "Authorization": api_key,
        "Content-Type": "application/json"
    }

    results = {
        'success': [],
        'failed': []
    }

    for voice in voices:
        response = requests.post(url, headers=headers, json=voice)

        if response.status_code == 201:
            result = response.json()
            results['success'].append(result['name'])
            print(f"✓ Added: {result['name']} ({result['service']})")
        else:
            error = response.json()
            results['failed'].append({
                'name': voice.get('name'),
                'error': error.get('message')
            })
            print(f"✗ Failed: {voice.get('name')}")

    print(f"\nSummary: {len(results['success'])} added, {len(results['failed'])} failed")
    return results

# Example usage
voices_to_add = [
    {
        "engine": "WaveNet",
        "service": "google",
        "name": "Google Journey D",
        "language": "en-US",
        "voiceId": "en-US-Journey-D",
        "accent": "American",
        "age": "Adult",
        "gender": "Male"
    },
    {
        "engine": "neural",
        "service": "aws",
        "name": "AWS Danielle",
        "language": "en-US",
        "voiceId": "Danielle",
        "accent": "American",
        "age": "Adult",
        "gender": "Female"
    }
]

batch_add_voices("YOUR_API_KEY", voices_to_add)

Engine and Service Compatibility

Valid Engine Values by Service

AWS Polly

Service: aws

Valid Engines:

neural - High-quality neural voices
standard - Standard quality voices

Example Voice IDs: Joanna, Matthew, Danielle, Gregory

Google Cloud Text-to-Speech

Service: google

Valid Engines:

standard - Basic quality
WaveNet - Premium quality
Neural2 - Latest generation

Example Voice IDs: en-US-Journey-D, en-GB-Wavenet-A, en-US-Neural2-A

ElevenLabs

Service: elevenlabs

Valid Engines:

eleven_v3 - Latest and most advanced model with human-like, expressive speech across 70+ languages (recommended for emotional/dramatic content)
eleven_multilingual_v2 - Most lifelike model with rich emotional expression across 29 languages (recommended for professional long-form content)
eleven_flash_v2_5 - Ultra-fast model (~75ms latency) across 32 languages at 50% lower cost (recommended for real-time/streaming use cases)
eleven_turbo_v2_5 - Balanced quality and speed (~250-300ms latency) across 32 languages at 50% lower cost (recommended for quality-speed balance)
eleven_flash_v2 - Ultra-fast model (~75ms latency), English only
eleven_turbo_v2 - High quality with low latency (~250-300ms), English only
eleven_monolingual_v1 - First generation model, English only (deprecated - migrate to eleven_multilingual_v2)
eleven_multilingual_v1 - First generation multilingual model, 8 languages (deprecated - migrate to eleven_multilingual_v2)

Required Fields: publicUserId, elevenlabsVoiceSettings

Voice ID: Unique voice ID from your ElevenLabs voice library
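The engine lists in this section can be encoded as a lookup table so that a mismatch is caught before the request is sent. This is a sketch under stated assumptions: VALID_ENGINES and check_engine are hypothetical names, and the comparison is case-sensitive because the documented values mix cases (WaveNet, neural). Whether the API itself is case-sensitive is not stated here.

```python
# Engine values copied from the lists above, keyed by the service value.
VALID_ENGINES = {
    "aws": {"neural", "standard"},
    "google": {"standard", "WaveNet", "Neural2"},
    "elevenlabs": {
        "eleven_v3",
        "eleven_multilingual_v2",
        "eleven_flash_v2_5",
        "eleven_turbo_v2_5",
        "eleven_flash_v2",
        "eleven_turbo_v2",
        "eleven_monolingual_v1",   # deprecated
        "eleven_multilingual_v1",  # deprecated
    },
}


def check_engine(service, engine):
    """Return None when the pair is listed above, otherwise a readable error string."""
    valid = VALID_ENGINES.get(service)
    if valid is None:
        return f"Unknown service {service!r}; expected one of: {', '.join(sorted(VALID_ENGINES))}"
    if engine not in valid:
        return f"Engine {engine!r} is not valid for service {service!r}; expected one of: {', '.join(sorted(valid))}"
    return None
```

Calling check_engine("aws", "WaveNet") returns a message listing neural and standard, while check_engine("google", "WaveNet") returns None.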

Error Handling

400 Bad Request - Invalid Engine

Cause: Incompatible engine value for the selected service

Solution:

Verify the engine is valid for your service (see the Engine and Service Compatibility section above)
For AWS: use neural or standard
For Google: use standard, WaveNet, or Neural2
For ElevenLabs: use eleven_monolingual_v1, eleven_multilingual_v1, eleven_multilingual_v2, eleven_flash_v2, eleven_flash_v2_5, eleven_turbo_v2, eleven_turbo_v2_5, or eleven_v3

400 Bad Request - Missing Required Fields

Cause: Required fields are missing from the request

Solution:

Ensure all required fields are included: name, engine, service, language, voiceId, accent, age, gender
For ElevenLabs, also include publicUserId and elevenlabsVoiceSettings
Verify field values match expected types and formats
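The required-field rules above lend themselves to a pre-flight check. The sketch below uses hypothetical names (REQUIRED_FIELDS, ELEVENLABS_FIELDS, missing_fields) and treats None and the empty string as missing, which is an assumption about how the API validates values.

```python
# Field names copied from the list above.
REQUIRED_FIELDS = (
    "name", "engine", "service", "language",
    "voiceId", "accent", "age", "gender",
)
# Additional fields listed above for ElevenLabs voices.
ELEVENLABS_FIELDS = ("publicUserId", "elevenlabsVoiceSettings")


def missing_fields(payload):
    """Return the required field names that are absent or empty in payload."""
    required = list(REQUIRED_FIELDS)
    if payload.get("service") == "elevenlabs":
        required.extend(ELEVENLABS_FIELDS)
    # Assumption: None and "" both count as missing.
    return [field for field in required if payload.get(field) in (None, "")]
```

An empty list means the payload has every field this page lists as required; it does not guarantee that the values themselves are accepted.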

401 Unauthorized

Cause: Invalid or missing API key

Solution:

Verify your API key is correct and starts with pictai_
Check that the Authorization header is properly formatted: Authorization: YOUR_API_KEY
Ensure your API key hasn’t expired

409 Conflict - Duplicate Voice Name

Cause: A voice with the same name already exists in your library

Solution:

Choose a different unique name for the voice
Use the Get Voiceover Tracks endpoint to check existing voice names
Consider adding a prefix or suffix to differentiate similar voices
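The suffix approach above might look like the following sketch. It assumes you have already collected the existing voice names (for example from the Get Voiceover Tracks endpoint, whose response shape is not shown on this page) and that names are compared exactly, including case. The helper name unique_voice_name and the " (2)" suffix style are illustrative choices.

```python
def unique_voice_name(desired, existing_names):
    """Return desired, or desired with a numeric suffix if it is already taken."""
    taken = set(existing_names)
    if desired not in taken:
        return desired
    # Try "Name (2)", "Name (3)", ... until an unused name is found.
    suffix = 2
    while f"{desired} ({suffix})" in taken:
        suffix += 1
    return f"{desired} ({suffix})"
```

Because another client could add the same name between the check and the request, a 409 response should still be handled.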

Best Practices

Voice Configuration

Descriptive Names: Use clear, descriptive names that indicate the voice characteristics
Accurate Metadata: Provide accurate accent, age, and gender information for better voice discovery
Sample URLs: Include sample URLs whenever possible to help users preview voices
Language Codes: Use standard IETF language codes (e.g., en-US, not english)
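A loose format check can catch values like english before they reach the API. The sketch below matches only the language-region shape used in this page's examples (en-US, en-GB, fr-FR); it is not a full IETF language tag parser, and whether the API accepts other tag shapes or casings is not stated here. The name looks_like_language_code is hypothetical.

```python
import re

# Two or three lowercase letters, a hyphen, then a two-letter uppercase region.
_LANGUAGE_TAG = re.compile(r"[a-z]{2,3}-[A-Z]{2}")


def looks_like_language_code(value: str) -> bool:
    """Return True for values shaped like 'en-US'; a format check only."""
    return _LANGUAGE_TAG.fullmatch(value) is not None
```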

Service Selection

AWS Polly: Best for cost-effective, high-quality neural voices
Google Cloud: Best for premium WaveNet quality and broad language support
ElevenLabs: Best for ultra-realistic, expressive voices with fine-tuned control

Voice Management

Test voices in the external platform before adding them to Pictory
Keep voice names consistent with your naming convention
Document custom voice configurations for team reference
Regularly audit custom voices to remove unused entries
