Add a new voiceover track

The Add New Voiceover API adds Text-to-Speech (TTS) voices from external providers (Google, AWS, and ElevenLabs) to Pictory’s voiceover library. By default, Pictory offers a pre-configured list of voices from these providers. If a voice is available on one of these providers but missing from Pictory’s default library, you can use this API to add it as a custom voice by supplying the relevant details.

API Endpoint

URL:
https://api.pictory.ai/pictoryapis/v1/voiceovers/tracks

Method:
POST

Content-Type:
application/json

Accept:
application/json

Request Structure

cURL Command:

curl --request POST \
     --url https://api.pictory.ai/pictoryapis/v1/voiceovers/tracks \
     --header 'Authorization: <access_token>' \
     --header 'X-Pictory-User-Id: <Your-Pictory-User-ID>' \
     --header 'accept: application/json' \
     --header 'content-type: application/json' \
     --data '
{
  "engine": "WaveNet",
  "service": "google",
  "name": "Arnold",
  "description": "A deep male voice",
  "language": "en-US",
  "voiceId": "en-US-Journey-D",
  "accent": "American",
  "age": "Adult",
  "gender": "Male",
  "sample": "https://example.com/sample.mp3"
}'

Request Body Parameters:

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| name | string | Yes | Unique name of the voiceover. |
| description | string | No | Description of the voiceover. |
| engine | string | Yes | The TTS engine to use. Allowed values: neural, standard, WaveNet, Neural2, eleven_monolingual_v1, eleven_multilingual_v2. For the aws service, the valid values are neural and standard. For the google service, the valid values are standard, WaveNet, and Neural2. For the elevenlabs service, the valid values are eleven_monolingual_v1 and eleven_multilingual_v2. |
| service | string | Yes | The TTS provider service. Allowed values: aws, google, elevenlabs. |
| language | string | Yes | The language code of the voice. Example: en-US. |
| publicUserId | string | No | Public user ID that publicly identifies an ElevenLabs user. It is required to share a voice from the user's ElevenLabs account with Pictory's account. Valid for the elevenlabs service only. |
| voiceId | string | Yes | Unique identifier for the voice from the external TTS provider. For google, the voiceId can be the voice name, e.g., en-US-Journey-D. For aws, the voiceId can be the voice name, e.g., Danielle. For elevenlabs, the voiceId can be the unique ID of the voice in the voice library. |
| accent | string | Yes | The accent of the voice, e.g., American. |
| age | string | Yes | The age group of the voice. Typical values include Child, Teen, Adult, Senior. |
| gender | string | Yes | The gender of the voice. Typical values include Male, Female, Neutral. |
| sample | URL | No | A URL to an audio sample for this voice. |
| elevenlabsVoiceSettings | Object (ElevenLabs Voice Settings) | No | Settings specific to the ElevenLabs service. Valid only if service is elevenlabs. |
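
The service and engine constraints in the parameter table can be checked on the client before sending a request. The following is a minimal sketch, not part of the Pictory API: the helper name `validate_voiceover_payload` and its error messages are hypothetical, and the server may apply different or additional checks.

```python
# Allowed engines per service, taken from the parameter table.
ENGINES_BY_SERVICE = {
    "aws": {"neural", "standard"},
    "google": {"standard", "WaveNet", "Neural2"},
    "elevenlabs": {"eleven_monolingual_v1", "eleven_multilingual_v2"},
}

# Fields the table marks as required.
REQUIRED_FIELDS = (
    "name", "engine", "service", "language",
    "voiceId", "accent", "age", "gender",
)

# Fields the table describes as valid for the elevenlabs service only.
ELEVENLABS_ONLY_FIELDS = ("publicUserId", "elevenlabsVoiceSettings")


def validate_voiceover_payload(payload: dict) -> list:
    """Return a list of problems found in the payload (empty if none).

    Hypothetical client-side helper; it only mirrors the documented rules.
    """
    problems = []
    for field in REQUIRED_FIELDS:
        if not payload.get(field):
            problems.append(f"missing required field: {field}")

    service = payload.get("service")
    engine = payload.get("engine")
    if service is not None and service not in ENGINES_BY_SERVICE:
        problems.append(f"unknown service: {service}")
    elif service in ENGINES_BY_SERVICE and engine and engine not in ENGINES_BY_SERVICE[service]:
        problems.append(f"engine {engine} is not valid for service {service}")

    if service != "elevenlabs":
        for field in ELEVENLABS_ONLY_FIELDS:
            if field in payload:
                problems.append(f"{field} is only valid for the elevenlabs service")
    return problems
```

Calling the helper with a google payload whose engine is neural would report the mismatch, because neural is listed only for aws.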

ElevenLabs Voice Settings

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| similarityBoost | number | Yes | The similarity boost setting for the voice. |
| stability | number | Yes | The stability setting for the voice. |
| useSpeakerBoost | boolean | No | Whether to use speaker boost for the voice. |
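
As an illustration of how the voice settings object fits into a request body, here is a hedged sketch of an elevenlabs payload. The helper `elevenlabs_voice_settings` is hypothetical, and the voice name, voice ID placeholder, and numeric values are illustrative assumptions; this page does not state the accepted ranges for similarityBoost or stability.

```python
import json


def elevenlabs_voice_settings(similarity_boost, stability, use_speaker_boost=None):
    """Build the elevenlabsVoiceSettings object from the documented fields."""
    settings = {"similarityBoost": similarity_boost, "stability": stability}
    # useSpeakerBoost is optional, so include it only when explicitly given.
    if use_speaker_boost is not None:
        settings["useSpeakerBoost"] = bool(use_speaker_boost)
    return settings


# Placeholder values: replace the voice ID and tune the settings for your voice.
payload = {
    "service": "elevenlabs",
    "engine": "eleven_multilingual_v2",
    "name": "Custom Narrator",
    "language": "en-US",
    "voiceId": "<elevenlabs-voice-id>",
    "accent": "American",
    "age": "Adult",
    "gender": "Female",
    "elevenlabsVoiceSettings": elevenlabs_voice_settings(0.75, 0.5, use_speaker_boost=True),
}

print(json.dumps(payload, indent=2))
```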

Example Request Body:

{
    "engine": "WaveNet",
    "service": "google",
    "name": "Arnold",
    "description": "A deep male voice",
    "language": "en-US",
    "voiceId": "en-US-Journey-D",
    "accent": "American",
    "age": "Adult",
    "gender": "Male",
    "sample": "https://example.com/sample.mp3"
}

Headers:

| Header | Value | Description |
| --- | --- | --- |
| Authorization | <access_token> | Token for API access. |
| X-Pictory-User-Id | <Your-Pictory-User-ID> | Unique identifier for the user provided by Pictory. |
| content-type | application/json | Specifies the request payload format. |
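
The headers and body can be combined into a single request using only the Python standard library. This is a sketch under stated assumptions: the function name `build_add_voiceover_request` is hypothetical, and the token is passed in the Authorization header exactly as given, since this page does not say whether a scheme prefix is expected. The request is prepared but not sent.

```python
import json
import urllib.request

API_URL = "https://api.pictory.ai/pictoryapis/v1/voiceovers/tracks"


def build_add_voiceover_request(payload, access_token, user_id):
    """Prepare (but do not send) the POST request described in this section."""
    headers = {
        "Authorization": access_token,  # assumption: token used as-is
        "X-Pictory-User-Id": user_id,
        "Content-Type": "application/json",
        "Accept": "application/json",
    }
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(API_URL, data=body, headers=headers, method="POST")


request = build_add_voiceover_request(
    {
        "engine": "WaveNet",
        "service": "google",
        "name": "Arnold",
        "description": "A deep male voice",
        "language": "en-US",
        "voiceId": "en-US-Journey-D",
        "accent": "American",
        "age": "Adult",
        "gender": "Male",
        "sample": "https://example.com/sample.mp3",
    },
    access_token="<access_token>",
    user_id="<Your-Pictory-User-ID>",
)

# To send the request with real credentials:
# with urllib.request.urlopen(request) as response:
#     result = json.load(response)
```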

Example Response

Successful Response:

{
  "id": 12345,
  "name": "Arnold",
  "accent": "American",
  "gender": "Male",
  "language": "en-US",
  "sample": "https://example.com/sample.mp3",
  "service": "google",
  "engine": "WaveNet",
  "category": "standard",
  "ssmlSupportCategory": "C",
  "ssmlHelp": "https://docs.pictory.ai/docs/supported-ssml-tags#category-c"
}

Response Fields:

| Field | Type | Description |
| --- | --- | --- |
| id | Integer | The unique identifier for the voiceover track in the Pictory voiceover library. |
| name | String | The name of the voiceover track. |
| accent | String | The accent of the voice (e.g., "American"). |
| gender | String | The gender of the voice (e.g., "Male"). |
| language | String | The language of the voice (e.g., "en-US"). |
| sample | URL | The sample URL of the voice track. |
| service | String | The service where the voice is sourced from. Possible values: aws, google, elevenlabs. |
| engine | String | The voice synthesis engine used. Possible values: neural, standard, WaveNet, Neural2, eleven_monolingual_v1, eleven_multilingual_v2. |
| category | String | The category of the voice. For elevenlabs, it is premium. For google and aws, it is standard. |
| ssmlSupportCategory | String | The SSML support category. Possible values: A, B, C, D. |
| ssmlHelp | String | A URL pointing to the Pictory SSML support documentation for the corresponding support category. |
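
A client might map the response onto a typed structure. The sketch below is one possible approach, not an official client: `VoiceoverTrack` and `parse_voiceover_track` are hypothetical names, the attribute names mirror the JSON keys, and treating `sample` as possibly absent is an assumption based on it being optional in the request.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class VoiceoverTrack:
    id: int
    name: str
    accent: str
    gender: str
    language: str
    service: str
    engine: str
    category: str
    ssmlSupportCategory: str
    ssmlHelp: str
    sample: Optional[str] = None  # assumption: may be missing if no sample was supplied


def parse_voiceover_track(raw: str) -> VoiceoverTrack:
    """Parse a response body, ignoring any fields not documented on this page."""
    data = json.loads(raw)
    known = VoiceoverTrack.__dataclass_fields__
    return VoiceoverTrack(**{key: value for key, value in data.items() if key in known})


# The successful response shown in this section.
SAMPLE_RESPONSE = """
{
  "id": 12345,
  "name": "Arnold",
  "accent": "American",
  "gender": "Male",
  "language": "en-US",
  "sample": "https://example.com/sample.mp3",
  "service": "google",
  "engine": "WaveNet",
  "category": "standard",
  "ssmlSupportCategory": "C",
  "ssmlHelp": "https://docs.pictory.ai/docs/supported-ssml-tags#category-c"
}
"""

track = parse_voiceover_track(SAMPLE_RESPONSE)
print(track.id, track.category, track.ssmlSupportCategory)
```

The `id` value is the one to keep, since it identifies the new track in the Pictory voiceover library.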
