Text to Video with Video-Level and Scene-Level Voice Over

This example demonstrates how to create a video that combines a video-level default voice-over with scene-level overrides. A scene that defines its own voiceOver uses those settings; every other scene falls back to the video-level default.

Overview

This example covers:

  • Getting an access token
  • Setting up a default voice-over at the video level
  • Overriding voice-over settings at the scene level
  • Customizing voice speed and amplification
  • Monitoring job status and retrieving the final video

Node.js Example

Prerequisites

npm install axios

Complete Code

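// Save this file with an .mjs extension (or set "type": "module" in package.json)
// so that Node.js accepts the ES module import syntax used below.
import axios from "axios";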

const API_BASE_URL = "https://api.pictory.ai/pictoryapis";
const CLIENT_ID = "YOUR_CLIENT_ID";
const CLIENT_SECRET = "YOUR_CLIENT_SECRET";

// Sample text for different scenes
const INTRO_TEXT = "AI is poised to significantly impact educators and course creators on social media.";
const MAIN_TEXT =
  "By automating tasks like content generation, visual design, and video editing, AI will save time and enhance consistency.";

async function createVideoWithMultiLevelVoiceOver() {
  try {
    // Step 1: Get Access Token
    console.log("Step 1: Getting access token...");
    const tokenResponse = await axios.post(
      `${API_BASE_URL}/v1/oauth2/token`,
      {
        client_id: CLIENT_ID,
        client_secret: CLIENT_SECRET,
      },
      {
        headers: {
          "Content-Type": "application/json",
        },
      }
    );

    const accessToken = tokenResponse.data.access_token;
    console.log("Access token obtained successfully");
    console.log("Token expires in:", tokenResponse.data.expires_in, "seconds\n");

    // Step 2: Create Video Storyboard with Multi-Level Voice Over
    console.log("Step 2: Creating video storyboard with multi-level voice-over...");
    const storyboardResponse = await axios.post(
      `${API_BASE_URL}/v2/video/storyboard/render`,
      {
        videoName: "multilevel_voiceover_video",
        // Video-level voice-over (default for all scenes)
        voiceOver: {
          enabled: true,
          aiVoices: [
            {
              speaker: "Brian",
              speed: 100,
              amplificationLevel: 0,
            },
          ],
        },
        scenes: [
          {
            story: INTRO_TEXT,
            createSceneOnNewLine: false,
            createSceneOnEndOfSentence: false,
            // Scene-level voice-over override with different settings
            voiceOver: {
              enabled: true,
              aiVoices: [
                {
                  speaker: "Brian",
                  speed: 90, // Slower speed for emphasis
                  amplificationLevel: 0.2, // Slightly louder
                },
              ],
            },
          },
          {
            story: MAIN_TEXT,
            createSceneOnNewLine: false,
            createSceneOnEndOfSentence: false,
            // This scene will use the video-level voice-over settings
          },
        ],
      },
      {
        headers: {
          "Content-Type": "application/json",
          Authorization: accessToken,
        },
      }
    );

    const renderJobId = storyboardResponse.data.data.jobId;
    console.log("Storyboard render job created with multi-level voice-over");
    console.log("Job ID:", renderJobId, "\n");

    // Step 3: Monitor Job Status
    console.log("Step 3: Monitoring job status...");
    let jobCompleted = false;
    let jobResult = null;

    while (!jobCompleted) {
      const jobStatusResponse = await axios.get(`${API_BASE_URL}/v1/jobs/${renderJobId}`, {
        headers: {
          Authorization: accessToken,
        },
      });

      const status = jobStatusResponse.data.data.status;
      console.log("Current status:", status);

      if (status === "completed") {
        jobCompleted = true;
        jobResult = jobStatusResponse.data;
        console.log("\nVideo with multi-level voice-over created successfully!");
        console.log("Video URL:", jobResult.data.videoURL);
      } else if (status === "failed") {
        throw new Error("Job failed: " + JSON.stringify(jobStatusResponse.data));
      } else {
        // Wait 5 seconds before checking again
        await new Promise(resolve => setTimeout(resolve, 5000));
      }
    }

    return jobResult;
  } catch (error) {
    console.error("Error:", error.response?.data || error.message);
    throw error;
  }
}

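// Run the function. The error is already logged inside the function, so only
// set a non-zero exit code here instead of leaving the rejection unhandled.
createVideoWithMultiLevelVoiceOver().catch(() => {
  process.exitCode = 1;
});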

Python Example

Prerequisites

pip install requests

Complete Code

import requests
import time
import json

API_BASE_URL = 'https://api.pictory.ai/pictoryapis'
CLIENT_ID = 'YOUR_CLIENT_ID'
CLIENT_SECRET = 'YOUR_CLIENT_SECRET'

# Sample text for different scenes
INTRO_TEXT = "AI is poised to significantly impact educators and course creators on social media."
MAIN_TEXT = "By automating tasks like content generation, visual design, and video editing, AI will save time and enhance consistency."

def create_video_with_multilevel_voiceover():
    try:
        # Step 1: Get Access Token
        print('Step 1: Getting access token...')
        token_response = requests.post(
            f'{API_BASE_URL}/v1/oauth2/token',
            json={
                'client_id': CLIENT_ID,
                'client_secret': CLIENT_SECRET
            },
            headers={
                'Content-Type': 'application/json'
            }
        )
        token_response.raise_for_status()

        access_token = token_response.json()['access_token']
        print('Access token obtained successfully')
        print(f"Token expires in: {token_response.json()['expires_in']} seconds\n")

        # Step 2: Create Video Storyboard with Multi-Level Voice Over
        print('Step 2: Creating video storyboard with multi-level voice-over...')
        storyboard_response = requests.post(
            f'{API_BASE_URL}/v2/video/storyboard/render',
            json={
                'videoName': 'multilevel_voiceover_video',
                # Video-level voice-over (default for all scenes)
                'voiceOver': {
                    'enabled': True,
                    'aiVoices': [
                        {
                            'speaker': 'Brian',
                            'speed': 100,
                            'amplificationLevel': 0
                        }
                    ]
                },
                'scenes': [
                    {
                        'story': INTRO_TEXT,
                        'createSceneOnNewLine': False,
                        'createSceneOnEndOfSentence': False,
                        # Scene-level voice-over override with different settings
                        'voiceOver': {
                            'enabled': True,
                            'aiVoices': [
                                {
                                    'speaker': 'Brian',
                                    'speed': 90,  # Slower speed for emphasis
                                    'amplificationLevel': 0.2  # Slightly louder
                                }
                            ]
                        }
                    },
                    {
                        'story': MAIN_TEXT,
                        'createSceneOnNewLine': False,
                        'createSceneOnEndOfSentence': False
                        # This scene will use the video-level voice-over settings
                    }
                ]
            },
            headers={
                'Content-Type': 'application/json',
                'Authorization': access_token
            }
        )
        storyboard_response.raise_for_status()

        render_job_id = storyboard_response.json()['data']['jobId']
        print('Storyboard render job created with multi-level voice-over')
        print(f'Job ID: {render_job_id}\n')

        # Step 3: Monitor Job Status
        print('Step 3: Monitoring job status...')
        job_completed = False
        job_result = None

        while not job_completed:
            job_status_response = requests.get(
                f'{API_BASE_URL}/v1/jobs/{render_job_id}',
                headers={
                    'Authorization': access_token
                }
            )
            job_status_response.raise_for_status()

            status = job_status_response.json()['data']['status']
            print(f'Current status: {status}')

            if status == 'completed':
                job_completed = True
                job_result = job_status_response.json()
                print('\nVideo with multi-level voice-over created successfully!')
                print(f"Video URL: {job_result['data']['videoURL']}")
            elif status == 'failed':
                raise Exception(f"Job failed: {json.dumps(job_status_response.json())}")
            else:
                # Wait 5 seconds before checking again
                time.sleep(5)

        return job_result

    except requests.exceptions.RequestException as error:
        print(f'Error: {error}')
        if hasattr(error, 'response') and error.response is not None:
            print(f'Response: {error.response.text}')
        raise

# Run the function
if __name__ == '__main__':
    create_video_with_multilevel_voiceover()

Key Parameters

Video-Level Voice Over

  • voiceOver: Default voice-over configuration for all scenes
    • enabled: Set to true to enable voice-over
    • aiVoices: Array of AI voice configurations
      • speaker: The AI voice name (e.g., "Brian")
      • speed: Voice speed (50-200, default: 100)
      • amplificationLevel: Volume level (-1 to 1, default: 0)
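
The ranges listed above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the API: validate_ai_voice is a hypothetical helper, and treating speaker as required is an assumption. The numeric limits and defaults are the ones documented in this section.

```python
def validate_ai_voice(voice: dict) -> list:
    """Return a list of problems found in one aiVoices entry (empty if none)."""
    problems = []

    # Assumption: a speaker name must always be present.
    if not voice.get("speaker"):
        problems.append("speaker is required")

    # Documented range: 50-200, default 100.
    speed = voice.get("speed", 100)
    if not 50 <= speed <= 200:
        problems.append(f"speed {speed} is outside 50-200")

    # Documented range: -1 to 1, default 0.
    level = voice.get("amplificationLevel", 0)
    if not -1 <= level <= 1:
        problems.append(f"amplificationLevel {level} is outside -1 to 1")

    return problems


if __name__ == "__main__":
    print(validate_ai_voice({"speaker": "Brian", "speed": 90, "amplificationLevel": 0.2}))
    print(validate_ai_voice({"speaker": "Brian", "speed": 250}))
```

Running the check over every aiVoices entry, at both video level and scene level, catches out-of-range values before a render job is created.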

Scene-Level Voice Over

  • scenes[].voiceOver: Override voice-over settings for a specific scene
    • Same structure as video-level voiceOver
    • Takes precedence over video-level settings for that scene
    • Allows customization per scene
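
The precedence rule can be illustrated with a small client-side sketch. It only mirrors the behavior described above and is not the API's implementation. effective_voice_over is a hypothetical helper, and it assumes a scene-level voiceOver replaces the video-level object as a whole; whether the API merges individual fields is not stated here.

```python
def effective_voice_over(payload: dict) -> list:
    """Return the voice-over config each scene would use, in scene order."""
    default = payload.get("voiceOver")
    # A scene with its own voiceOver uses it; otherwise the video-level default applies.
    return [scene.get("voiceOver", default) for scene in payload.get("scenes", [])]


if __name__ == "__main__":
    payload = {
        "voiceOver": {"enabled": True, "aiVoices": [{"speaker": "Brian", "speed": 100}]},
        "scenes": [
            {"story": "Intro", "voiceOver": {"enabled": True, "aiVoices": [{"speaker": "Brian", "speed": 90}]}},
            {"story": "Main"},
        ],
    }
    for scene, voice in zip(payload["scenes"], effective_voice_over(payload)):
        print(scene["story"], "->", voice["aiVoices"][0]["speed"])
```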

Use Cases

This multi-level voice-over approach is useful for:

  • Emphasizing important sections with different voice speeds
  • Adjusting volume for specific scenes
  • Creating variety in longer videos
  • Highlighting key information with voice variations
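
For videos with many scenes, the scene objects can be generated instead of written by hand. The sketch below is one possible approach; build_scene is a hypothetical helper, and the field names are the ones used in the request body of the examples above.

```python
def build_scene(story: str, voice: dict = None) -> dict:
    """Build one scene; attach a scene-level voiceOver only when a voice is given."""
    scene = {
        "story": story,
        "createSceneOnNewLine": False,
        "createSceneOnEndOfSentence": False,
    }
    if voice is not None:
        # Scene-level override; scenes without it use the video-level default.
        scene["voiceOver"] = {"enabled": True, "aiVoices": [voice]}
    return scene


if __name__ == "__main__":
    scenes = [
        # Slower and slightly louder for emphasis.
        build_scene("Key takeaway.", {"speaker": "Brian", "speed": 90, "amplificationLevel": 0.2}),
        # Falls back to the video-level voice-over.
        build_scene("Supporting detail."),
    ]
    print(scenes)
```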

Response

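The render request returns a job ID (data.jobId) that you use to monitor video creation. Poll the job endpoint until the status is "completed"; the response then contains the video URL (data.videoURL), with voice-over applied according to your video-level and scene-level configurations. A status of "failed" means the job did not produce a video.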
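
A polling loop that waits on the job status can run indefinitely if a job never finishes. The following sketch adds an overall timeout. wait_for_job and its parameters are hypothetical names, and the 600-second default is an arbitrary assumption; the "completed" and "failed" status values and the data.status field are taken from the examples above.

```python
import time


def wait_for_job(fetch_status, interval=5.0, timeout=600.0,
                 sleep=time.sleep, clock=time.monotonic):
    """Call fetch_status() until the job completes, fails, or the timeout passes.

    fetch_status must return the parsed JSON body of the job status response.
    """
    deadline = clock() + timeout
    while True:
        job = fetch_status()
        status = job["data"]["status"]
        if status == "completed":
            return job
        if status == "failed":
            raise RuntimeError(f"Job failed: {job}")
        if clock() >= deadline:
            raise TimeoutError(f"Job still '{status}' after {timeout} seconds")
        sleep(interval)
```

Passing the status request in as a callable keeps the loop independent of the HTTP library. With requests, fetch_status could be a small function that performs the GET on the job endpoint, calls raise_for_status(), and returns response.json().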

Notes

  • Replace YOUR_CLIENT_ID and YOUR_CLIENT_SECRET with your actual API credentials
  • Scene-level voice-over settings override video-level settings for that specific scene
  • Scenes without scene-level voice-over will use the video-level settings
  • Speed values range from 50 (very slow) to 200 (very fast)
  • amplificationLevel values range from -1 (quieter) to 1 (louder)