Text to Video with AI Voice Over

This example demonstrates how to create a video from text with an AI-generated voice-over narration using the Pictory API.

Overview

This example covers:

  • Getting an access token
  • Creating a video with AI voice-over
  • Using the "Brian" AI voice speaker
  • Monitoring job status and retrieving the final video

Node.js Example

Prerequisites

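npm install axios

The code below uses ES module syntax (import), so save it as a .mjs file or set "type": "module" in package.json.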

Complete Code

import axios from "axios";

const API_BASE_URL = "https://api.pictory.ai/pictoryapis";
const CLIENT_ID = "YOUR_CLIENT_ID";
const CLIENT_SECRET = "YOUR_CLIENT_SECRET";

// Sample text for the video
const SAMPLE_TEXT =
  "AI is poised to significantly impact educators and course creators on social media. By automating tasks like content generation, visual design, and video editing, AI will save time and enhance consistency. This allows creators to focus on higher-level strategies and ensures a cohesive brand presence. Personalization is another key benefit. AI can analyze social media interactions to tailor content to individual learner needs, making learning more engaging and effective. This level of personalization, previously difficult to achieve at scale, will become more accessible. AI also offers advanced analytics, providing insights into content performance and audience engagement. Creators can quickly refine their strategies based on real-time data, ensuring their content remains relevant and impactful. However, the rise of AI brings challenges. There's a risk that over-reliance on AI-generated content could diminish authenticity. Automated content might lack the personal touch that makes learning experiences unique. Educators will need to balance AI efficiency with maintaining the human element in their content. Ultimately, AI will be a powerful tool for those who embrace it. Educators and course creators who adapt will likely expand their reach, improve content quality, and engage more effectively with their audiences on social media.";

async function createTextToVideoWithVoiceOver() {
  try {
    // Step 1: Get Access Token
    console.log("Step 1: Getting access token...");
    const tokenResponse = await axios.post(
      `${API_BASE_URL}/v1/oauth2/token`,
      {
        client_id: CLIENT_ID,
        client_secret: CLIENT_SECRET,
      },
      {
        headers: {
          "Content-Type": "application/json",
        },
      }
    );

    const accessToken = tokenResponse.data.access_token;
    console.log("Access token obtained successfully");
    console.log("Token expires in:", tokenResponse.data.expires_in, "seconds\n");

    // Step 2: Create Video Storyboard with AI Voice Over
    console.log("Step 2: Creating video storyboard with AI voice-over...");
    const storyboardResponse = await axios.post(
      `${API_BASE_URL}/v2/video/storyboard/render`,
      {
        videoName: "text_to_video_with_ai_voice",
        voiceOver: {
          enabled: true,
          aiVoices: [
            {
              speaker: "Brian",
            },
          ],
        },
        scenes: [
          {
            story: SAMPLE_TEXT,
            createSceneOnNewLine: true,
            createSceneOnEndOfSentence: true,
          },
        ],
      },
      {
        headers: {
          "Content-Type": "application/json",
          Authorization: accessToken,
        },
      }
    );

    const renderJobId = storyboardResponse.data.data.jobId;
    console.log("Storyboard render job created with AI voice-over");
    console.log("Job ID:", renderJobId, "\n");

    // Step 3: Monitor Job Status
    console.log("Step 3: Monitoring job status...");
    let jobCompleted = false;
    let jobResult = null;

    while (!jobCompleted) {
      const jobStatusResponse = await axios.get(`${API_BASE_URL}/v1/jobs/${renderJobId}`, {
        headers: {
          Authorization: accessToken,
        },
      });

      const status = jobStatusResponse.data.data.status;
      console.log("Current status:", status);

      if (status === "completed") {
        jobCompleted = true;
        jobResult = jobStatusResponse.data;
        console.log("\nVideo with AI voice-over created successfully!");
        console.log("Video URL:", jobResult.data.videoUrl);
      } else if (status === "failed") {
        throw new Error("Job failed: " + JSON.stringify(jobStatusResponse.data));
      } else {
        // Wait 5 seconds before checking again
        await new Promise(resolve => setTimeout(resolve, 5000));
      }
    }

    return jobResult;
  } catch (error) {
    console.error("Error:", error.response?.data || error.message);
    throw error;
  }
}

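// Run the function. The error is already logged inside the function,
// so here we only set a failing exit code and avoid an unhandled rejection.
createTextToVideoWithVoiceOver().catch(() => {
  process.exitCode = 1;
});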

Python Example

Prerequisites

pip install requests

Complete Code

import requests
import time
import json

API_BASE_URL = 'https://api.pictory.ai/pictoryapis'
CLIENT_ID = 'YOUR_CLIENT_ID'
CLIENT_SECRET = 'YOUR_CLIENT_SECRET'

# Sample text for the video
SAMPLE_TEXT = "AI is poised to significantly impact educators and course creators on social media. By automating tasks like content generation, visual design, and video editing, AI will save time and enhance consistency. This allows creators to focus on higher-level strategies and ensures a cohesive brand presence. Personalization is another key benefit. AI can analyze social media interactions to tailor content to individual learner needs, making learning more engaging and effective. This level of personalization, previously difficult to achieve at scale, will become more accessible. AI also offers advanced analytics, providing insights into content performance and audience engagement. Creators can quickly refine their strategies based on real-time data, ensuring their content remains relevant and impactful. However, the rise of AI brings challenges. There's a risk that over-reliance on AI-generated content could diminish authenticity. Automated content might lack the personal touch that makes learning experiences unique. Educators will need to balance AI efficiency with maintaining the human element in their content. Ultimately, AI will be a powerful tool for those who embrace it. Educators and course creators who adapt will likely expand their reach, improve content quality, and engage more effectively with their audiences on social media."

def create_text_to_video_with_voiceover():
    try:
        # Step 1: Get Access Token
        print('Step 1: Getting access token...')
        token_response = requests.post(
            f'{API_BASE_URL}/v1/oauth2/token',
            json={
                'client_id': CLIENT_ID,
                'client_secret': CLIENT_SECRET
            },
            headers={
                'Content-Type': 'application/json'
            },
            timeout=30  # requests has no default timeout; avoid hanging forever
        )
        token_response.raise_for_status()

        access_token = token_response.json()['access_token']
        print('Access token obtained successfully')
        print(f"Token expires in: {token_response.json()['expires_in']} seconds\n")

        # Step 2: Create Video Storyboard with AI Voice Over
        print('Step 2: Creating video storyboard with AI voice-over...')
        storyboard_response = requests.post(
            f'{API_BASE_URL}/v2/video/storyboard/render',
            json={
                'videoName': 'text_to_video_with_ai_voice',
                'voiceOver': {
                    'enabled': True,
                    'aiVoices': [
                        {
                            'speaker': 'Brian'
                        }
                    ]
                },
                'scenes': [
                    {
                        'story': SAMPLE_TEXT,
                        'createSceneOnNewLine': True,
                        'createSceneOnEndOfSentence': True
                    }
                ]
            },
            headers={
                'Content-Type': 'application/json',
                'Authorization': access_token
            },
            timeout=30  # requests has no default timeout; avoid hanging forever
        )
        storyboard_response.raise_for_status()

        render_job_id = storyboard_response.json()['data']['jobId']
        print('Storyboard render job created with AI voice-over')
        print(f'Job ID: {render_job_id}\n')

        # Step 3: Monitor Job Status
        print('Step 3: Monitoring job status...')
        job_completed = False
        job_result = None

        while not job_completed:
            job_status_response = requests.get(
                f'{API_BASE_URL}/v1/jobs/{render_job_id}',
                headers={
                    'Authorization': access_token
                },
                timeout=30  # requests has no default timeout; avoid hanging forever
            )
            job_status_response.raise_for_status()

            status = job_status_response.json()['data']['status']
            print(f'Current status: {status}')

            if status == 'completed':
                job_completed = True
                job_result = job_status_response.json()
                print('\nVideo with AI voice-over created successfully!')
                print(f"Video URL: {job_result['data']['videoUrl']}")
            elif status == 'failed':
                raise Exception(f"Job failed: {json.dumps(job_status_response.json())}")
            else:
                # Wait 5 seconds before checking again
                time.sleep(5)

        return job_result

    except requests.exceptions.RequestException as error:
        print(f'Error: {error}')
        if hasattr(error, 'response') and error.response is not None:
            print(f'Response: {error.response.text}')
        raise

# Run the function
if __name__ == '__main__':
    create_text_to_video_with_voiceover()

Key Parameters

  • voiceOver: Configuration for AI voice-over narration
    • enabled: Set to true to enable voice-over
    • aiVoices: Array of AI voice configurations
      • speaker: The name of the AI voice (e.g., "Brian")
      • speed (optional): Voice speed (50-200, default: 100)
      • amplificationLevel (optional): Volume level (-1 to 1, default: 0)
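
The parameters above can be assembled into a voiceOver object as in the following minimal sketch. The helper name build_voice_over and the client-side range checks are illustrative assumptions, not part of the Pictory API; the field names and ranges are the ones listed above.

```python
def build_voice_over(speaker, speed=100, amplification_level=0.0):
    """Return a voiceOver object for the storyboard request body.

    Hypothetical helper: it only validates the documented ranges locally
    and builds the dictionary that is sent as the 'voiceOver' field.
    """
    if not 50 <= speed <= 200:
        raise ValueError("speed must be between 50 and 200")
    if not -1 <= amplification_level <= 1:
        raise ValueError("amplificationLevel must be between -1 and 1")
    return {
        "enabled": True,
        "aiVoices": [
            {
                "speaker": speaker,
                "speed": speed,
                "amplificationLevel": amplification_level,
            }
        ],
    }


# Slightly faster and louder than the defaults
voice_over = build_voice_over("Brian", speed=110, amplification_level=0.2)
```

The resulting dictionary can be passed as the 'voiceOver' value in the request body shown in the Python example.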

Available AI Voices

The Pictory API supports multiple AI voice speakers. This example uses "Brian", a natural-sounding, professional male voice. Other available voices are listed in the API documentation.

Response

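The storyboard render request returns a job ID (data.jobId in the response body) that you use to monitor progress. Poll GET /v1/jobs/{jobId} until data.status is "completed"; the response then includes data.videoUrl, the URL of the finished video with the AI voice-over narration synchronized to the visual content. A status of "failed" means the job did not produce a video.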
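
The polling loops in the examples run until the job completes or fails, so a job that never finishes would keep them running indefinitely. The following is a minimal sketch of the same loop with an overall time limit. The function name wait_for_job, the fetch_status callback, and the 600-second default are assumptions made for illustration; only the "completed" and "failed" status values and the data.status and data.videoUrl fields come from the examples.

```python
import time


def wait_for_job(fetch_status, timeout_seconds=600, interval_seconds=5,
                 sleep=time.sleep, clock=time.monotonic):
    """Poll until the job completes, fails, or the time limit passes.

    fetch_status is a caller-supplied function (hypothetical) that returns
    the parsed JSON body of GET /v1/jobs/{jobId}.
    Returns the video URL on success.
    """
    deadline = clock() + timeout_seconds
    while True:
        body = fetch_status()
        status = body["data"]["status"]
        if status == "completed":
            return body["data"]["videoUrl"]
        if status == "failed":
            raise RuntimeError(f"Job failed: {body}")
        if clock() >= deadline:
            raise TimeoutError(
                f"Job still '{status}' after {timeout_seconds} seconds"
            )
        sleep(interval_seconds)


# Possible usage with the variables from the Python example:
# video_url = wait_for_job(
#     lambda: requests.get(
#         f"{API_BASE_URL}/v1/jobs/{render_job_id}",
#         headers={"Authorization": access_token},
#         timeout=30,
#     ).json()
# )
```

Passing the status request in as a callback keeps the loop independent of the HTTP client, which also makes it easy to exercise with canned responses.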

Notes

  • Replace YOUR_CLIENT_ID and YOUR_CLIENT_SECRET with your actual API credentials
  • The AI voice-over is automatically synchronized with the video scenes
  • Set speed and amplificationLevel on an aiVoices entry to adjust how fast and how loud the narration is
  • Multiple AI voices can be used in a single video by specifying them in the aiVoices array
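
For the first note, one common way to supply real credentials without writing them into the source file is to read them from environment variables. The sketch below assumes the variable names PICTORY_CLIENT_ID and PICTORY_CLIENT_SECRET, which are hypothetical and not defined by the Pictory API; any names can be used.

```python
import os


def load_credentials(env=os.environ):
    """Read the client ID and secret from environment variables.

    The variable names are illustrative assumptions, not an API requirement.
    """
    try:
        return env["PICTORY_CLIENT_ID"], env["PICTORY_CLIENT_SECRET"]
    except KeyError as missing:
        raise RuntimeError(
            f"Missing environment variable: {missing.args[0]}"
        ) from None


# Possible usage in place of the hard-coded constants:
# CLIENT_ID, CLIENT_SECRET = load_credentials()
```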