Text to Video with AI-Generated Image Background

This example shows how to create a video that uses AI-generated images as scene backgrounds instead of stock visuals. Each image is generated from a text prompt, which you can supply yourself or let the system derive from the scene's story text.

Overview

This example covers:

  • Getting an access token
  • Creating a video with AI-generated visual backgrounds
  • Configuring AI visual generation parameters
  • Understanding different AI models and media styles
  • Monitoring job status and retrieving the final video

Node.js Example

Prerequisites

npm install axios

Complete Code

// Uses ES module syntax: save the file as .mjs or set "type": "module" in package.json
import axios from "axios";

const API_BASE_URL = "https://api.pictory.ai/pictoryapis";
const CLIENT_ID = "YOUR_CLIENT_ID";
const CLIENT_SECRET = "YOUR_CLIENT_SECRET";

const STORY_TEXT = "AI is poised to significantly impact educators and course creators on social media.";

async function createTextToVideoWithAIVisual() {
  try {
    // Step 1: Get Access Token
    console.log("Step 1: Getting access token...");
    const tokenResponse = await axios.post(
      `${API_BASE_URL}/v1/oauth2/token`,
      {
        client_id: CLIENT_ID,
        client_secret: CLIENT_SECRET,
      },
      {
        headers: {
          "Content-Type": "application/json",
        },
      }
    );

    const accessToken = tokenResponse.data.access_token;
    console.log("Access token obtained successfully");
    console.log("Token expires in:", tokenResponse.data.expires_in, "seconds\n");

    // Step 2: Create Video Storyboard with AI-Generated Visual
    console.log("Step 2: Creating video storyboard with AI-generated visual...");
    const storyboardResponse = await axios.post(
      `${API_BASE_URL}/v2/video/storyboard/render`,
      {
        videoName: "text_to_video_with_ai_visual",
        scenes: [
          {
            story: STORY_TEXT,
            createSceneOnNewLine: false,
            createSceneOnEndOfSentence: false,
            background: {
              type: "image",
              aiVisual: {
                prompt: "Futuristic classroom with holographic displays and AI assistants helping students learn",
                model: "flux-schnell",
                mediaStyle: "futuristic",
              },
            },
          },
        ],
      },
      {
        headers: {
          "Content-Type": "application/json",
          Authorization: accessToken,
        },
      }
    );

    const renderJobId = storyboardResponse.data.data.jobId;
    console.log("Storyboard render job created with AI-generated visual");
    console.log("Job ID:", renderJobId, "\n");

    // Step 3: Monitor Job Status
    console.log("Step 3: Monitoring job status...");
    let jobCompleted = false;
    let jobResult = null;

    while (!jobCompleted) {
      const jobStatusResponse = await axios.get(`${API_BASE_URL}/v1/jobs/${renderJobId}`, {
        headers: {
          Authorization: accessToken,
        },
      });

      const status = jobStatusResponse.data.data.status;
      console.log("Current status:", status);

      if (status === "completed") {
        jobCompleted = true;
        jobResult = jobStatusResponse.data;
        console.log("\nVideo with AI-generated visual created successfully!");
        console.log("Video URL:", jobResult.data.videoUrl);
      } else if (status === "failed") {
        throw new Error("Job failed: " + JSON.stringify(jobStatusResponse.data));
      } else {
        // Wait 5 seconds before checking again
        await new Promise(resolve => setTimeout(resolve, 5000));
      }
    }

    return jobResult;
  } catch (error) {
    console.error("Error:", error.response?.data || error.message);
    throw error;
  }
}

// Run the function
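createTextToVideoWithAIVisual().catch(() => {
  // The error was already logged inside the function; set a non-zero exit code
  // instead of leaving an unhandled promise rejection
  process.exitCode = 1;
});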

Python Example

Prerequisites

pip install requests

Complete Code

import requests
import time
import json

API_BASE_URL = 'https://api.pictory.ai/pictoryapis'
CLIENT_ID = 'YOUR_CLIENT_ID'
CLIENT_SECRET = 'YOUR_CLIENT_SECRET'

STORY_TEXT = "AI is poised to significantly impact educators and course creators on social media."

def create_text_to_video_with_ai_visual():
    try:
        # Step 1: Get Access Token
        print('Step 1: Getting access token...')
        token_response = requests.post(
            f'{API_BASE_URL}/v1/oauth2/token',
            json={
                'client_id': CLIENT_ID,
                'client_secret': CLIENT_SECRET
            },
            headers={
                'Content-Type': 'application/json'
            }
        )
        token_response.raise_for_status()

        access_token = token_response.json()['access_token']
        print('Access token obtained successfully')
        print(f"Token expires in: {token_response.json()['expires_in']} seconds\n")

        # Step 2: Create Video Storyboard with AI-Generated Visual
        print('Step 2: Creating video storyboard with AI-generated visual...')
        storyboard_response = requests.post(
            f'{API_BASE_URL}/v2/video/storyboard/render',
            json={
                'videoName': 'text_to_video_with_ai_visual',
                'scenes': [
                    {
                        'story': STORY_TEXT,
                        'createSceneOnNewLine': False,
                        'createSceneOnEndOfSentence': False,
                        'background': {
                            'type': 'image',
                            'aiVisual': {
                                'prompt': 'Futuristic classroom with holographic displays and AI assistants helping students learn',
                                'model': 'flux-schnell',
                                'mediaStyle': 'futuristic'
                            }
                        }
                    }
                ]
            },
            headers={
                'Content-Type': 'application/json',
                'Authorization': access_token
            }
        )
        storyboard_response.raise_for_status()

        render_job_id = storyboard_response.json()['data']['jobId']
        print('Storyboard render job created with AI-generated visual')
        print(f'Job ID: {render_job_id}\n')

        # Step 3: Monitor Job Status
        print('Step 3: Monitoring job status...')
        job_completed = False
        job_result = None

        while not job_completed:
            job_status_response = requests.get(
                f'{API_BASE_URL}/v1/jobs/{render_job_id}',
                headers={
                    'Authorization': access_token
                }
            )
            job_status_response.raise_for_status()

            status = job_status_response.json()['data']['status']
            print(f'Current status: {status}')

            if status == 'completed':
                job_completed = True
                job_result = job_status_response.json()
                print('\nVideo with AI-generated visual created successfully!')
                print(f"Video URL: {job_result['data']['videoUrl']}")
            elif status == 'failed':
                raise Exception(f"Job failed: {json.dumps(job_status_response.json())}")
            else:
                # Wait 5 seconds before checking again
                time.sleep(5)

        return job_result

    except requests.exceptions.RequestException as error:
        print(f'Error: {error}')
        if hasattr(error, 'response') and error.response is not None:
            print(f'Response: {error.response.text}')
        raise

# Run the function
if __name__ == '__main__':
    create_text_to_video_with_ai_visual()
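
Both examples request a new access token on every run. Because the token response includes expires_in, a longer-running script could reuse the token until shortly before it expires. The following is a minimal sketch of that idea, not part of the API: the TokenCache class, the 60-second safety margin, and the simulated fake_fetch_token function are all hypothetical names introduced for illustration.

```python
import time


class TokenCache:
    """Cache an access token until shortly before it expires."""

    def __init__(self, fetch_token, safety_margin=60, clock=time.monotonic):
        self._fetch_token = fetch_token      # callable returning the token response as a dict
        self._safety_margin = safety_margin  # seconds; hypothetical default, tune as needed
        self._clock = clock
        self._token = None
        self._expires_at = 0.0

    def get(self):
        # Fetch a new token only when none is cached or the cached one is about to expire
        if self._token is None or self._clock() >= self._expires_at:
            data = self._fetch_token()
            self._token = data["access_token"]
            self._expires_at = self._clock() + data["expires_in"] - self._safety_margin
        return self._token


# Simulated token endpoint so the sketch runs without network access
calls = []


def fake_fetch_token():
    calls.append(1)
    return {"access_token": f"token-{len(calls)}", "expires_in": 3600}


cache = TokenCache(fake_fetch_token)
first = cache.get()
second = cache.get()  # served from the cache, no second request
print(first, second, len(calls))
```

In real use, fetch_token would wrap the POST to /v1/oauth2/token shown in Step 1 and return its parsed JSON body.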

Key Parameters

Background Object

  • type: Must be set to "image" for AI-generated visuals
  • aiVisual: Configuration for AI visual generation

AI Visual Configuration

  • prompt (optional): Text description of the visual to generate (max 250 characters)
    • If not provided, the system will automatically generate a prompt based on the scene's story text
    • When createSceneOnEndOfSentence is true, prompts are auto-generated for each individual scene
    • Auto-generated prompts are optimized to match the content and context of each scene
  • model (required): The AI model to use for generation
    • "seedream3.0" - Balanced quality and speed
    • "flux-schnell" - Fast generation
    • "nanobanana" - Optimized for specific styles
    • "titan" - High-quality generation
  • mediaStyle (optional): The visual style to apply
    • "photorealistic" - Realistic photographs
    • "artistic" - Artistic renderings
    • "cartoon" - Cartoon-style images
    • "minimalist" - Simple, clean designs
    • "vintage" - Retro aesthetic
    • "futuristic" - Modern, sci-fi look
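
The constraints above (required model, optional prompt of at most 250 characters, optional mediaStyle) can be checked locally before sending a request. This is a minimal sketch under the assumption that the values listed above are the complete set; build_ai_visual is a hypothetical helper, and the API itself may accept other values or validate differently.

```python
# Values copied from the lists above; assumed complete for this sketch
SUPPORTED_MODELS = {"seedream3.0", "flux-schnell", "nanobanana", "titan"}
SUPPORTED_STYLES = {
    "photorealistic", "artistic", "cartoon", "minimalist", "vintage", "futuristic",
}
MAX_PROMPT_LENGTH = 250


def build_ai_visual(model, prompt=None, media_style=None):
    """Return an aiVisual dict, raising ValueError for values not listed above."""
    if model not in SUPPORTED_MODELS:
        raise ValueError(f"Unsupported model: {model!r}")
    ai_visual = {"model": model}

    if prompt is not None:
        if len(prompt) > MAX_PROMPT_LENGTH:
            raise ValueError(
                f"Prompt is {len(prompt)} characters; the limit is {MAX_PROMPT_LENGTH}"
            )
        ai_visual["prompt"] = prompt

    if media_style is not None:
        if media_style not in SUPPORTED_STYLES:
            raise ValueError(f"Unsupported mediaStyle: {media_style!r}")
        ai_visual["mediaStyle"] = media_style

    return ai_visual


background = {
    "type": "image",  # required when aiVisual is used
    "aiVisual": build_ai_visual(
        "flux-schnell",
        prompt="Futuristic classroom with holographic displays",
        media_style="futuristic",
    ),
}
print(background)
```

Omitting prompt leaves the key out of the dict entirely, which matches the auto-generated prompt behavior described above.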

Best Practices

  1. Prompt Writing:

    • Custom Prompts: Be specific and descriptive in your prompts
      • Include details about lighting, composition, and mood
      • Keep prompts within the 250-character limit
    • Auto-Generated Prompts: Omit the prompt field to let the system generate prompts automatically
      • Best for content-driven videos where visuals should match the story
      • Ensures consistency when createSceneOnEndOfSentence is true
      • Saves time and leverages AI to interpret your content
  2. Model Selection:

    • Use flux-schnell for fast iteration and testing
    • Use titan for final production videos that require the highest quality
    • Use seedream3.0 for a balance of quality and speed
  3. Media Style:

    • Choose a style that matches your video's tone
    • Be consistent across scenes for visual coherence
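
One way to apply the model-selection and consistency advice above is to build every scene from a single model and mediaStyle. The sketch below assumes that the scenes array accepts several scene objects, each with its own story; the stage names ("draft", "balanced", "final") and the build_scenes helper are hypothetical and only mirror the guidance in this list.

```python
# Hypothetical workflow stages mapped to the models recommended above
MODEL_BY_STAGE = {
    "draft": "flux-schnell",    # fast iteration and testing
    "balanced": "seedream3.0",  # balanced results
    "final": "titan",           # highest quality
}


def build_scenes(stories, stage, media_style):
    """Build one scene per story, all sharing the same model and mediaStyle."""
    model = MODEL_BY_STAGE[stage]
    return [
        {
            "story": story,
            "createSceneOnNewLine": False,
            "createSceneOnEndOfSentence": False,
            "background": {
                "type": "image",
                "aiVisual": {"model": model, "mediaStyle": media_style},
            },
        }
        for story in stories
    ]


scenes = build_scenes(
    [
        "AI is reshaping how educators plan lessons.",
        "Course creators can reach learners on social media.",
    ],
    stage="draft",
    media_style="minimalist",
)
print(len(scenes), scenes[0]["background"]["aiVisual"])
```

Switching stage to "final" changes the model for every scene at once while the style stays the same.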

Auto-Generated Prompts Example

You can omit the prompt field to let the system automatically generate prompts based on your content:

background: {
  type: "image",
  aiVisual: {
    model: "flux-schnell",
    mediaStyle: "photorealistic"
    // No prompt specified - system will auto-generate based on scene story
  }
}

This is particularly useful when:

  • You are creating videos with multiple scenes (createSceneOnEndOfSentence: true)
  • You want visuals that closely match each sentence or scene
  • You prefer to let the system interpret the content when generating visuals
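
For the Python example, the same idea might look like the request body below. It is a sketch only: the videoName and story text are placeholders, and it combines createSceneOnEndOfSentence: true with an aiVisual that has no prompt, as described above.

```python
import json

# Request body for /v2/video/storyboard/render with auto-generated prompts
payload = {
    "videoName": "text_to_video_auto_prompts",  # placeholder name
    "scenes": [
        {
            # Two sentences, so splitting on end of sentence yields two scenes
            "story": (
                "AI is changing how courses are built. "
                "Educators can now produce lessons faster."
            ),
            "createSceneOnNewLine": False,
            "createSceneOnEndOfSentence": True,
            "background": {
                "type": "image",
                # No "prompt" key: the system generates one per scene
                "aiVisual": {
                    "model": "flux-schnell",
                    "mediaStyle": "photorealistic",
                },
            },
        }
    ],
}

print(json.dumps(payload, indent=2))
```

This dict can be passed as the json argument of the requests.post call in Step 2 of the Python example.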

Response

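The storyboard render request returns a job ID in data.jobId. Poll GET /v1/jobs/{jobId} with that ID until data.status is "completed"; the response then includes the URL of the final video, with its AI-generated visuals, in data.videoUrl. A status of "failed" means the job did not produce a video.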
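
The polling loops in the examples above run until the job reports "completed" or "failed". A script that should not wait indefinitely could add an overall timeout. The following is a minimal sketch of that approach: wait_for_job, its 600-second default timeout, and the "in-progress" status used in the simulation are assumptions for illustration, not documented API behavior.

```python
import time


def wait_for_job(fetch_job, poll_interval=5.0, timeout=600.0,
                 sleep=time.sleep, clock=time.monotonic):
    """Poll fetch_job() until the job completes, fails, or the timeout passes."""
    deadline = clock() + timeout
    while True:
        job = fetch_job()  # expected shape: {"data": {"status": ..., ...}}
        status = job["data"]["status"]
        if status == "completed":
            return job
        if status == "failed":
            raise RuntimeError(f"Job failed: {job}")
        if clock() >= deadline:
            raise TimeoutError(f"Job still '{status}' after {timeout} seconds")
        sleep(poll_interval)


# Simulated job responses so the sketch runs without network access
responses = iter([
    {"data": {"status": "in-progress"}},  # hypothetical intermediate status
    {"data": {"status": "completed", "videoUrl": "https://example.com/video.mp4"}},
])
result = wait_for_job(lambda: next(responses), sleep=lambda seconds: None)
print(result["data"]["videoUrl"])
```

With the requests library, fetch_job could be a lambda that calls requests.get on the /v1/jobs/{jobId} URL with the Authorization header and returns .json().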

Notes

  • Replace YOUR_CLIENT_ID and YOUR_CLIENT_SECRET with your actual API credentials
  • AI visual generation may take longer than using stock visuals
  • The background type must be "image" when using aiVisual
  • Only one of visualUrl, color, or aiVisual can be specified per background
  • The prompt field is optional - omit it to enable automatic prompt generation from scene content
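
The background rules in these notes can be checked before a request is sent. This is a minimal sketch, assuming the rules listed here are the only ones that apply; validate_background is a hypothetical helper and does not replace the API's own validation.

```python
def validate_background(background):
    """Return a list of problems; an empty list means the rules above are met."""
    problems = []

    # Only one of visualUrl, color, or aiVisual may be specified
    sources = [key for key in ("visualUrl", "color", "aiVisual") if key in background]
    if len(sources) > 1:
        problems.append(f"Specify only one of visualUrl, color, aiVisual (got {sources})")

    if "aiVisual" in background:
        # The background type must be "image" when using aiVisual
        if background.get("type") != "image":
            problems.append('type must be "image" when aiVisual is used')
        # model is the only required aiVisual field; prompt is optional
        if "model" not in background["aiVisual"]:
            problems.append("aiVisual.model is required")

    return problems


valid = {"type": "image", "aiVisual": {"model": "titan"}}
invalid = {"type": "video", "color": "#000000", "aiVisual": {"mediaStyle": "vintage"}}

print(validate_background(valid))
print(validate_background(invalid))
```

Returning a list rather than raising on the first problem lets a caller report every issue in one pass.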