This guide shows you how to create videos with AI-generated images as backgrounds. Instead of using stock visuals, generate unique, custom images based on your text prompts to make your content truly stand out.

What You’ll Learn

  • AI Visual Generation: Generate unique images with AI instead of stock visuals
  • Custom Backgrounds: Create visuals from text prompts automatically
  • Multiple AI Models: Choose from different AI models and media styles
  • Unique Content: Stand out with one-of-a-kind visuals

Before You Begin

Make sure you have:
  • A Pictory API key (get one here)
  • Node.js or Python installed on your machine
  • An idea of the visual style you want to create
  • Basic understanding of AI image generation concepts
For the Node.js example below, install axios, the HTTP client it uses:
npm install axios

How AI Visual Generation Works

When you use AI-generated visuals in Pictory:
  1. Prompt Processing - Your text prompt is analyzed to understand the desired image
  2. AI Generation - The selected AI model creates a unique image based on your prompt
  3. Style Application - Media style preferences are applied to match your brand
  4. Video Integration - The generated image is seamlessly integrated as your scene background
  5. Automatic or Manual - Use custom prompts or let AI generate prompts from your story
AI visual generation takes longer than using stock visuals because each image is created specifically for your video. Plan for additional processing time when using this feature.

Complete Example

import axios from "axios";

const API_BASE_URL = "https://api.pictory.ai/pictoryapis";
const API_KEY = "YOUR_API_KEY";

const STORY_TEXT = "AI is poised to significantly impact educators and course creators on social media.";

async function createTextToVideoWithAIVisual() {
  try {
    console.log("Creating video with AI-generated visuals...");

    const response = await axios.post(
      `${API_BASE_URL}/v2/video/storyboard/render`,
      {
        videoName: "text_to_video_with_ai_visual",
        scenes: [
          {
            story: STORY_TEXT,
            createSceneOnNewLine: false,
            createSceneOnEndOfSentence: false,

            // AI visual configuration
            background: {
              type: "image",                  // Must be "image" for AI visuals
              aiVisual: {
                // Custom prompt describing the desired visual
                prompt: "Futuristic classroom with holographic displays and AI assistants helping students learn",
                model: "flux-schnell",        // AI model for image generation
                mediaStyle: "futuristic",     // Visual style to apply
              },
            },
          },
        ],
      },
      {
        headers: {
          "Content-Type": "application/json",
          Authorization: API_KEY,
        },
      }
    );

    const jobId = response.data.data.jobId;
    console.log("✓ Video creation started!");
    console.log("Job ID:", jobId);

    // Monitor progress
    console.log("\nMonitoring video creation...");
    let jobCompleted = false;
    let jobResult = null;

    while (!jobCompleted) {
      const statusResponse = await axios.get(
        `${API_BASE_URL}/v1/jobs/${jobId}`,
        {
          headers: { Authorization: API_KEY },
        }
      );

      const status = statusResponse.data.data.status;
      console.log("Status:", status);

      if (status === "completed") {
        jobCompleted = true;
        jobResult = statusResponse.data;
        console.log("\n✓ Video with AI-generated visuals is ready!");
        console.log("Video URL:", jobResult.data.videoURL);
      } else if (status === "failed") {
        throw new Error("Video creation failed: " + JSON.stringify(statusResponse.data));
      } else {
        // Still processing: wait 5 seconds before polling again
        await new Promise(resolve => setTimeout(resolve, 5000));
      }
    }

    return jobResult;
  } catch (error) {
    console.error("Error:", error.response?.data || error.message);
    throw error;
  }
}

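// The function logs errors itself; set a non-zero exit code instead of
// leaving the rethrown error as an unhandled promise rejection
createTextToVideoWithAIVisual().catch(() => {
  process.exitCode = 1;
});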
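The polling loop in the example above runs until the job completes or fails, with no upper time limit. The following is a minimal sketch of a bounded variant, assuming the same "completed" and "failed" status values shown above. The names pollUntilDone, getStatus, intervalMs, and timeoutMs are hypothetical and not part of the Pictory API.

```javascript
// Hypothetical helper (not part of the Pictory API): calls getStatus()
// repeatedly until the job completes, fails, or the time budget runs out.
async function pollUntilDone(getStatus, { intervalMs = 5000, timeoutMs = 30 * 60 * 1000 } = {}) {
  const deadline = Date.now() + timeoutMs;

  while (true) {
    // getStatus is expected to resolve to an object with a `status` field
    const job = await getStatus();

    if (job.status === "completed") return job;
    if (job.status === "failed") {
      throw new Error("Video creation failed: " + JSON.stringify(job));
    }
    if (Date.now() + intervalMs > deadline) {
      throw new Error(`Timed out after ${timeoutMs} ms waiting for the job`);
    }

    // Wait only when another poll is actually going to happen
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}

// Possible wiring with the example above (assumes axios, API_BASE_URL,
// API_KEY, and jobId are in scope):
//
//   const job = await pollUntilDone(async () => {
//     const res = await axios.get(`${API_BASE_URL}/v1/jobs/${jobId}`, {
//       headers: { Authorization: API_KEY },
//     });
//     return res.data.data; // the object that carries `status`
//   });
```

Passing the status lookup in as a callback keeps the timeout logic separate from the HTTP call, which also makes it possible to exercise the helper without contacting the API.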

Understanding the Configuration

Background Object

| Parameter | Type | Required | Description |
|---|---|---|---|
| type | string | Yes | Must be set to "image" for AI-generated visuals |
| aiVisual | object | Yes | Configuration for AI visual generation |
Mutually Exclusive Options: Each scene accepts only one background source. Choose one of visualUrl (stock/custom video), color (solid color), or aiVisual (AI-generated image).
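A client-side check can catch a mixed background before the request is sent. The sketch below encodes only the two rules stated in this section; checkBackground is a hypothetical helper, and the API's own validation may differ or be stricter.

```javascript
// Hypothetical pre-flight check based on the rules described above.
const BACKGROUND_SOURCES = ["visualUrl", "color", "aiVisual"];

function checkBackground(background) {
  const errors = [];
  const present = BACKGROUND_SOURCES.filter((key) => background[key] !== undefined);

  // Rule 1: only one background source per scene
  if (present.length > 1) {
    errors.push(`Use only one of ${BACKGROUND_SOURCES.join(", ")}; found ${present.join(", ")}`);
  }
  // Rule 2: aiVisual requires type "image"
  if (background.aiVisual !== undefined && background.type !== "image") {
    errors.push('type must be "image" when aiVisual is used');
  }
  return errors; // an empty array means no problem was detected
}

// A valid AI visual background produces no errors
console.log(checkBackground({ type: "image", aiVisual: { model: "flux-schnell" } }));

// Mixing visualUrl with aiVisual is reported
console.log(
  checkBackground({
    type: "image",
    visualUrl: "https://example.com/clip.mp4", // placeholder URL
    aiVisual: { model: "flux-schnell" },
  })
);
```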

AI Visual Parameters

| Parameter | Type | Required | Description |
|---|---|---|---|
| prompt | string | No | Text description of the visual to generate (max 250 characters). If omitted, AI generates prompts from your story |
| model | string | Yes | The AI model to use for image generation |
| mediaStyle | string | No | The visual style to apply to the generated image |
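The required and optional fields above can be checked locally before a request is sent. This is a minimal sketch: checkAiVisual is a hypothetical helper, and it assumes the 250-character limit corresponds to JavaScript's string length, which may not match how the API counts characters.

```javascript
// Hypothetical validation of the aiVisual object described in the table above.
const MAX_PROMPT_LENGTH = 250; // limit stated in the parameter table

function checkAiVisual(aiVisual) {
  const errors = [];

  // model is the only required field
  if (typeof aiVisual.model !== "string" || aiVisual.model === "") {
    errors.push("model is required");
  }
  // prompt is optional, but limited in length when present
  if (aiVisual.prompt !== undefined && aiVisual.prompt.length > MAX_PROMPT_LENGTH) {
    errors.push(`prompt is ${aiVisual.prompt.length} characters; the maximum is ${MAX_PROMPT_LENGTH}`);
  }
  return errors;
}

console.log(checkAiVisual({ model: "flux-schnell", mediaStyle: "futuristic" })); // []
console.log(checkAiVisual({ prompt: "x".repeat(300) }));
```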

Available AI Models

| Model | Speed | Quality | Best Used For |
|---|---|---|---|
| flux-schnell | Fast | Good | Quick iterations, testing, drafts, time-sensitive content |
| seedream3.0 | Medium | Balanced | General-purpose content, versatile applications |
| nanobanana | Medium | Specialized | Specific artistic styles, branded content |
| titan | Slower | Excellent | Final production, high-quality marketing, premium content |
Model Selection Strategy: Start with flux-schnell for testing and iteration. Once you’re happy with the prompt and style, switch to titan for your final production video.
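One way to apply this strategy is to keep the prompt and style fixed and switch only the model per stage. In the sketch below, the stage names (draft, general, production) and the helpers modelForStage and withModel are hypothetical; the model identifiers come from the table above.

```javascript
// Hypothetical mapping from a workflow stage to a model identifier.
const MODEL_BY_STAGE = new Map([
  ["draft", "flux-schnell"],   // quick iterations and testing
  ["general", "seedream3.0"],  // balanced general-purpose output
  ["production", "titan"],     // final, high-quality output
]);

function modelForStage(stage) {
  const model = MODEL_BY_STAGE.get(stage);
  if (!model) throw new Error(`Unknown stage: ${stage}`);
  return model;
}

// Returns a copy of the aiVisual config with the model swapped for the stage
function withModel(aiVisual, stage) {
  return { ...aiVisual, model: modelForStage(stage) };
}

const draft = {
  prompt: "Modern data center with glowing servers and holographic network visualization",
  model: "flux-schnell",
  mediaStyle: "futuristic",
};
console.log(withModel(draft, "production").model); // titan
```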

Available Media Styles

| Style | Visual Characteristics | Best Used For |
|---|---|---|
| photorealistic | Realistic photographs, natural lighting | Corporate videos, professional presentations, realistic scenarios |
| artistic | Artistic renderings, painterly effects | Creative content, brand storytelling, abstract concepts |
| cartoon | Cartoon-style images, bold colors | Children’s content, educational videos, fun marketing |
| minimalist | Simple, clean designs, reduced details | Modern branding, tech content, professional minimalism |
| vintage | Retro aesthetic, aged appearance | Nostalgia marketing, historical content, unique branding |
| futuristic | Modern, sci-fi look, high-tech feel | Technology content, innovation topics, forward-thinking brands |

Auto-Generated Prompts

You can omit the prompt field to let the AI automatically generate prompts based on your scene’s story text:
background: {
  type: "image",
  aiVisual: {
    model: "flux-schnell",
    mediaStyle: "photorealistic"
    // No prompt specified - AI generates prompts from story text
  }
}
When to Use Auto-Generated Prompts:
  • Creating videos with multiple scenes (createSceneOnEndOfSentence: true)
  • You want the AI to interpret your story text visually
  • Quick content creation without manual prompt writing
  • Testing different visual interpretations
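For the multi-scene case, the request body might look like the following sketch. The field names are taken from the complete example earlier in this guide; buildAutoPromptRequest is a hypothetical helper, and the way the story is split into scenes is determined by the API, not by this code.

```javascript
// Hypothetical helper that builds a request body with no prompt, so the
// visuals are generated from the story text.
function buildAutoPromptRequest(videoName, story, model, mediaStyle) {
  return {
    videoName,
    scenes: [
      {
        story,
        createSceneOnNewLine: false,
        createSceneOnEndOfSentence: true, // split the story into multiple scenes
        background: {
          type: "image",
          // No prompt field: the AI generates prompts from the story text
          aiVisual: { model, mediaStyle },
        },
      },
    ],
  };
}

const requestBody = buildAutoPromptRequest(
  "auto_prompt_demo",
  "AI is changing how teachers plan lessons. Students get feedback faster.",
  "flux-schnell",
  "photorealistic"
);
console.log(JSON.stringify(requestBody, null, 2));
```

The resulting object can be sent as the body of the same POST request used in the complete example.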

Common Use Cases

Technology and Innovation Content

background: {
  type: "image",
  aiVisual: {
    prompt: "Modern data center with glowing servers and holographic network visualization",
    model: "titan",
    mediaStyle: "futuristic"
  }
}
Result: High-tech, professional visual perfect for technology presentations.

Educational and Learning Content

background: {
  type: "image",
  aiVisual: {
    prompt: "Colorful science laboratory with diverse students conducting experiments",
    model: "seedream3.0",
    mediaStyle: "cartoon"
  }
}
Result: Engaging, friendly visual ideal for educational content.

Marketing and Brand Content

background: {
  type: "image",
  aiVisual: {
    prompt: "Elegant minimalist workspace with natural light and modern laptop",
    model: "titan",
    mediaStyle: "minimalist"
  }
}
Result: Clean, professional visual for brand storytelling.

Historical or Vintage Content

background: {
  type: "image",
  aiVisual: {
    prompt: "1950s style diner with chrome details and neon signs",
    model: "titan",
    mediaStyle: "vintage"
  }
}
Result: Nostalgic, retro visual for period-specific content.

Best Practices

Craft prompts that generate the visuals you need:
  • Be specific: Include details about composition, lighting, and mood
  • Use descriptive language: “bright morning sunlight” vs “sunny”
  • Mention key elements: “modern office with glass walls and city view”
  • Keep under 250 characters: Concise prompts work better
  • Avoid negatives: Say what you want, not what you don’t want
Good Example: “Professional business meeting in modern conference room with natural light”
Poor Example: “Not a dark room, people talking, no clutter”
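These guidelines can be applied mechanically by assembling a prompt from named parts and rejecting it when it exceeds the limit. The sketch below is one possible approach; buildPrompt and its field names (subject, composition, lighting, mood) are hypothetical, and the length check assumes the limit is counted as JavaScript string length.

```javascript
// Hypothetical prompt builder: joins the parts that are present and
// enforces the 250-character limit mentioned above.
const PROMPT_LIMIT = 250;

function buildPrompt({ subject, composition, lighting, mood }) {
  if (!subject) throw new Error("subject is required");

  // Keep only the parts that were provided, in a fixed order
  const prompt = [subject, composition, lighting, mood].filter(Boolean).join(", ");

  if (prompt.length > PROMPT_LIMIT) {
    throw new Error(`Prompt is ${prompt.length} characters; shorten it to ${PROMPT_LIMIT} or fewer`);
  }
  return prompt;
}

console.log(
  buildPrompt({
    subject: "modern office with glass walls and city view",
    composition: "wide angle",
    lighting: "bright morning sunlight",
  })
);
```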
Match your AI model to your use case:
  • Testing phase: Use flux-schnell for quick iterations
  • Production: Switch to titan for final, high-quality output
  • General content: seedream3.0 offers good balance
  • Budget conscious: flux-schnell provides faster, more economical results
Remember: Higher quality models take longer to generate but produce better results.
Match style to your brand and content:
  • Corporate/Professional: Use photorealistic or minimalist
  • Creative/Artistic: Try artistic or vintage
  • Tech/Innovation: Go with futuristic
  • Educational/Fun: Consider cartoon
  • Consistent branding: Stick to one style across your videos
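To keep one style across a video, the mediaStyle can be set in a single place rather than per scene. A minimal sketch, assuming scenes shaped like those in the complete example; applyBrandStyle is a hypothetical helper, not an API feature.

```javascript
// Hypothetical helper: returns new scene objects in which every AI visual
// uses the same mediaStyle. Scenes without aiVisual are left unchanged.
function applyBrandStyle(scenes, mediaStyle) {
  return scenes.map((scene) => {
    if (!scene.background || !scene.background.aiVisual) return scene;
    return {
      ...scene,
      background: {
        ...scene.background,
        aiVisual: { ...scene.background.aiVisual, mediaStyle },
      },
    };
  });
}

const styledScenes = applyBrandStyle(
  [
    { story: "Intro.", background: { type: "image", aiVisual: { model: "titan", mediaStyle: "vintage" } } },
    { story: "Details.", background: { type: "image", aiVisual: { model: "titan" } } },
  ],
  "minimalist"
);
console.log(styledScenes.map((s) => s.background.aiVisual.mediaStyle)); // [ 'minimalist', 'minimalist' ]
```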
AI visual generation takes longer than stock visuals:
  • Expect roughly 2-15 minutes per image, depending on the model
  • flux-schnell is fastest (2-5 minutes per image)
  • titan takes longer but produces premium quality (10-15 minutes)
  • Test with faster models before final production
  • Don’t use AI generation for urgent, time-critical videos
Perfect your visuals through iteration:
  • Start with auto-generated prompts to see AI interpretation
  • Refine prompts based on initial results
  • Test different models and styles
  • Save successful prompts for reuse
  • Document what works for your brand

Troubleshooting

Problem: The AI created an image that doesn’t reflect your prompt description.
Solution:
  • Make your prompt more specific and detailed
  • Add descriptive adjectives (e.g., “bright”, “modern”, “spacious”)
  • Specify composition: “aerial view”, “close-up”, “wide angle”
  • Include lighting details: “sunset lighting”, “studio lighting”
  • Try a different AI model - some interpret prompts differently
  • Example revision:
    • Before: “office”
    • After: “modern open-plan office with glass walls, natural daylight, and minimalist furniture”
Problem: Generated images look blurry, pixelated, or low quality.
Solution:
  • Switch to a higher quality model:
    • Replace flux-schnell with titan
    • Use seedream3.0 for balanced quality
  • Avoid very complex prompts (keep under 250 characters)
  • Try a different media style that suits the content better
  • Ensure your prompt describes a clear, achievable visual
Problem: Job status shows “in-progress” for extended periods.
Solution:
  • AI visual generation is slower than stock visuals - this is normal
  • Expected times:
    • flux-schnell: 5-8 minutes
    • seedream3.0: 8-12 minutes
    • titan: 12-20 minutes
  • Multiple scenes with AI visuals will multiply processing time
  • Consider using AI visuals only for key scenes
  • Use stock visuals for time-sensitive projects
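The expected times above can be turned into a rough estimate for a whole job. The sketch below assumes processing time scales linearly with the number of AI-visual scenes, which is only one reading of "multiply"; estimateMinutes is a hypothetical helper, and the figures are the approximate ones from the list above.

```javascript
// Approximate minutes per model, copied from the expected times listed above.
const EXPECTED_MINUTES = new Map([
  ["flux-schnell", { min: 5, max: 8 }],
  ["seedream3.0", { min: 8, max: 12 }],
  ["titan", { min: 12, max: 20 }],
]);

// Hypothetical estimate: assumes time grows linearly with the scene count.
function estimateMinutes(model, aiSceneCount) {
  if (!Number.isInteger(aiSceneCount) || aiSceneCount < 1) {
    throw new RangeError("aiSceneCount must be a positive integer");
  }
  const range = EXPECTED_MINUTES.get(model);
  if (!range) return null; // no figure is listed for this model

  return { min: range.min * aiSceneCount, max: range.max * aiSceneCount };
}

console.log(estimateMinutes("titan", 2)); // { min: 24, max: 40 }
```

The upper bound can serve as a starting point for a polling timeout, with some margin added.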
Problem: API returns an error about background settings.
Solution: Check your background configuration:
// ✓ Correct
background: {
  type: "image",
  aiVisual: {
    model: "flux-schnell",
    mediaStyle: "photorealistic"
  }
}

// ✗ Wrong - missing type
background: {
  aiVisual: { model: "flux-schnell" }
}

// ✗ Wrong - mixing background types
background: {
  type: "image",
  visualUrl: "https://...",  // Don't mix with aiVisual
  aiVisual: { model: "flux-schnell" }
}
Problem: The image contains elements you didn’t want.
Solution:
  • Be more specific about what you want to see
  • Add constraints to your prompt: “simple”, “minimal”, “clean”
  • Avoid ambiguous language that AI might misinterpret
  • Test prompts iteratively to refine results
  • Focus prompts on primary subject matter only
