This guide shows you how to use AI-generated images as scene backgrounds in your videos. Instead of stock visuals, generate custom images from text prompts using a variety of AI image models — each with different quality levels and AI credit costs.

What You’ll Learn

  • AI Image Generation: Generate unique images with AI instead of stock visuals
  • Image Models: Choose from Flux, Seedream, Nano Banana, and Nano Banana Pro
  • Media Styles: Apply styles like photorealistic, artistic, cartoon, and more
  • AI Credit Costs: Understand per-image credit costs for each model

Before You Begin

Make sure you have:
  • A Pictory API key (get one here)
  • Node.js or Python installed on your machine
  • Sufficient AI credits in your account
  • Basic understanding of AI image generation concepts
Install axios, the HTTP client used in the Node.js examples:
npm install axios

How It Works

When you set background.type to "image" and provide an aiVisual configuration, Pictory generates a unique AI image for the scene background:
  1. Prompt Processing — Your text prompt (or auto-generated prompt from story text) is analyzed
  2. Style Application — The selected mediaStyle is applied to shape the visual output
  3. AI Generation — The chosen image model creates a unique image
  4. Scene Integration — The generated image is used as the scene background
AI image generation takes additional processing time compared to stock visuals. The time varies by model — faster models like Flux generate in seconds, while higher-quality models take longer.

Configuration Reference

Background Object

When using AI-generated images, set the background object on a scene as follows:
| Parameter | Type | Required | Description |
|---|---|---|---|
| type | string | Yes | Must be "image" for AI-generated images |
| aiVisual | object | Yes | AI image generation configuration (see below) |
Mutually Exclusive: The background object can only have one of visualUrl, color, or aiVisual. You cannot combine them in the same scene.

aiVisual Parameters

| Parameter | Type | Required | Description |
|---|---|---|---|
| prompt | string | No | Text description of the image to generate (max 250 characters). If omitted, a prompt is auto-generated from the scene’s story text. |
| model | string | Yes | The AI image model to use. See Available Image Models. |
| mediaStyle | string | No | Visual style to apply. See Available Media Styles. |
The videoDuration parameter is not allowed when type is "image". It is only used for AI-generated video clips.
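The constraints described so far can be checked on the client before a request is sent. The following is a minimal sketch, not part of the Pictory API: validateAiImageBackground is a hypothetical helper name, and it mirrors only the rules stated in this guide (the model IDs are those listed under Available Image Models). The API performs its own validation regardless.

```javascript
// Hypothetical client-side check that mirrors the rules in this guide.
const MODEL_IDS = ["flux-schnell", "seedream3.0", "nanobanana", "nanobanana-pro"];
const MAX_PROMPT_LENGTH = 250;

// Returns a list of human-readable problems; an empty list means no rule was violated.
function validateAiImageBackground(background) {
  const errors = [];
  if (background.type !== "image") {
    errors.push('type must be "image"');
  }
  // visualUrl, color, and aiVisual are mutually exclusive.
  const sources = ["visualUrl", "color", "aiVisual"].filter((key) => key in background);
  if (sources.length > 1) {
    errors.push("only one of visualUrl, color, or aiVisual is allowed");
  }
  const aiVisual = background.aiVisual;
  if (!aiVisual) {
    errors.push("aiVisual is required for AI-generated images");
    return errors;
  }
  if (!MODEL_IDS.includes(aiVisual.model)) {
    errors.push("model must be one of: " + MODEL_IDS.join(", "));
  }
  if (typeof aiVisual.prompt === "string" && aiVisual.prompt.length > MAX_PROMPT_LENGTH) {
    errors.push("prompt exceeds " + MAX_PROMPT_LENGTH + " characters");
  }
  if ("videoDuration" in aiVisual) {
    errors.push('videoDuration is not allowed when type is "image"');
  }
  return errors;
}
```

Calling the helper on each scene's background before posting the request surfaces configuration mistakes without spending a round trip.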

Available Image Models

Each model has different strengths and AI credit costs. Choose based on your quality requirements and budget.
| Model ID | Display Name | AI Credits (per image) | Best For | Supported Aspect Ratios |
|---|---|---|---|---|
| flux-schnell | Flux | 0.6 | Reliable for basic layouts | 1:1, 16:9, 9:16 |
| seedream3.0 | Seedream | 2 | Reliable for text and numbers | 1:1, 16:9, 9:16 |
| nanobanana | Nano Banana | 4 | Excels at details | 1:1, 16:9, 9:16 |
| nanobanana-pro | Nano Banana Pro | 14 | Superior cinematic quality | 1:1, 16:9, 9:16 |
Model Selection Strategy:
  • Use flux-schnell (0.6 credits) for quick iterations, testing, and drafts
  • Use seedream3.0 (2 credits) when your image includes text or numbers
  • Use nanobanana (4 credits) when you need fine detail and precision
  • Use nanobanana-pro (14 credits) for premium, cinematic-quality final output
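To budget a video before rendering, the per-image costs above can be summed over the scenes in a request. This is a rough sketch under two assumptions: estimateCredits is a hypothetical helper, and each scene entry yields exactly one generated image (a scene that is split into several scenes may cost more).

```javascript
// Credits per image from the table above, stored in tenths of a credit
// so that sums such as 3 x 0.6 stay exact.
const CREDIT_TENTHS = new Map([
  ["flux-schnell", 6],
  ["seedream3.0", 20],
  ["nanobanana", 40],
  ["nanobanana-pro", 140],
]);

// Sums the estimated AI credits for all scenes that use an AI image background.
function estimateCredits(scenes) {
  let tenths = 0;
  for (const scene of scenes) {
    const model = scene.background?.aiVisual?.model;
    if (model === undefined) continue; // scene has no AI-generated image
    if (!CREDIT_TENTHS.has(model)) {
      throw new Error("Unknown model: " + model);
    }
    tenths += CREDIT_TENTHS.get(model);
  }
  return tenths / 10;
}
```

For example, a request with one seedream3.0 scene and one nanobanana scene is estimated at 2 + 4 = 6 credits.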

Available Media Styles

The mediaStyle parameter shapes the look and feel of the generated image. It is optional — if omitted, the model uses its default rendering style.
| Style | Visual Characteristics | Best Used For |
|---|---|---|
| photorealistic | Realistic photographs, natural lighting | Corporate videos, professional presentations, realistic scenarios |
| artistic | Artistic renderings, painterly effects | Creative content, brand storytelling, abstract concepts |
| cartoon | Cartoon-style imagery, bold colors | Children’s content, educational videos, fun marketing |
| minimalist | Simple, clean designs, reduced details | Modern branding, tech content, professional minimalism |
| vintage | Retro aesthetic, aged appearance | Nostalgia marketing, historical content, unique branding |
| futuristic | Modern, sci-fi look, high-tech feel | Technology content, innovation topics, forward-thinking brands |

Complete Example

import axios from "axios";

const API_BASE_URL = "https://api.pictory.ai/pictoryapis";
const API_KEY = "YOUR_API_KEY";

async function createVideoWithAIBackgroundImages() {
  try {
    console.log("Creating video with AI-generated background images...");

    const response = await axios.post(
      `${API_BASE_URL}/v2/video/storyboard/render`,
      {
        videoName: "ai_background_images_demo",
        scenes: [
          {
            story: "AI is transforming how we create visual content for marketing and education.",
            createSceneOnEndOfSentence: false,
            background: {
              type: "image",
              aiVisual: {
                prompt: "Modern creative studio with holographic displays showing marketing dashboards",
                model: "seedream3.0",
                mediaStyle: "futuristic",
              },
            },
          },
          {
            story: "With the right tools, anyone can produce professional-quality videos in minutes.",
            createSceneOnEndOfSentence: false,
            background: {
              type: "image",
              aiVisual: {
                prompt: "Professional video editing workspace with multiple monitors and warm lighting",
                model: "nanobanana",
                mediaStyle: "photorealistic",
              },
            },
          },
        ],
      },
      {
        headers: {
          "Content-Type": "application/json",
          Authorization: API_KEY,
        },
      }
    );

    const jobId = response.data.data.jobId;
    console.log("Video creation started! Job ID:", jobId);

    // Poll for completion
    let jobCompleted = false;
    while (!jobCompleted) {
      const statusResponse = await axios.get(
        `${API_BASE_URL}/v1/jobs/${jobId}`,
        { headers: { Authorization: API_KEY } }
      );

      const status = statusResponse.data.data.status;
      console.log("Status:", status);

      if (status === "completed") {
        jobCompleted = true;
        console.log("Video URL:", statusResponse.data.data.videoURL);
      } else if (status === "failed") {
        throw new Error("Video creation failed: " + JSON.stringify(statusResponse.data));
      } else {
        // Still processing: wait 5 seconds before checking again
        await new Promise((resolve) => setTimeout(resolve, 5000));
      }
    }
  } catch (error) {
    console.error("Error:", error.response?.data || error.message);
    throw error;
  }
}

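// The function logs and rethrows errors, so mark the process as failed
// instead of leaving the rejection unhandled.
createVideoWithAIBackgroundImages().catch(() => {
  process.exitCode = 1;
});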
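The polling loop in the example above keeps running for as long as the job reports neither "completed" nor "failed". A bounded variant might look like the sketch below. The names pollUntilDone, fetchStatus, intervalMs, and maxAttempts are all hypothetical, and the default limits are arbitrary choices rather than recommended values.

```javascript
// Bounded polling sketch. fetchStatus is a caller-supplied async function
// (hypothetical) that resolves to an object with a status field.
async function pollUntilDone(fetchStatus, { intervalMs = 5000, maxAttempts = 120 } = {}) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const job = await fetchStatus();
    if (job.status === "completed") return job;
    if (job.status === "failed") {
      throw new Error("Video creation failed: " + JSON.stringify(job));
    }
    // Sleep between checks, but not after the final attempt.
    if (attempt < maxAttempts) {
      await new Promise((resolve) => setTimeout(resolve, intervalMs));
    }
  }
  throw new Error("Gave up after " + maxAttempts + " status checks");
}
```

In the example above, fetchStatus would wrap the axios.get call to the jobs endpoint and return statusResponse.data.data. Injecting the function keeps the loop independent of the HTTP client and makes it easy to exercise with a stub.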

Auto-Generated Prompts

You can omit the prompt field to let the AI automatically generate a prompt based on your scene’s story text:
background: {
  type: "image",
  aiVisual: {
    model: "flux-schnell",
    mediaStyle: "photorealistic"
    // No prompt — AI generates one from story text
  }
}
This is useful when:
  • You create multi-scene videos with createSceneOnEndOfSentence: true
  • You want the AI to visually interpret your story text
  • You need to create content quickly without writing prompts manually
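As an illustration of the auto-prompt pattern, the sketch below builds a scene object that deliberately carries no prompt field. autoPromptScene and its splitOnSentences option are hypothetical names; the scene fields themselves are the ones used in the complete example.

```javascript
// Builds a scene whose background image prompt is left to auto-generation.
// No prompt key is set, so the prompt is derived from the story text.
function autoPromptScene(story, { model = "flux-schnell", mediaStyle, splitOnSentences = true } = {}) {
  const aiVisual = { model };
  if (mediaStyle !== undefined) {
    aiVisual.mediaStyle = mediaStyle; // optional; omitted when not provided
  }
  return {
    story,
    createSceneOnEndOfSentence: splitOnSentences,
    background: { type: "image", aiVisual },
  };
}
```

The returned object can be placed directly in the scenes array of a render request.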

Common Use Cases

Technology Content

background: {
  type: "image",
  aiVisual: {
    prompt: "Modern data center with glowing servers and network visualization",
    model: "nanobanana",
    mediaStyle: "futuristic"
  }
}
AI Credits: 4 per scene

Educational Content

background: {
  type: "image",
  aiVisual: {
    prompt: "Colorful science laboratory with students conducting experiments",
    model: "seedream3.0",
    mediaStyle: "cartoon"
  }
}
AI Credits: 2 per scene

Marketing Content

background: {
  type: "image",
  aiVisual: {
    prompt: "Elegant minimalist workspace with natural light and modern laptop",
    model: "nanobanana-pro",
    mediaStyle: "minimalist"
  }
}
AI Credits: 14 per scene

Budget-Friendly Content

background: {
  type: "image",
  aiVisual: {
    prompt: "Professional office environment with clean modern design",
    model: "flux-schnell",
    mediaStyle: "photorealistic"
  }
}
AI Credits: 0.6 per scene

Best Practices

When writing prompts:
  • Be specific: Include details about composition, lighting, and mood
  • Use descriptive language: “bright morning sunlight streaming through glass walls” vs “sunny room”
  • Mention key elements: “modern office with glass walls and city skyline view”
  • Keep under 250 characters: Concise prompts produce more focused results
  • Avoid negatives: Describe what you want, not what you don’t want
Good: “Professional business meeting in modern conference room with natural light and city view”
Poor: “Not a dark room, people talking, no clutter”
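Some of the prompt guidelines above can be checked mechanically. The sketch below is a heuristic only: lintPrompt is a hypothetical helper, and both the negative-word list and the four-word minimum are illustrative assumptions rather than rules enforced by the API.

```javascript
// Heuristic prompt check. The word list and the minimum length are
// illustrative assumptions, not official rules.
const NEGATIVE_WORDS = /\b(no|not|without|never|don't|avoid)\b/i;
const MIN_WORDS = 4;

// Returns a list of warnings; an empty list means nothing was flagged.
function lintPrompt(prompt) {
  const warnings = [];
  if (prompt.length > 250) {
    warnings.push("Prompt is " + prompt.length + " characters; the limit is 250.");
  }
  if (NEGATIVE_WORDS.test(prompt)) {
    warnings.push("Prompt contains a negative phrase; describe what you want instead.");
  }
  if (prompt.trim().split(/\s+/).length < MIN_WORDS) {
    warnings.push("Prompt is very short; add composition, lighting, or mood details.");
  }
  return warnings;
}
```

Applied to the two examples above, the "Good" prompt produces no warnings and the "Poor" prompt is flagged for its negative phrasing.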
Match the model to the scenario:
| Scenario | Recommended Model | Cost |
|---|---|---|
| Testing & drafts | flux-schnell | 0.6 credits |
| General content | seedream3.0 | 2 credits |
| Detail-rich scenes | nanobanana | 4 credits |
| Premium final output | nanobanana-pro | 14 credits |
Start with flux-schnell for iteration, then switch to a higher-quality model for production.
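One way to follow the draft-then-production workflow is to keep a single scene list and swap the model just before rendering. The sketch below assumes scenes shaped like those in the complete example; withModel is a hypothetical helper name.

```javascript
// Returns a copy of the scenes with every aiVisual.model replaced,
// leaving the input array and its objects untouched.
function withModel(scenes, model) {
  return scenes.map((scene) => {
    const aiVisual = scene.background?.aiVisual;
    if (!aiVisual) return scene; // not an AI-generated background
    return {
      ...scene,
      background: { ...scene.background, aiVisual: { ...aiVisual, model } },
    };
  });
}
```

A draft render can then use withModel(scenes, "flux-schnell") and the final render withModel(scenes, "nanobanana-pro"), with prompts and media styles unchanged between the two.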
Match the media style to the content type:
  • Corporate/Professional: photorealistic or minimalist
  • Creative/Artistic: artistic or vintage
  • Tech/Innovation: futuristic
  • Educational/Fun: cartoon
  • Consistent branding: Stick to one style across scenes
If your image needs to display readable text or numbers, use seedream3.0 — it is specifically optimized for rendering text and numbers clearly within generated images.

Troubleshooting

If the generated image does not match what you described:
  • Make your prompt more specific and detailed
  • Add descriptive adjectives: “bright”, “modern”, “spacious”
  • Specify composition: “aerial view”, “close-up”, “wide angle”
  • Include lighting details: “sunset lighting”, “studio lighting”
  • Try a different model, since each interprets prompts differently
If the image quality is lower than expected:
  • Switch to a higher-quality model (nanobanana or nanobanana-pro)
  • Avoid overly complex prompts; keep them focused
  • Try a different mediaStyle that suits the content
Ensure your configuration follows these rules:
// Correct
background: {
  type: "image",
  aiVisual: {
    model: "flux-schnell",
    mediaStyle: "photorealistic"
  }
}

// Wrong — missing type
background: {
  aiVisual: { model: "flux-schnell" }
}

// Wrong — mixing background types
background: {
  type: "image",
  visualUrl: "https://...",
  aiVisual: { model: "flux-schnell" }
}

// Wrong — videoDuration not allowed for images
background: {
  type: "image",
  aiVisual: {
    model: "flux-schnell",
    videoDuration: "5s"
  }
}
Each image generation costs AI credits based on the model used. Check your credit balance and consider using a more economical model like flux-schnell (0.6 credits per image).
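When credits are limited, the choice of model can be derived from the remaining balance. The sketch below is illustrative only: pickModelWithinBudget is a hypothetical helper, the balance must be supplied by the caller, and cost is treated as a stand-in for quality, which ignores strengths such as the text rendering of seedream3.0.

```javascript
// Models ordered from most to least expensive, with credits per image
// as listed in this guide.
const MODELS_BY_COST = [
  ["nanobanana-pro", 14],
  ["nanobanana", 4],
  ["seedream3.0", 2],
  ["flux-schnell", 0.6],
];

// Returns the most expensive model whose total cost fits the balance,
// or null when even the cheapest model does not fit.
function pickModelWithinBudget(creditBalance, imageCount) {
  // Compare in tenths of a credit to avoid floating-point drift.
  const balanceTenths = Math.round(creditBalance * 10);
  for (const [model, creditsPerImage] of MODELS_BY_COST) {
    if (Math.round(creditsPerImage * 10) * imageCount <= balanceTenths) {
      return model;
    }
  }
  return null;
}
```

With a balance of 10 credits and two images, this returns "nanobanana" (8 credits in total), because two nanobanana-pro images would cost 28.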

Next Steps

API Reference