This guide shows you how to use the referenceImageUrls field to guide AI video generation with one or two reference images. Unlike firstFrameImageUrl, which sets the starting frame, reference images influence the overall style, composition, and visual tone of the generated video clip.

What You Will Learn

Style-Guided Video

Generate video clips that match the style of your reference images

Multiple References

Provide up to two reference images for richer style blending

Creative Control

Combine reference images with prompts for precise video generation

Validation Rules

Understand constraints and model-specific behavior

Before You Begin

Make sure you have a Pictory API key and the axios package, which the examples in this guide use:
npm install axios

How It Works

When you provide referenceImageUrls in the aiVisual object:
  1. The AI model analyzes the reference images for style, color palette, composition, and visual tone
  2. Your text prompt describes the motion and subject matter for the video
  3. The model generates a video clip that incorporates the visual characteristics from the reference images
  4. With two reference images, the model blends style elements from both for a richer result
referenceImageUrls is only available when background.type is "video". For image generation, use referenceImageUrl instead.
When using referenceImageUrls with the veo3.1 or veo3.1_fast models, the video duration is automatically set to "8s" regardless of the videoDuration value you provide.
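The duration rule above can be mirrored on the client so that scene timing is planned around the clip length that will actually be produced. The following is a minimal sketch, not part of the Pictory API: the helper name effectiveVideoDuration is hypothetical, and it models only the "8s" override described in this guide, not any other server-side defaults.

```javascript
// Hypothetical helper (not part of the Pictory API): predicts the clip
// duration that applies to a given aiVisual object under the rule above.
const FIXED_8S_MODELS = ["veo3.1", "veo3.1_fast"];

function effectiveVideoDuration(aiVisual) {
  const hasReferences =
    Array.isArray(aiVisual.referenceImageUrls) &&
    aiVisual.referenceImageUrls.length > 0;

  // With reference images, veo3.1 and veo3.1_fast always produce "8s".
  if (hasReferences && FIXED_8S_MODELS.includes(aiVisual.model)) {
    return "8s";
  }
  // Assumption: in every other case the requested value is used as-is.
  return aiVisual.videoDuration;
}

console.log(
  effectiveVideoDuration({
    model: "veo3.1_fast",
    videoDuration: "5s",
    referenceImageUrls: ["https://example.com/style.jpg"]
  })
); // prints 8s
```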

Configuration

Add referenceImageUrls to the aiVisual object with an array of 1–2 valid image URLs:
{
  "background": {
    "type": "video",
    "aiVisual": {
      "prompt": "Dynamic montage of team collaboration in a modern office",
      "model": "pixverse5.5",
      "videoDuration": "8s",
      "referenceImageUrls": [
        "https://example.com/office-style-1.jpg",
        "https://example.com/office-style-2.jpg"
      ]
    }
  }
}
referenceImageUrls cannot be used together with firstFrameImageUrl. Use one or the other, not both.
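The constraints described so far (video-only, 1–2 URLs, not combined with firstFrameImageUrl) can be checked before a request is sent. The sketch below is an illustration under those stated rules; validateReferenceImages is a hypothetical helper name, and the API may enforce further rules that are not modeled here.

```javascript
// Hypothetical client-side check (not part of the Pictory API).
// Returns a list of human-readable problems; an empty list means the
// background object satisfies the rules described in this guide.
function validateReferenceImages(background) {
  const problems = [];
  const aiVisual = background.aiVisual || {};
  const urls = aiVisual.referenceImageUrls;

  // Nothing to validate when the field is not used at all.
  if (urls === undefined) return problems;

  if (background.type !== "video") {
    problems.push('referenceImageUrls requires background.type "video"');
  }
  if (!Array.isArray(urls) || urls.length < 1 || urls.length > 2) {
    problems.push("referenceImageUrls must be an array of 1 to 2 URLs");
  }
  if (aiVisual.firstFrameImageUrl !== undefined) {
    problems.push("referenceImageUrls cannot be combined with firstFrameImageUrl");
  }
  return problems;
}

// Example: wrong background type plus a conflicting firstFrameImageUrl.
console.log(
  validateReferenceImages({
    type: "image",
    aiVisual: {
      referenceImageUrls: ["https://example.com/office-style-1.jpg"],
      firstFrameImageUrl: "https://example.com/first-frame.jpg"
    }
  })
);
```

Returning a list rather than throwing on the first problem lets a caller report every issue in a scene at once.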

Examples

Example 1: Single Story Without Prompt

One scene with a story paragraph and reference images to guide style. The system splits the story into multiple scenes and auto-generates prompts. The reference images influence the visual style of all generated video clips.
import axios from "axios";

const API_BASE_URL = "https://api.pictory.ai/pictoryapis";
const API_KEY = "YOUR_API_KEY";

async function createReferenceImagesAutoPrompt() {
  const response = await axios.post(
    `${API_BASE_URL}/v2/video/storyboard/render`,
    {
      videoName: "reference-images-auto-prompt",
      language: "en",
      backgroundMusic: { enabled: true, autoMusic: true, volume: 0.5 },
      voiceOver: {
        enabled: true,
        aiVoices: [{ speaker: "Jackson", speed: 100, amplificationLevel: 0 }]
      },
      scenes: [
        {
          story: "Every day, millions of content creators, educators, and marketers face the same challenge. They need professional-quality videos but do not have a production team. Hours are spent searching stock libraries for footage that almost fits. Days are lost waiting for design assets. AI-powered video creation changes everything. With a simple text prompt, you can generate custom visuals tailored to your exact narrative. No more settling for generic stock footage. Whether you are building an online course, launching a campaign, or growing your social media presence, AI visuals give you the power to create stunning content in minutes.",
          createSceneOnNewLine: true,
          createSceneOnEndOfSentence: true,
          background: {
            type: "video",
            aiVisual: {
              model: "pixverse5.5",
              videoDuration: "8s",
              referenceImageUrls: [
                "https://example.com/style-reference-1.jpg",
                "https://example.com/style-reference-2.jpg"
              ]
            }
          }
        }
      ]
    },
    { headers: { "Content-Type": "application/json", Authorization: API_KEY } }
  );

  const jobId = response.data.data.jobId;
  console.log("Job ID:", jobId);
  let jobCompleted = false;
  while (!jobCompleted) {
    const statusResponse = await axios.get(`${API_BASE_URL}/v1/jobs/${jobId}`, { headers: { Authorization: API_KEY } });
    const status = statusResponse.data.data.status;
    console.log("Status:", status);
    if (status === "completed") { jobCompleted = true; console.log("Video URL:", statusResponse.data.data.videoURL); }
    else if (status === "failed") { throw new Error("Failed: " + JSON.stringify(statusResponse.data)); }
    else { await new Promise((resolve) => setTimeout(resolve, 15000)); } // still processing: wait 15 seconds before polling again
  }
}

createReferenceImagesAutoPrompt();

Example 2: Single Story with Creative Direction

Same as Example 1, but with a prompt that acts as creative direction for the entire video. Since the story is split into multiple scenes, the prompt guides the overall visual tone rather than describing a specific scene. A good creative direction prompt follows this structure: [Action/Movement] + [Scene/Environment] + [Camera Technique] + [Visual Style].
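The four-part prompt structure can be assembled from separate fields, which makes it easier to vary one part at a time. This is only a sketch: buildCreativeDirection and its field names are hypothetical, and the way the sample prompt is split into parts is one possible reading of the structure.

```javascript
// Hypothetical helper: joins the four parts of a creative direction prompt
// in the order [Action/Movement] [Scene/Environment] [Camera Technique]
// [Visual Style]. Empty or missing parts are skipped.
function buildCreativeDirection({ action, scene, camera, style }) {
  return [action, scene, camera, style]
    .map((part) => (part || "").trim())
    .filter((part) => part.length > 0)
    .join(" ");
}

const directionPrompt = buildCreativeDirection({
  action: "Smooth cinematic tracking shots",
  scene: "in modern creative workspaces",
  camera: "with shallow depth of field",
  style: "and warm golden-hour lighting"
});
console.log(directionPrompt);
```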
import axios from "axios";

const API_BASE_URL = "https://api.pictory.ai/pictoryapis";
const API_KEY = "YOUR_API_KEY";

async function createReferenceImagesCreativeDirection() {
  const response = await axios.post(
    `${API_BASE_URL}/v2/video/storyboard/render`,
    {
      videoName: "reference-images-creative-direction",
      language: "en",
      backgroundMusic: { enabled: true, autoMusic: true, volume: 0.5 },
      voiceOver: {
        enabled: true,
        aiVoices: [{ speaker: "Jackson", speed: 100, amplificationLevel: 0 }]
      },
      scenes: [
        {
          story: "Every day, millions of content creators, educators, and marketers face the same challenge. They need professional-quality videos but do not have a production team. Hours are spent searching stock libraries for footage that almost fits. Days are lost waiting for design assets. AI-powered video creation changes everything. With a simple text prompt, you can generate custom visuals tailored to your exact narrative. No more settling for generic stock footage. Whether you are building an online course, launching a campaign, or growing your social media presence, AI visuals give you the power to create stunning content in minutes.",
          createSceneOnNewLine: true,
          createSceneOnEndOfSentence: true,
          background: {
            type: "video",
            aiVisual: {
              model: "pixverse5.5",
              videoDuration: "8s",
              referenceImageUrls: ["https://example.com/style-reference-1.jpg", "https://example.com/style-reference-2.jpg"],
              prompt: "Smooth cinematic tracking shots in modern creative workspaces with warm golden-hour lighting and shallow depth of field"
            }
          }
        }
      ]
    },
    { headers: { "Content-Type": "application/json", Authorization: API_KEY } }
  );

  const jobId = response.data.data.jobId;
  console.log("Job ID:", jobId);
  let jobCompleted = false;
  while (!jobCompleted) {
    const statusResponse = await axios.get(`${API_BASE_URL}/v1/jobs/${jobId}`, { headers: { Authorization: API_KEY } });
    const status = statusResponse.data.data.status;
    console.log("Status:", status);
    if (status === "completed") { jobCompleted = true; console.log("Video URL:", statusResponse.data.data.videoURL); }
    else if (status === "failed") { throw new Error("Failed: " + JSON.stringify(statusResponse.data)); }
    else { await new Promise((resolve) => setTimeout(resolve, 15000)); } // still processing: wait 15 seconds before polling again
  }
}

createReferenceImagesCreativeDirection();

Example 3: Multiple Scenes with Prompts

Three separate scenes, each with a one-sentence story, a scene-specific prompt, and reference images. This example shows a mix of one and two reference images across scenes.
import axios from "axios";

const API_BASE_URL = "https://api.pictory.ai/pictoryapis";
const API_KEY = "YOUR_API_KEY";

async function createReferenceImagesScenePrompts() {
  const response = await axios.post(
    `${API_BASE_URL}/v2/video/storyboard/render`,
    {
      videoName: "reference-images-scene-prompts",
      language: "en",
      backgroundMusic: { enabled: true, autoMusic: true, volume: 0.5 },
      voiceOver: {
        enabled: true,
        aiVoices: [{ speaker: "Jackson", speed: 100, amplificationLevel: 0 }]
      },
      scenes: [
        {
          story: "Content creators struggle to find the perfect visuals for their videos.",
          background: {
            type: "video",
            aiVisual: {
              model: "pixverse5.5",
              videoDuration: "5s",
              referenceImageUrls: ["https://example.com/creator-workspace-style.jpg"],
              prompt: "A frustrated creator scrolling through endless stock footage on a laptop in a dimly lit room"
            }
          }
        },
        {
          story: "AI-powered tools generate custom visuals from simple text prompts in minutes.",
          background: {
            type: "video",
            aiVisual: {
              model: "pixverse5.5",
              videoDuration: "5s",
              referenceImageUrls: ["https://example.com/tech-workspace-style.jpg", "https://example.com/ai-interface-style.jpg"],
              prompt: "A bright modern workspace with AI-generated visuals appearing on a large monitor"
            }
          }
        },
        {
          story: "Professional-quality videos are now accessible to educators and marketers everywhere.",
          background: {
            type: "video",
            aiVisual: {
              model: "pixverse5.5",
              videoDuration: "5s",
              referenceImageUrls: ["https://example.com/professional-presentation-style.jpg"],
              prompt: "Diverse professionals confidently presenting polished video content on various devices"
            }
          }
        }
      ]
    },
    { headers: { "Content-Type": "application/json", Authorization: API_KEY } }
  );

  const jobId = response.data.data.jobId;
  console.log("Job ID:", jobId);
  let jobCompleted = false;
  while (!jobCompleted) {
    const statusResponse = await axios.get(`${API_BASE_URL}/v1/jobs/${jobId}`, { headers: { Authorization: API_KEY } });
    const status = statusResponse.data.data.status;
    console.log("Status:", status);
    if (status === "completed") { jobCompleted = true; console.log("Video URL:", statusResponse.data.data.videoURL); }
    else if (status === "failed") { throw new Error("Failed: " + JSON.stringify(statusResponse.data)); }
    else { await new Promise((resolve) => setTimeout(resolve, 15000)); } // still processing: wait 15 seconds before polling again
  }
}

createReferenceImagesScenePrompts();

Example 4: Multiple Scenes Without Prompts

Three separate scenes, each with a one-sentence story and reference images. No prompts are provided, so the system auto-generates a visual prompt from each scene’s story text.
import axios from "axios";

const API_BASE_URL = "https://api.pictory.ai/pictoryapis";
const API_KEY = "YOUR_API_KEY";

async function createReferenceImagesAutoScenes() {
  const response = await axios.post(
    `${API_BASE_URL}/v2/video/storyboard/render`,
    {
      videoName: "reference-images-auto-scenes",
      language: "en",
      backgroundMusic: { enabled: true, autoMusic: true, volume: 0.5 },
      voiceOver: {
        enabled: true,
        aiVoices: [{ speaker: "Jackson", speed: 100, amplificationLevel: 0 }]
      },
      scenes: [
        {
          story: "Content creators struggle to find the perfect visuals for their videos.",
          background: { type: "video", aiVisual: { model: "pixverse5.5", videoDuration: "5s", referenceImageUrls: ["https://example.com/creator-workspace-style.jpg"] } }
        },
        {
          story: "AI-powered tools generate custom visuals from simple text prompts in minutes.",
          background: { type: "video", aiVisual: { model: "pixverse5.5", videoDuration: "5s", referenceImageUrls: ["https://example.com/tech-workspace-style.jpg", "https://example.com/ai-interface-style.jpg"] } }
        },
        {
          story: "Professional-quality videos are now accessible to educators and marketers everywhere.",
          background: { type: "video", aiVisual: { model: "pixverse5.5", videoDuration: "5s", referenceImageUrls: ["https://example.com/professional-presentation-style.jpg"] } }
        }
      ]
    },
    { headers: { "Content-Type": "application/json", Authorization: API_KEY } }
  );

  const jobId = response.data.data.jobId;
  console.log("Job ID:", jobId);
  let jobCompleted = false;
  while (!jobCompleted) {
    const statusResponse = await axios.get(`${API_BASE_URL}/v1/jobs/${jobId}`, { headers: { Authorization: API_KEY } });
    const status = statusResponse.data.data.status;
    console.log("Status:", status);
    if (status === "completed") { jobCompleted = true; console.log("Video URL:", statusResponse.data.data.videoURL); }
    else if (status === "failed") { throw new Error("Failed: " + JSON.stringify(statusResponse.data)); }
    else { await new Promise((resolve) => setTimeout(resolve, 15000)); } // still processing: wait 15 seconds before polling again
  }
}

createReferenceImagesAutoScenes();
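Examples 3 and 4 repeat the same scene shape, differing only in whether a prompt is present. A small builder can remove that repetition. The sketch below assumes a hypothetical helper named buildScene; its defaults ("pixverse5.5" and "5s") are taken from the examples above and are not API defaults.

```javascript
// Hypothetical helper: builds one scene object in the shape used by the
// examples in this guide. When `prompt` is omitted, the field is left out
// so that a prompt is auto-generated from the story text.
function buildScene({ story, referenceImageUrls, prompt, model = "pixverse5.5", videoDuration = "5s" }) {
  if (!Array.isArray(referenceImageUrls) || referenceImageUrls.length < 1 || referenceImageUrls.length > 2) {
    throw new Error("referenceImageUrls must contain 1 to 2 URLs");
  }
  const aiVisual = { model, videoDuration, referenceImageUrls };
  if (prompt) aiVisual.prompt = prompt;
  return { story, background: { type: "video", aiVisual } };
}

// One scene with a prompt and one without, mixing one and two references.
const builtScenes = [
  {
    story: "Content creators struggle to find the perfect visuals for their videos.",
    referenceImageUrls: ["https://example.com/creator-workspace-style.jpg"],
    prompt: "A frustrated creator scrolling through endless stock footage on a laptop in a dimly lit room"
  },
  {
    story: "AI-powered tools generate custom visuals from simple text prompts in minutes.",
    referenceImageUrls: ["https://example.com/tech-workspace-style.jpg", "https://example.com/ai-interface-style.jpg"]
  }
].map((sceneInput) => buildScene(sceneInput));

console.log(JSON.stringify(builtScenes, null, 2));
```

The resulting array can be passed as the scenes value in the request body shown in the examples.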

Tracking AI Credits Used

When your video includes AI-generated visuals, the job response includes an aiCreditsUsed field that reports the total AI credits consumed across all scenes. This field is present only when at least one scene used an aiVisual configuration.
{
    "job_id": "a1d36612-326d-4b81-aece-411f8aed4c70",
    "success": true,
    "data": {
        "status": "completed",
        "aiCreditsUsed": 48
    }
}
Use this value to track credit consumption and manage your AI credit budget. For detailed job response documentation, refer to the Get Storyboard Preview Job or Get Video Render Job API reference.
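Because aiCreditsUsed is absent when no scene used aiVisual, any code that totals credits across jobs should treat a missing value as zero. A minimal sketch follows, assuming job responses shaped like the sample above; totalAiCredits is a hypothetical helper name and the job IDs are placeholders.

```javascript
// Hypothetical helper: sums aiCreditsUsed across job responses.
// Responses without the field (no aiVisual scenes) contribute 0.
function totalAiCredits(jobResponses) {
  return jobResponses.reduce((total, response) => {
    const used = response && response.data ? response.data.aiCreditsUsed : undefined;
    return total + (typeof used === "number" ? used : 0);
  }, 0);
}

const sampleResponses = [
  { job_id: "placeholder-job-1", success: true, data: { status: "completed", aiCreditsUsed: 48 } },
  { job_id: "placeholder-job-2", success: true, data: { status: "completed" } }
];
console.log(totalAiCredits(sampleResponses)); // prints 48
```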

Best Practices

When using two reference images:
  • Choose images that complement each other rather than conflict
  • One image can provide color/mood while the other provides composition/structure
  • Avoid two images with vastly different styles, as the AI may produce inconsistent results
Since you are generating video, your prompt should describe motion and action:
  • Good: “Camera slowly panning across a landscape with clouds drifting overhead”
  • Poor: “A beautiful landscape” (describes a static scene)
The reference images handle the visual style. Let the prompt focus on what should move and how.
When using veo3.1 or veo3.1_fast with reference images, the duration is automatically set to "8s". Plan your scene duration accordingly:
  • If you need shorter clips, consider using pixverse5.5 instead
  • If "8s" works for your scene, veo3.1_fast provides higher quality with reference images
Reference image URLs must meet these requirements:
  • All image URLs must be publicly accessible (no authentication required)
  • Use direct image URLs (not page URLs that contain images)
  • Supported formats include JPEG, PNG, and WebP
  • The array must contain 1–2 URLs (minimum 1, maximum 2)
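Some of the URL requirements above can be screened before a request is sent. The sketch below is a heuristic only: looksLikeDirectImageUrl is a hypothetical helper, it inspects the URL string without fetching it, and it assumes that a direct image URL ends in a recognizable file extension. A valid image served from an extensionless URL would be rejected by this check, and public accessibility cannot be confirmed this way.

```javascript
// Hypothetical pre-flight check. It only inspects the URL string; it cannot
// confirm that the image is reachable without authentication.
const IMAGE_EXTENSIONS = [".jpg", ".jpeg", ".png", ".webp"];

function looksLikeDirectImageUrl(value) {
  let url;
  try {
    url = new URL(value);
  } catch {
    return false; // not a parseable absolute URL
  }
  if (url.protocol !== "http:" && url.protocol !== "https:") return false;

  // Compare against the path only, so query strings do not interfere.
  const path = url.pathname.toLowerCase();
  return IMAGE_EXTENSIONS.some((ext) => path.endsWith(ext));
}

console.log(looksLikeDirectImageUrl("https://example.com/office-style-1.jpg")); // prints true
console.log(looksLikeDirectImageUrl("https://example.com/gallery/office"));     // prints false
```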

Troubleshooting

Cause: referenceImageUrls is used with type: "image".
Resolution:
  1. Change type to "video" if you want to generate video clips
  2. For image generation, use referenceImageUrl (singular) instead

Cause: Both firstFrameImageUrl and referenceImageUrls are provided in the same scene.
Resolution:
  1. firstFrameImageUrl controls the starting frame; referenceImageUrls guides overall style
  2. Choose the approach that fits your use case and remove the other field

Cause: When using referenceImageUrls with veo3.1 or veo3.1_fast, the duration is automatically set to "8s".
Resolution:
  1. This is expected behavior and cannot be overridden for these models
  2. If you need a different duration, use pixverse5.5 instead, which supports "5s", "8s", and "10s" with reference images

Cause: The AI model may weigh the text prompt more heavily than the reference images.
Resolution:
  1. Use reference images with strong, distinctive style characteristics
  2. Simplify your text prompt to give the reference images more influence
  3. Try a different model for better style adherence

Next Steps

Visual Continuity

Create seamless transitions between consecutive scenes

First Frame Image

Control the starting frame of AI-generated video clips

Reference Image for Images

Guide AI image generation with a reference image

AI-Generated Video Clips

Learn the basics of AI video clip generation

API Reference

Render Storyboard Video

Direct video rendering with AI visuals

Create Storyboard Preview

Create preview before rendering