The Storyboard Preview API generates a preview of your video project, allowing you to review the storyboard structure, scene breakdown, visual selections, and configuration before committing to the final render. This is an essential step in the video creation workflow that helps you validate your content and make adjustments without incurring full rendering costs.
The storyboard preview creates a project with scene thumbnails and metadata. To generate the final rendered video, use the Render from Preview API after reviewing and approving the preview.
Recommended Workflow: Create Preview → Review Scenes → Make Adjustments → Render Final Video
Language of the text content. Allowed values: zh Chinese, nl Dutch, en English, fr French, de German, hi Hindi, it Italian, ja Japanese, ko Korean, mr Marathi, pt Portuguese, ru Russian, es Spanish, ta Tamil
Custom data object that will be included in the webhook POST payload when the job completes. Use this to pass through any metadata (e.g., internal IDs, tracking info) that you need in your webhook handler.
Experimental: This field is currently experimental. In the future, all storyboards will automatically use the latest version, and this field will be ignored. There is no need to remove it from your requests when that happens.
Specifies which storyboard version to use. Set to v3 to create a storyboard using the latest Pictory storyboard engine. Omit this field to use the classic storyboard. Options:
v3 — Uses the latest Pictory storyboard
Omit or any other value — Uses the classic/legacy storyboard
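As a minimal sketch, a request opting into the latest storyboard might look like the fragment below. Note that the field name storyboardVersion is an assumption for illustration — the description above does not name the key, so check the request schema for the exact field name.

```json
{
  "storyboardVersion": "v3"
}
```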
The following fields are nested inside the top-level voiceOver object. For example, voiceOver.enabled means enabled is a property of the voiceOver object.
The following fields are nested inside each element of the voiceOver.aiVoices array. For example, voiceOver.aiVoices[].speaker means speaker is a property of each item in the aiVoices array, which itself is nested inside the voiceOver object. The [] notation indicates an array element.
The following fields are nested inside premiumVoiceSettings, which is a property of each voiceOver.aiVoices[] element. For example, voiceOver.aiVoices[].premiumVoiceSettings.modelId means modelId is inside premiumVoiceSettings, inside each voice in the aiVoices array, inside the voiceOver object.
The following fields are nested inside the voiceOver.externalVoice object. For example, voiceOver.externalVoice.voiceUrl means voiceUrl is a property of externalVoice, which itself is nested inside the voiceOver object.
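Putting the nesting notes above together, a voiceOver object with one AI voice might look like the sketch below. The speaker and modelId values are placeholders, not recommendations:

```json
{
  "voiceOver": {
    "enabled": true,
    "aiVoices": [
      {
        "speaker": "example-speaker-name",
        "premiumVoiceSettings": {
          "modelId": "example-model-id"
        }
      }
    ]
  }
}
```

To use your own recorded narration instead of an AI voice, supply voiceOver.externalVoice.voiceUrl in place of the aiVoices array.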
The following fields are nested inside the top-level backgroundMusic object. For example, backgroundMusic.enabled means enabled is a property of the backgroundMusic object.
The following fields are nested inside the subtitleStyle object. For example, subtitleStyle.fontFamily means fontFamily is a property of the subtitleStyle object. This object can be used at the top level or inside scenes[] as scenes[].subtitleStyle.
The following fields are nested inside each element of the subtitleStyle.animations array. For example, subtitleStyle.animations[].name means name is a property of each item in the animations array, which itself is nested inside the subtitleStyle object. The [] notation indicates an array element.
Controls how upcoming words appear on screen. Only available with fade and blur entry animations. When AI voiceover is enabled, words sync with the spoken audio. Options:
"hidden" — Upcoming words are invisible, words appear only when spoken
"subtle" — Upcoming words are faintly visible, current word is highlighted
"prominent" — All text is clearly visible, spoken word is highlighted brighter
The following fields are nested inside each element of the top-level destinations array. For example, destinations[].type means type is a property of each item in the destinations array. The [] notation indicates an array element.
The following fields are nested inside the privacy object of each destinations[] element. For example, destinations[].privacy.view means view is a property of privacy, which itself is nested inside each item of the destinations array.
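As an illustration of the destinations nesting described above, a single destination with a privacy setting might look like the sketch below. The type and view values are placeholders — consult the allowed-values lists for each field:

```json
{
  "destinations": [
    {
      "type": "youtube",
      "privacy": { "view": "private" }
    }
  ]
}
```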
Each scene can only have ONE content source. Choose one of: story, storyCoPilot, blogUrl, pptUrl, audioUrl, or videoUrl.
The following fields are nested inside each element of the top-level scenes array. For example, scenes[].story means story is a property of each item in the scenes array. The [] notation indicates an array element.
Maximum number of caption lines displayed simultaneously. Integer from 1 to 4. When omitted, all caption lines for the scene are displayed at once.
1 — Single line (ideal for TikTok/Reels)
2 — Two lines (standard for most content)
3 — Three lines (educational/detailed content)
4 — Four lines (information-heavy content)
Cannot be used with smart layouts (smartLayoutId or smartLayoutName). When AI voiceover is enabled, caption lines sync with the spoken audio. See Dynamic Captions Guide.
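A scene requesting single-line captions for short-form content might look like the sketch below. The field name maxCaptionLines is hypothetical — the description above does not name the key, so verify it against the request schema:

```json
{
  "scenes": [
    {
      "story": "Short-form captions work best one line at a time.",
      "maxCaptionLines": 1
    }
  ]
}
```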
Language of audio content. Required for audioUrl and videoUrl. Allowed values: en-US English (US), en-AU English (Australia), en-GB English (UK), en-IN English (India), en-IE English (Ireland), en-AB English (Arabic accent), en-WL English (Wales), fr-CA French (Canada), fr-FR French (France), de-CH German (Switzerland), de-DE German (Germany), it-IT Italian (Italy), es-ES Spanish (Spain), es-US Spanish (US), nl-NL Dutch (Netherlands), pt-BR Portuguese (Brazil), ja-JP Japanese (Japan), ko-KR Korean (Korea), ru-RU Russian (Russia), hi-IN Hindi (India), ta-IN Tamil (India), mr-IN Marathi (India)
Language of caption text. Only valid with caption. Allowed values: zh Chinese, nl Dutch, en English, fr French, de German, hi Hindi, it Italian, ja Japanese, ko Korean, mr Marathi, pt Portuguese, ru Russian, es Spanish, ta Tamil
AI story generation configuration for creating video scripts automatically.
{ "storyCoPilot": { "prompt": "How AI is transforming video creation", "videoType": "Explainer", "duration": 60, "platform": "YouTube", "tone": "informative" }}
The following fields are nested inside the storyCoPilot object, which is a property of each scenes[] element. For example, scenes[].storyCoPilot.prompt means prompt is a property of storyCoPilot, which itself is nested inside each item of the scenes array.
The following fields are nested inside the voiceOver object at the scene level. For example, scenes[].voiceOver.enabled means enabled is a property of voiceOver, which itself is nested inside each item of the scenes array. This is separate from the top-level voiceOver object.
Background music configuration specific to a scene.
{ "backgroundMusic": { "enabled": true }}
The following fields are nested inside the backgroundMusic object at the scene level. For example, scenes[].backgroundMusic.enabled means enabled is a property of backgroundMusic, which itself is nested inside each item of the scenes array. This is separate from the top-level backgroundMusic object.
Preset position for the avatar in this scene. Options: top-left, top-center, top-right, center-left, center-center, center-right, bottom-left, bottom-center, bottom-right
Hide avatar for this specific scene (true or false)
Important: You cannot change avatarId at the scene level. The same avatar must be used throughout the video. You can only override styling and positioning properties.
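Combining the scene-level avatar notes above, an override might look like the sketch below. The field names position and hidden are assumptions for illustration (the descriptions above do not name the keys); only the "bottom-right" position value comes from the documented preset list:

```json
{
  "scenes": [
    {
      "avatar": {
        "position": "bottom-right",
        "hidden": false
      }
    }
  ]
}
```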
Background can only have ONE of: visualUrl, color, or aiVisual - not multiple.
The following fields are nested inside the background object, which is a property of each scenes[] element. For example, scenes[].background.visualUrl means visualUrl is a property of background, which itself is nested inside each item of the scenes array.
AI-generated visual configuration. Use with background.type set to "image" for AI-generated images or "video" for AI-generated video clips.
The following fields are nested inside the aiVisual object, which is a property of scenes[].background. For example, scenes[].background.aiVisual.prompt means prompt is a property of aiVisual, which itself is nested inside background, inside each item of the scenes array.
AI Image Example
AI Video Clip Example
Video with First Frame
Image with Reference
Visual Continuity
{ "background": { "type": "image", "aiVisual": { "prompt": "A serene mountain landscape at sunset", "model": "seedream3.0", "mediaStyle": "photorealistic" } }}
{ "background": { "type": "video", "aiVisual": { "prompt": "A car driving through a desert highway at golden hour", "model": "pixverse5.5", "videoDuration": "8s", "firstFrameImageUrl": "https://example.com/car-on-highway.jpg" } }}
{ "background": { "type": "image", "aiVisual": { "prompt": "A modern office workspace with natural lighting", "model": "seedream3.0", "mediaStyle": "photorealistic", "referenceImageUrl": "https://example.com/office-reference.jpg" } }}
{ "scenes": [ { "story": "AI is poised to significantly impact educators and course creators on social media. By automating tasks like content generation, visual design, and video editing, AI will save time and enhance consistency. This allows creators to focus on higher-level strategies and ensures a cohesive brand presence.", "createSceneOnNewLine": true, "createSceneOnEndOfSentence": true "background": { "type": "video", "aiVisual": { "model": "pixverse5.5", "visualContinuity": true } } } ]}
Text prompt describing the visual to generate (max 500 characters). If omitted, the system auto-generates a prompt from the scene’s story text. When a story is split into multiple scenes (using createSceneOnEndOfSentence or createSceneOnNewLine), this prompt acts as a creative direction for the entire video rather than a prompt for a specific scene. The system uses it to guide the auto-generated prompts for each individual scene, ensuring a consistent visual tone. A good creative direction prompt follows this structure: [Action/Movement] + [Scene/Environment] + [Camera Technique] + [Visual Style]. For example: "Smooth cinematic tracking shots in modern creative workspaces with warm golden-hour lighting and shallow depth of field".
AI model to use. The valid options depend on background.type:
Image models (when type is "image"): flux-schnell, seedream3.0, nanobanana, nanobanana-pro
Video models (when type is "video"): pixverse5.5, veo3.1_fast, veo3.1
When set to true, enables visual continuity between consecutive AI-generated scenes. The system automatically uses the output of the previous scene as a reference for the next scene, creating smooth visual transitions. For video scenes, the last frame of the previous video is used. For image scenes, the generated image of the previous scene is used. This field works with both "video" and "image" types. Visual continuity applies in two scenarios:
Within the same story: Consecutive scenes that were split from the same original story use the previous scene’s output as a reference for the next scene.
Across consecutive stories: Continuity is also maintained between the last scene of one story and the first scene of the next story, enabling seamless transitions across story boundaries.
In both cases, visualContinuity must be set to true for the system to use the previous scene’s output as a reference. If visualContinuity is not set or is false, the following behavior applies for consecutive scenes from the same story:
Image scenes: All user-provided reference images (such as referenceImageUrl) are cleared, and visuals are generated independently.
Video scenes: Only firstFrameImageUrl is cleared if it was provided. However, referenceImageUrls are preserved and still used for generation.
This field has no effect on the first scene in a sequence. When continuity is active and the previous scene’s output is unavailable, the system falls back to generating the visual without a reference image.
URL of an image to use as the first frame of the generated video clip. The AI model generates a video that begins from this image and transitions into the content described in the prompt. Only applicable when type is "video". Cannot be used together with referenceImageUrls.
URL of a reference image to guide the style and composition of the generated image. Only applicable when type is "image". For video generation, use firstFrameImageUrl or referenceImageUrls instead.
An array of 1–2 reference image URLs to guide the style and composition of the generated video clip. Only applicable when type is "video". Cannot be used together with firstFrameImageUrl. For image generation, use referenceImageUrl instead.
When using referenceImageUrls (without firstFrameImageUrl) with the veo3.1 or veo3.1_fast models, the video duration is automatically set to "8s".
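For comparison with the first-frame example earlier, a video scene guided by reference images might look like the sketch below. The prompt and URLs are placeholders; per the note above, the duration is automatically set to "8s" with this model and no videoDuration needs to be supplied:

```json
{
  "background": {
    "type": "video",
    "aiVisual": {
      "prompt": "A product rotating on a pedestal in a bright studio",
      "model": "veo3.1",
      "referenceImageUrls": [
        "https://example.com/product-front.jpg",
        "https://example.com/product-side.jpg"
      ]
    }
  }
}
```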
The following fields are nested inside the searchFilter object, which is a property of scenes[].background. For example, scenes[].background.searchFilter.category means category is a property of searchFilter, which itself is nested inside background, inside each item of the scenes array.
Custom search query to find relevant visuals for your scene (max 3000 characters). Use descriptive terms to get more accurate background media results (e.g., “business team collaborating in modern office”, “aerial view of city skyline at sunset”).
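A searchFilter fragment combining a category with a custom query might look like the sketch below. The field name query and the category value are assumptions for illustration — verify both against the request schema:

```json
{
  "background": {
    "searchFilter": {
      "category": "business",
      "query": "business team collaborating in modern office"
    }
  }
}
```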
The following fields are nested inside the settings object, which is a property of scenes[].background. For example, scenes[].background.settings.mute means mute is a property of settings, which itself is nested inside background, inside each item of the scenes array.
The following fields are nested inside each element of the scenes[].backgroundBrolls array. For example, scenes[].backgroundBrolls[].brollClip means brollClip is a property of each item in the backgroundBrolls array, which itself is nested inside each item of the scenes array.
The following fields are nested inside each element of the scenes[].backgroundCorpus array. For example, scenes[].backgroundCorpus[].visualUrl means visualUrl is a property of each item in the backgroundCorpus array, which itself is nested inside each item of the scenes array.
The following fields are nested inside the scenes[].transcript array. For example, scenes[].transcript[].speakerId means speakerId is a property of each item in the transcript array, which itself is nested inside each item of the scenes array. Fields like scenes[].transcript[].words[].word go one level deeper — word is inside each item of the words array, which is inside each transcript element.
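The two-level transcript nesting described above can be sketched as follows. The speakerId and word values are placeholders, and each words[] item likely carries additional fields (such as timing) not shown here:

```json
{
  "scenes": [
    {
      "transcript": [
        {
          "speakerId": "speaker_1",
          "words": [
            { "word": "Welcome" },
            { "word": "back" }
          ]
        }
      ]
    }
  ]
}
```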
The following fields are nested inside the mediaRepurposeSettings object, which is a property of each scenes[] element. For example, scenes[].mediaRepurposeSettings.highlightLength means highlightLength is a property of mediaRepurposeSettings, which itself is nested inside each item of the scenes array.
The following fields are nested inside the templateOverride object, which is a property of each scenes[] element. For example, scenes[].templateOverride.sceneId means sceneId is a property of templateOverride, which itself is nested inside each item of the scenes array.
The following fields are nested inside each element of the scenes[].templateOverride.subtitles array. For example, scenes[].templateOverride.subtitles[].text means text is a property of each item in the subtitles array, which itself is nested inside templateOverride, inside each item of the scenes array.
The following fields are nested inside the scenes[].templateOverride.layers structure, which is an array of arrays. The [][] notation indicates that layers is an array of arrays — the outer [] represents each subtitle group, and the inner [] represents each layer element within that group. For example, scenes[].templateOverride.layers[][].layerId means layerId is a property of each layer element inside the nested array structure.
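The array-of-arrays layers structure described above can be sketched as follows. All values are placeholders; note how the outer array groups layers per subtitle, and the inner array holds the layer elements for that group:

```json
{
  "scenes": [
    {
      "templateOverride": {
        "sceneId": "example-scene-id",
        "subtitles": [
          { "text": "First caption line" }
        ],
        "layers": [
          [
            { "layerId": "example-layer-id" }
          ]
        ]
      }
    }
  ]
}
```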
When the storyboard request is successfully submitted, a job is created and a job ID is returned. Use this job ID to poll the Get Storyboard Preview Job by ID endpoint for the completed storyboard preview.
Unique identifier for the storyboard preview job. Use this to track the job status and retrieve results via the Get Storyboard Preview Job by ID endpoint.
Once you have the jobId, poll the Get Storyboard Preview Job by ID endpoint to check the job status and retrieve the completed storyboard preview. Use a polling interval of 10–30 seconds. When the job completes, you will receive:
renderParams — The rendering data for the video. To customize the video, update this data and use the Render Video API to render with your modifications. To update the preview with changes, save the updated data using the Update Storyboard Elements API.
storyboard — The processed input storyboard. If you want to change anything in the original storyboard request, use this processed response to make a new request with updates — it will process faster since the storyboard is already broken down into scenes, voices, and other components.
previewUrl — The preview URL for the video storyboard. Use this to view the preview in a browser or embed it in an iframe for app integrations. See the Embed Preview Player guide for details.
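A completed job response might be shaped roughly like the sketch below. Only jobId, renderParams, storyboard, and previewUrl are documented above — the status field, the data wrapper, and all values are assumptions, so consult the Get Storyboard Preview Job by ID endpoint reference for the exact envelope:

```json
{
  "jobId": "example-job-id",
  "status": "completed",
  "data": {
    "renderParams": { },
    "storyboard": { },
    "previewUrl": "https://example.com/preview/example-job-id"
  }
}
```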