What You Will Build
Text to Video
Generate videos from descriptive text prompts
Model Selection
Choose from multiple AI video models
Duration Control
Set the video length based on your needs
Aspect Ratio
Generate landscape, portrait, or square videos
Before You Begin
Make sure you have:
- A Pictory API key (get one here)
- Node.js or Python installed on your machine
- The required packages installed
Step-by-Step Guide
Step 1: Set Up Your Request
Prepare your API credentials and define the video prompt along with the desired model, aspect ratio, and duration.
Step 2: Submit the Video Generation Request
Send the request to the AI Studio video generation endpoint. On success, the API returns a jobId that you will use to poll for the result.
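Setting up and submitting the request might look like the minimal Python sketch below. The endpoint URL is a placeholder, and the Authorization header format and the top-level jobId field in the response are assumptions; confirm all three against the Generate Video API reference before relying on this.

```python
import json
import urllib.request

# Placeholder only: substitute the real AI Studio video generation endpoint.
GENERATE_URL = "https://api.example.com/ai-studio/videos"  # hypothetical URL


def build_request(api_key, prompt, model="pixverse5.5",
                  aspect_ratio="16:9", duration="5s"):
    """Step 1: bundle credentials and generation settings into one request."""
    payload = {
        "prompt": prompt,
        "model": model,
        "aspectRatio": aspect_ratio,
        "duration": duration,
    }
    return urllib.request.Request(
        GENERATE_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": api_key,  # assumed header format
            "Content-Type": "application/json",
        },
        method="POST",
    )


def submit(request):
    """Step 2: send the request and return the job ID from the response."""
    with urllib.request.urlopen(request, timeout=30) as response:
        body = json.load(response)
    return body["jobId"]  # assumed to sit at the top level of the JSON body
```

Calling submit(build_request(api_key, "A bird taking flight from a tree branch")) would then return the ID to poll with in Step 3.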
Step 3: Poll for the Result
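A polling loop for this step could be sketched as follows. The fetch_status callable is a hypothetical helper you supply to query the job endpoint, and the status strings "completed" and "failed" are assumptions rather than confirmed API values. The 15-second default interval is chosen from within the recommended 10 to 30 second range.

```python
import time


def poll_until_done(fetch_status, job_id, interval_seconds=15,
                    timeout_seconds=900, sleep=time.sleep):
    """Call fetch_status(job_id) until the job finishes or the timeout passes.

    fetch_status is a caller-supplied function returning the job as a dict.
    The "status" key and its values below are assumed, not documented here.
    """
    waited = 0
    while True:
        job = fetch_status(job_id)
        if job.get("status") in ("completed", "failed"):
            return job
        if waited >= timeout_seconds:
            raise TimeoutError(f"job {job_id} not finished after {waited}s")
        sleep(interval_seconds)
        waited += interval_seconds
```

Passing sleep as a parameter keeps the loop easy to test without real delays. The generous default timeout reflects that video jobs usually run longer than image jobs.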
Check the job status at regular intervals until the video is ready. The recommended polling interval is 10 to 30 seconds. Video generation typically takes longer than image generation.
Understanding the Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| prompt | string | Yes | None | A descriptive text of the video to generate. Must be between 5 and 5,000 characters. |
| model | string | No | pixverse5.5 | The AI model to use for generation. Supported values: veo3.1, veo3.1_fast, pixverse5.5. See Generate Video API for model capabilities and pricing. |
| aspectRatio | string | No | First supported ratio of the selected model | The output aspect ratio. Valid values depend on the model. For example, pixverse5.5 supports 16:9, 9:16, 1:1, 3:4, 4:3, while veo3.1 supports 16:9, 9:16. |
| duration | string | No | First supported duration of the selected model | The video length. Valid values depend on the model. For example, pixverse5.5 supports 5s, 8s, 10s, while veo3.1 supports 4s, 6s, 8s. |
| webhook | string | No | None | A URL to receive a POST notification when the job completes. Must be a valid URI. |
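Because valid aspect ratios and durations depend on the model, checking them on the client before sending can catch mistakes early. The sketch below encodes only the values listed in the table. The build_payload helper and KNOWN_OPTIONS mapping are illustrative names, not part of the API. Values for veo3.1_fast are not listed here, so that model is passed through unchecked.

```python
# Options taken from the parameter table; models absent here are not validated.
KNOWN_OPTIONS = {
    "pixverse5.5": {
        "aspectRatio": ["16:9", "9:16", "1:1", "3:4", "4:3"],
        "duration": ["5s", "8s", "10s"],
    },
    "veo3.1": {
        "aspectRatio": ["16:9", "9:16"],
        "duration": ["4s", "6s", "8s"],
    },
}


def build_payload(prompt, model="pixverse5.5", aspect_ratio=None, duration=None):
    """Validate the inputs and return a request payload as a dict.

    Unset optional fields are omitted so the API applies its own defaults.
    """
    if not 5 <= len(prompt) <= 5000:
        raise ValueError("prompt must be between 5 and 5,000 characters")
    payload = {"prompt": prompt, "model": model}
    known = KNOWN_OPTIONS.get(model, {})
    for field, value in (("aspectRatio", aspect_ratio), ("duration", duration)):
        if value is None:
            continue
        allowed = known.get(field)
        if allowed is not None and value not in allowed:
            raise ValueError(f"{model} does not support {field}={value!r}")
        payload[field] = value
    return payload
```

For example, build_payload("A bird taking flight", model="veo3.1", aspect_ratio="1:1") raises a ValueError, since the table lists only 16:9 and 9:16 for veo3.1.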
Tips for Effective Video Prompts
- Describe motion and action. Unlike image prompts, video prompts benefit from describing movement. For example, “A bird taking flight from a tree branch” is more effective than “A bird on a tree.”
- Specify the camera angle. Terms such as “wide shot”, “close-up”, “tracking shot”, or “aerial view” help the model determine the framing and camera movement.
- Keep the scene focused. A single clear action with one or two subjects produces better results than a prompt describing multiple simultaneous events.
- Consider the duration. Shorter durations (4 to 5 seconds) work well for simple motions, while longer durations (8 to 10 seconds) allow for more complex scenes.
Next Steps
- Generate Video from First Frame to start a video from a specific image
- Generate Video from Reference Images to guide video generation using reference visuals
- Extend Video with AI to continue an existing video with new content
- Generate Video API Reference for the complete parameter documentation
