Workflow
3D Animated News Broadcasts
Build a multi-shot animated short film from scratch using Sequencer's workflow editor. Generate 3D characters, write a script, animate 4 shots with speech, unify the voice, and layer B-roll.
What You'll Build
AI-generated character and environment assets
A 4-shot animated sequence with consistent characters
Unified voice across all shots using voice cloning
B-roll compositing for a polished final film
The Concept
A Pixar-style rabbit news anchor delivers breaking coverage of a pig who just robbed a bank and is now fleeing downtown on a Galápagos tortoise. Four shots. One anchor. Let's build it.
Step 1: Create Assets
Before anything else, generate reference images for every character and location. These references are what keep your visuals consistent across all 4 shots.
Create a Text node with a description of each asset, connect it to a Generate Text node (using a structured JSON schema for clean prompt output), then pipe the result into a Generate Image node set to Nano Banana Pro. This film uses four assets:
Rabbit News Anchor
Charming 3D CG rabbit in a navy suit behind a news desk. Octane render, subsurface scattering on fur, realistic eye caustics. Studio lighting with softboxes and HDRI reflections.
News Studio
Photorealistic 3D animated news studio interior with curved desk, city skyline backdrop, and broadcast monitors. Front-facing camera view, high-end CG.
Pig on Tortoise (Bank Robbery)
3D Pixar-style pig wearing a mask, riding a Galápagos tortoise and holding money bags. Presented as a viral news frame with lower-third graphics.
Pig on Tortoise (Street Chase)
Side angle of the pig on the tortoise fleeing through a downtown street with police cars behind. Same 3D cartoon style for visual consistency.
The asset group uses Text → Generate Text → Generate Image chains. Each text description feeds into an LLM that outputs a structured image prompt, and the image model turns that prompt into the reference image.
The rabbit and studio references are wired into every shot's Generate Image node so the anchor and set look the same across the entire film. The pig and tortoise references are reused for the B-roll in Step 5.
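The structured JSON schema is not spelled out in this tutorial. As a rough sketch, assuming hypothetical field names (`subject`, `style`, `lighting`, `camera`) that do not come from Sequencer itself, the Generate Text step could emit an object like this, which is then flattened into a single image prompt:

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class ImagePrompt:
    """Hypothetical structured output for one asset; field names are assumptions."""
    subject: str
    style: str
    lighting: str
    camera: str

    def to_prompt(self) -> str:
        # Flatten the fields into one comma-separated prompt, skipping empty ones.
        parts = [self.subject, self.style, self.lighting, self.camera]
        return ", ".join(p.strip() for p in parts if p.strip())


anchor = ImagePrompt(
    subject="charming 3D CG rabbit in a navy suit behind a news desk",
    style="Octane render, subsurface scattering on fur, realistic eye caustics",
    lighting="studio lighting with softboxes and HDRI reflections",
    camera="",  # left empty here; the shot-level prompt sets the framing
)

print(json.dumps(asdict(anchor), indent=2))  # the structured form
print(anchor.to_prompt())                    # the flattened prompt string
```

Keeping the fields separate makes it easier to hold style and lighting constant across assets while only the subject changes.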
Step 2: Write the Script
Write your full script and break it into one Text node per shot. Also create a separate Character Description Text node describing the rabbit's appearance and voice. In Step 3, Join Text nodes combine the character description with each shot's dialogue into the final video prompt.
SHOT 1
Opening Report
"We're following breaking news out of the financial district. Authorities are actively pursuing a suspect who just robbed the First Century Bank."
SHOT 2
Suspect Identity
"Police dispatch confirms the suspect, identified by witnesses as an adult pig, breached the vault and is currently fleeing southbound on Fifth Avenue."
SHOT 3
The Chase
"Aerial units report the suspect is mounted on a Galápagos tortoise. Law enforcement has secured a multi-block perimeter."
SHOT 4
Warning
"Though the pursuit is moving at less than one mile per hour, police warn the suspect is armed and highly volatile. Stay clear of the downtown corridor. We will bring you live chopper footage momentarily."
The script group: each Text node holds one shot's dialogue
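If it helps to see the Join Text step as plain string handling, here is a minimal sketch. The `join_text` and `video_prompt` helpers, the separator, the "The rabbit says:" wording, and the character description text are all assumptions made for illustration; only the dialogue lines are taken from the script above.

```python
CHARACTER_DESCRIPTION = (
    "A 3D CG rabbit news anchor in a navy suit, seated behind a news desk. "
    "Calm, authoritative broadcast voice."  # placeholder wording, write your own
)

SHOT_DIALOGUE = {
    1: ("We're following breaking news out of the financial district. "
        "Authorities are actively pursuing a suspect who just robbed the "
        "First Century Bank."),
    2: ("Police dispatch confirms the suspect, identified by witnesses as an "
        "adult pig, breached the vault and is currently fleeing southbound "
        "on Fifth Avenue."),
    3: ("Aerial units report the suspect is mounted on a Galápagos tortoise. "
        "Law enforcement has secured a multi-block perimeter."),
    4: ("Though the pursuit is moving at less than one mile per hour, police "
        "warn the suspect is armed and highly volatile. Stay clear of the "
        "downtown corridor. We will bring you live chopper footage momentarily."),
}


def join_text(*parts: str, separator: str = "\n\n") -> str:
    """Concatenate non-empty text parts, mimicking a Join Text node."""
    return separator.join(p.strip() for p in parts if p.strip())


def video_prompt(shot: int) -> str:
    """Character description plus the shot's dialogue wrapped in quotes."""
    return join_text(CHARACTER_DESCRIPTION, f'The rabbit says: "{SHOT_DIALOGUE[shot]}"')


print(video_prompt(1))
```

Because the character description lives in one place, editing it once updates the prompt for all four shots.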
Step 3: Generate Each Shot
Each shot follows the same two-node pattern: a Generate Image node creates the start frame, then a Generate Video node animates it with speech.
For the start image, use Nano Banana Pro, which is excellent at combining characters into scenes consistently when given reference images. Wire in the rabbit and studio references, and write an image prompt describing the exact framing for that shot (e.g., "medium shot, rabbit at desk, looking concerned").
For the video, use Veo 3.1, which generates high-quality audio and speech directly in the video. Connect the start image, add a video prompt describing the character's motion and acting, and include the shot's dialogue in quotes so the model generates lip-synced speech. The Join Text node combines the character description with the shot dialogue into one complete prompt.
Shot groups 1–4: each follows the same Generate Image → Generate Video pattern
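The two-node pattern can be pictured as data. The sketch below is only an illustration: the `Node` class and the input labels are hypothetical stand-ins, not Sequencer's real data model or export format. It shows that every shot's image node receives the same two references, and that each video node takes its start frame from the image node before it.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for one workflow node."""
    name: str
    kind: str   # "Generate Image" or "Generate Video"
    model: str
    inputs: list[str] = field(default_factory=list)


REFERENCES = ["Rabbit News Anchor", "News Studio"]  # asset names from Step 1


def build_shot(number: int) -> list[Node]:
    """Return the image node and video node for one shot, wired in order."""
    image = Node(
        name=f"Shot {number} start frame",
        kind="Generate Image",
        model="Nano Banana Pro",
        inputs=REFERENCES + [f"Shot {number} image prompt"],
    )
    video = Node(
        name=f"Shot {number} video",
        kind="Generate Video",
        model="Veo 3.1",
        inputs=[image.name, f"Shot {number} video prompt"],
    )
    return [image, video]


shots = [build_shot(n) for n in range(1, 5)]

for image, video in shots:
    print(f"{image.name} <- {image.inputs}")
    print(f"{video.name} <- {video.inputs}")
```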
Step 4: Unify the Voice
Each video generation produces a slightly different voice. To make the film sound like one continuous performance, clone a single voice across all shots.
Connect all 4 videos to a Stitch Videos node, then pipe the result into a Separate Audio node to extract the vocals. Feed the vocals into a Generate Audio node using ElevenLabs Voice Changer. Pick the best-sounding shot as the voice reference, and the voice changer converts the vocals from every shot to match it.
Generate background music with a separate audio node, then use a Mix Audio node to blend the cloned voice (full volume) with the music (lower volume). Finally, a Replace Audio node swaps the mixed result back onto the stitched video.
Voice unification: stitch, extract vocals, clone voice, mix with music, replace audio
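How the Mix Audio node works internally is not documented here, but the idea of "voice at full volume, music lower" comes down to a weighted sum of samples. A minimal sketch, assuming mono tracks as lists of floats in the range -1 to 1 and an arbitrary music gain of 0.25 (tune this by ear):

```python
def mix_audio(voice, music, voice_gain=1.0, music_gain=0.25):
    """Sum two mono tracks sample by sample.

    The shorter track is padded with silence, and each mixed sample is
    clipped to [-1.0, 1.0] so loud passages do not overflow.
    """
    length = max(len(voice), len(music))
    mixed = []
    for i in range(length):
        v = voice[i] if i < len(voice) else 0.0
        m = music[i] if i < len(music) else 0.0
        sample = voice_gain * v + music_gain * m
        mixed.append(max(-1.0, min(1.0, sample)))
    return mixed


voice = [0.5, -0.5, 1.0]
music = [1.0, 0.5, 1.0, 0.5]
print(mix_audio(voice, music))  # [0.75, -0.375, 1.0, 0.125]
```

The third sample would be 1.25 before clipping, which is why a lower music gain (or a limiter) matters when the voice is already near full scale.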
Step 5: Add B-Roll & Finish
Generate additional B-roll clips of the pig riding the tortoise. Layer them on top of the anchor shots during the chase portion, the way a real news program cuts away to footage.
Use the same pig/tortoise reference images from Step 1 to generate B-roll video clips with Veo 3.1. These are purely visual cutaways, so no dialogue is needed. Composite them over the stitched film and you've got a complete animated news broadcast.
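To decide where a cutaway should sit, it can help to work out each shot's time window in the stitched film. The sketch below assumes four 8-second clips and a 1-second lead-in before the cutaway starts; both numbers are placeholders, so measure your own clips.

```python
from itertools import accumulate

SHOT_DURATIONS = [8.0, 8.0, 8.0, 8.0]  # seconds; placeholder values


def shot_windows(durations):
    """Return (start, end) times in seconds for each shot in the stitched film."""
    ends = list(accumulate(durations))
    starts = [0.0] + ends[:-1]
    return list(zip(starts, ends))


def broll_window(windows, shot_number, lead_in=1.0):
    """Place a cutaway inside one shot, starting `lead_in` seconds after
    the shot begins so the anchor is seen before the cut."""
    start, end = windows[shot_number - 1]
    return (min(start + lead_in, end), end)


windows = shot_windows(SHOT_DURATIONS)
print(windows[2])                             # (16.0, 24.0): Shot 3, "The Chase"
print(broll_window(windows, shot_number=3))   # (17.0, 24.0)
```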
Workflow Summary
Asset Generation
Characters + locations (Nano Banana Pro)
Script
4 shots, split into Text nodes
Video Generation
Start image → Veo 3.1 with dialogue
Voice Unification
ElevenLabs Voice Changer
Music
Generated + mixed at lower volume
B-Roll
Additional clips composited on top
The complete workflow: assets, script, 4 shots, voice unification, and B-roll compositing
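The summary can also be read as a dependency graph. As a rough sketch, the stages below are ordered with Python's standard `graphlib` module (Python 3.9+). The "B-Roll Clips" and "Final Composite" labels and the exact dependency edges are an interpretation of the steps in this tutorial, not something exported from Sequencer.

```python
from graphlib import TopologicalSorter

# Each key depends on the stages listed in its set.
DEPENDS_ON = {
    "Asset Generation": set(),
    "Script": set(),
    "Video Generation": {"Asset Generation", "Script"},
    "Stitch Videos": {"Video Generation"},
    "Voice Unification": {"Stitch Videos"},
    "Music": set(),
    "Mix Audio": {"Voice Unification", "Music"},
    "Replace Audio": {"Mix Audio", "Stitch Videos"},
    "B-Roll Clips": {"Asset Generation"},
    "Final Composite": {"Replace Audio", "B-Roll Clips"},
}

# One valid build order; independent stages may appear in any relative order.
order = list(TopologicalSorter(DEPENDS_ON).static_order())

for step, stage in enumerate(order, start=1):
    print(f"{step}. {stage}")
```

Stages with no dependency on each other, such as Music and B-Roll Clips, can be generated in parallel while the shots are rendering.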
The finished result: a complete 3D animated news broadcast
Try It on Sequencer
Clone this public workflow, swap in your own characters and script, and generate your own short film.
© 2026 Sequencer. All rights reserved.