SANA-WM: Camera-Controlled World Model
Image-to-video generation with 6-DoF camera control using Efficient-Large-Model/SANA-WM_bidirectional and the NVlabs/Sana Stage-1 pipeline.
Camera action queue
The output is the Sana VAE decode of the Stage-1 latents (no refiner). For peak quality, run the full pipeline offline with the refiner enabled, that is, without --no_refiner.
Example image + prompt + camera action queue (lazy-cached)