OmniDirector: General Multi-Shot Camera Cloning without Cross-Paired Data
Source Evidence
What Changed
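The paper introduces a general camera motion representation that encodes cameras as grid motion videos, as an alternative to parametric camera representations and synthesized cross-paired data. Building on this representation, the authors propose OmniDirector, a unified framework trained on million-scale camera grid-video pairs, along with a hierarchical prompt expansion agent that integrates the different control signals.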
Why It Matters
OmniDirector targets multi-shot camera control in diffusion-based video generation, a setting that, according to the authors, parametric camera representations fail to handle. If the reported results hold up, the camera grid representation could support "director-level" editing tools that coordinate characters, actions, and cameras in one framework. The authors report training on million-scale camera grid-video pairs and claim superior performance and controllability in their experiments.
Confirmed Facts
Cloning camera motion from reference videos is an important task in video generation, as videos provide intuitive and precise control. Existing methods either directly use parametric representations, which fail to handle multi-shot generation, or synthesize cross-paired data, which suffers from data scarcity and results in poor performance when cloning complicated camera motion. To address these issues, we introduce a general camera motion representation that encodes cameras as grid motion videos. This camera grid represents the camera parameters visually and supports the integration of diverse trajectories for multi-shot video generation. Building upon this, we propose OmniDirector, a unified framework trained on million-scale camera grid-video pairs that coordinates characters, actions, and cameras to provide director-level control for multimodal diffusion transformers. Furthermore, we design a novel hierarchical prompt expansion agent that harmoniously integrates different control signals by systematically describing camera motion and visual content through understanding signal relationships. Extensive experiments demonstrate the superior performance and outstanding controllability of our framework. Project page: https://ymlinfeng.github.io/OmniDirector.github.io/
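To make the idea of "encoding cameras as grid motion videos" more concrete, here is a minimal sketch and not the authors' implementation. It assumes a simple pinhole camera, a flat grid of 3D points, and camera poses reduced to a position plus a yaw angle; the function names (`make_grid`, `project`, `render_shot`, `render_multishot`) and all parameters are hypothetical. Each "frame" is just the list of projected 2D grid points, and a multi-shot sequence is the concatenation of per-shot frames.

```python
import math

def make_grid(n=5, half_extent=1.0, depth=4.0):
    """Return n*n world-space points on the plane z = depth (hypothetical layout)."""
    step = 2 * half_extent / (n - 1)
    return [(-half_extent + i * step, -half_extent + j * step, depth)
            for j in range(n) for i in range(n)]

def project(point, cam_pos, yaw, focal=1.0):
    """Pinhole projection for a camera at cam_pos, rotated by yaw (radians) about the y axis."""
    dx, dy, dz = (p - q for p, q in zip(point, cam_pos))
    cos_y, sin_y = math.cos(yaw), math.sin(yaw)
    # World-to-camera rotation is the transpose of the camera's yaw rotation.
    xc = cos_y * dx - sin_y * dz
    zc = sin_y * dx + cos_y * dz
    if zc <= 1e-6:
        return None  # point is behind the camera
    return (focal * xc / zc, focal * dy / zc)

def render_shot(grid, start, end, n_frames):
    """Linearly interpolate (position, yaw) from start to end and project the grid per frame."""
    frames = []
    for t in range(n_frames):
        a = t / (n_frames - 1) if n_frames > 1 else 0.0
        pos = tuple(s + a * (e - s) for s, e in zip(start[0], end[0]))
        yaw = start[1] + a * (end[1] - start[1])
        frames.append([project(p, pos, yaw) for p in grid])
    return frames

def render_multishot(grid, shots, n_frames):
    """Concatenate per-shot grid frames; a cut shows up as a jump in the grid."""
    video = []
    for start, end in shots:
        video.extend(render_shot(grid, start, end, n_frames))
    return video

grid = make_grid()
shots = [
    (((0.0, 0.0, 0.0), 0.0), ((0.0, 0.0, 2.0), 0.0)),               # shot 1: dolly in
    (((0.0, 0.0, 0.0), 0.0), ((0.0, 0.0, 0.0), math.radians(20))),  # shot 2: pan right
]
video = render_multishot(grid, shots, n_frames=8)
```

In this toy setup a dolly-in makes the projected grid expand, while a pan to the right shifts the grid to the left of the image, so the grid motion alone carries the camera trajectory. How the paper actually renders, colors, or conditions on its camera grid is not specified in the abstract.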
Who Is Affected
- AI product teams
What To Watch Next
- Watch for independent replications, benchmark scrutiny, and whether labs turn this work into shipped systems.
- Watch whether additional sources confirm the same claim.