PhyCo: Learning Controllable Physical Priors for Generative Motion

Sriram Narayanan1,2 · Ziyu Jiang2 · Srinivasa G. Narasimhan1 · Manmohan Chandraker2,3

1 Carnegie Mellon University · 2 NEC Labs America · 3 UC San Diego

CVPR 2026

TL;DR PhyCo learns controllable physical priors — friction, restitution, deformation, and force — from simple block-sliding and ball-bouncing simulations, enabling physically grounded and continuously controllable video generation without any simulator at inference.

How PhyCo Works

A two-stage pipeline: physics-supervised ControlNet fine-tuning on simulation data, followed by VLM-guided reward optimization for physical consistency.

Figure: PhyCo two-stage pipeline — ControlNet physics fine-tuning followed by VLM reward alignment.

100K+ Simulation Videos

Photorealistic block-sliding, ball-bouncing, and collision videos rendered with Kubric & PyBullet, with systematically varied physical properties.
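The systematic sweep over physical properties can be sketched as a simple parameter grid fed to the simulator. This is an illustrative sketch only: the property ranges, names, and scenario label below are assumptions, not the paper's actual values.

```python
import itertools
import random

# Illustrative property ranges -- NOT the paper's actual values.
FRICTION = [0.1, 0.3, 0.5, 0.7, 0.9]   # lateral friction coefficient
RESTITUTION = [0.2, 0.4, 0.6, 0.8]     # bounciness of contacts
FORCE = [1.0, 5.0, 10.0]               # initial push magnitude

def sim_configs(scenario):
    """Enumerate simulation configs for one scenario (e.g. 'block_slide').

    Each config pairs one point in the property grid with a random seed,
    so the same physics can be rendered under varied scene layouts.
    """
    for mu, e, f in itertools.product(FRICTION, RESTITUTION, FORCE):
        yield {
            "scenario": scenario,
            "friction": mu,
            "restitution": e,
            "force": f,
            "seed": random.randrange(2**31),
        }

configs = list(sim_configs("block_slide"))
print(len(configs))  # 5 * 4 * 3 = 60 configs for this scenario
```

Each such config would then drive one PyBullet rollout rendered by Kubric; repeating this across scenarios and seeds is how a corpus on the order of 100K videos accumulates.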

Physics-Supervised ControlNet

A ControlNet conditioned on pixel-aligned physical property maps is trained on top of a frozen Cosmos-Predict2 video diffusion backbone.
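"Pixel-aligned physical property maps" means each pixel of the conditioning input carries the property values of the object visible there. A minimal sketch of how such a map could be rasterized from a segmentation mask, assuming a hypothetical two-channel (friction, restitution) layout; the real maps come from the simulator's ground-truth segmentation and the actual channel set may differ.

```python
def property_map(mask, props, background=0.0):
    """Rasterize per-object properties into a per-pixel conditioning map.

    mask[y][x] holds an object id (or None for background);
    props maps object id -> (friction, restitution).
    Returns an H x W x 2 nested list aligned with the video frame.
    """
    h, w = len(mask), len(mask[0])
    n_channels = 2  # assumed layout: [friction, restitution]
    out = [[[background] * n_channels for _ in range(w)] for _ in range(h)]
    for y in range(h):
        for x in range(w):
            obj = mask[y][x]
            if obj is not None:
                out[y][x] = list(props[obj])
    return out

# Toy 2x3 frame: object 0 occupies the left column.
mask = [[0, None, None],
        [0, None, None]]
props = {0: (0.5, 0.8)}  # friction, restitution for object 0
pm = property_map(mask, props)
print(pm[0][0])  # [0.5, 0.8]
```

Because the map shares the frame's spatial layout, the ControlNet can inject it alongside the frozen backbone's features, and varying a property value smoothly varies the conditioning signal, which is what makes the control continuous.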

VLM-Guided Reward

A fine-tuned Qwen2.5-VL evaluates generated videos with targeted physics questions, providing differentiable feedback to improve consistency.
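One common way to turn a VLM's answers into a scalar reward is to read the probability mass it assigns to "yes" on each physics question and average over questions. The sketch below shows that conversion from answer-token logits; the two-token softmax and the averaging scheme are assumptions for illustration, not a description of PhyCo's exact reward head.

```python
import math

def yes_probability(yes_logit, no_logit):
    """Numerically stable softmax over the two answer tokens -> P('yes')."""
    m = max(yes_logit, no_logit)
    ey = math.exp(yes_logit - m)
    en = math.exp(no_logit - m)
    return ey / (ey + en)

def physics_reward(question_logits):
    """Average 'yes' probability over targeted physics questions.

    question_logits: list of (yes_logit, no_logit) pairs, one per question
    such as 'Does the ball lose energy on each bounce?'.
    """
    return sum(yes_probability(y, n) for y, n in question_logits) / len(question_logits)

# Two questions: one answered confidently 'yes', one leaning 'no'.
r = physics_reward([(3.0, -1.0), (-0.5, 0.5)])
```

Because the reward is a smooth function of the VLM's logits, its gradient can flow back into the generator, which is what lets the reward stage improve physical consistency rather than merely score it.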

Fine-Grained Control of Physical Attributes

Compositionality of Physical Attributes

Generalization to Different Styles


BibTeX

@inproceedings{narayanan2026phyco,
  title     = {PhyCo: Learning Controllable Physical Priors for Generative Motion},
  author    = {Narayanan, Sriram and Jiang, Ziyu and Narasimhan, Srinivasa G. and Chandraker, Manmohan},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year      = {2026},
}

Acknowledgements

This work was conducted in part during Sriram Narayanan's internship at NEC Labs America, and was supported in part by NSF grants IIS-2107236 and IIS-2513219. We thank Kausik Sivakumar and Yug Ajmera for their insightful discussions.