Daily Paper Cast podcast

BlenderFusion: 3D-Grounded Visual Editing and Generative Compositing


🤗 Upvotes: 46 | cs.GR, cs.CV

Authors:
Jiacheng Chen, Ramin Mehran, Xuhui Jia, Saining Xie, Sanghyun Woo

Title:
BlenderFusion: 3D-Grounded Visual Editing and Generative Compositing

arXiv:
http://arxiv.org/abs/2506.17450v2

Abstract:
We present BlenderFusion, a generative visual compositing framework that synthesizes new scenes by recomposing objects, camera, and background. It follows a layering-editing-compositing pipeline: (i) segmenting and converting visual inputs into editable 3D entities (layering), (ii) editing them in Blender with 3D-grounded control (editing), and (iii) fusing them into a coherent scene using a generative compositor (compositing). Our generative compositor extends a pre-trained diffusion model to process both the original (source) and edited (target) scenes in parallel. It is fine-tuned on video frames with two key training strategies: (i) source masking, enabling flexible modifications like background replacement; (ii) simulated object jittering, facilitating disentangled control over objects and camera. BlenderFusion significantly outperforms prior methods in complex compositional scene editing tasks.
