Abstract

YouTube Video

Overall Pipeline

Main Pipeline
The pipeline for realizing multimodal 6-DoF immersive VR experiences. a) A carefully designed rig simultaneously captures multi-view video and audio. b1) presents our proposed dynamic light field reconstruction framework, while b2) shows how the sound field is constructed; details can be found in Sec. 3. c) The result is a 6-DoF audiovisual experience, which demonstrates the effectiveness of both our dataset and the proposed pipeline.

Dataset Snapshot

ImViD dataset collage
Capture strategy

Our goal is to construct a real-world immersive volumetric video dataset that jointly captures foreground and background content, thereby supporting research on spatial video reconstruction and VR/AR applications. To this end, we carefully design the entire data acquisition pipeline.

    Dynamic Light Field Reconstruction Pipeline

    Method Pipeline
    Overview of the proposed Dynamic Light Field Reconstruction method. The pipeline starts with Flow-Guided Sparse Initialization a), which leverages SfM geometry and optical flow priors to decouple static and dynamic regions, initializing static primitives globally and dynamic ones per frame to reduce redundancy. The scene is then represented using a Gaussian-Based Spatio-Temporal Representation b), encoding spatial geometry, appearance, and temporal dynamics, where motion is modeled by linear velocity and visibility by Gaussian-modulated temporal opacity. In the final stage, we jointly optimize the scene parameters and perform Joint Camera Temporal Calibration c) to refine per-camera temporal offsets for sub-frame alignment. Rendered color, depth, and optical flow maps are further supervised through Spatio-Temporal Supervision d), enforcing photometric, geometric, and motion consistency.
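
The spatio-temporal representation in b) and the per-camera offsets in c) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes each dynamic primitive stores a temporal center, a linear velocity, and a temporal extent, and that every camera's timestamp is shifted by one scalar offset. All names (`DynamicGaussian`, `position_at`, `opacity_at`, `camera_time`) and the exact form of the Gaussian falloff are hypothetical, chosen only for illustration.

```python
import math
from dataclasses import dataclass
from typing import Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class DynamicGaussian:
    # Hypothetical parameterization, not taken from the paper.
    center: Vec3         # position at the temporal center
    velocity: Vec3       # linear velocity (scene units per second)
    base_opacity: float  # peak opacity in [0, 1]
    t_center: float      # time of peak visibility (seconds)
    t_sigma: float       # temporal extent (seconds), must be > 0


def position_at(g: DynamicGaussian, t: float) -> Vec3:
    """Linear motion model: move the center along its velocity."""
    dt = t - g.t_center
    return tuple(c + v * dt for c, v in zip(g.center, g.velocity))


def opacity_at(g: DynamicGaussian, t: float) -> float:
    """Gaussian-modulated temporal opacity: fades away from t_center."""
    z = (t - g.t_center) / g.t_sigma
    return g.base_opacity * math.exp(-0.5 * z * z)


def camera_time(frame_index: int, fps: float, offset: float) -> float:
    """Timestamp of a frame after applying a per-camera temporal offset.

    The offset (seconds) is assumed to be a value refined during
    optimization, which allows sub-frame alignment between cameras.
    """
    return frame_index / fps + offset
```

For example, with a hypothetical offset of 0.01 s, frame 15 of a 30 fps camera is evaluated at `camera_time(15, 30.0, 0.01)`, and the primitive's position and opacity are queried at that shifted time rather than at the nominal frame time.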

    Per-scene Interactive Benchmark

    Drag the divider, switch baselines, and inspect where our method is stronger under motion and occlusion.

    Qualitative Reel


    Some clips were compressed to satisfy GitHub file size limits.

    6-DoF VR Experience

    Real-time playback demonstrating fully immersive exploration with a guide camera.

    BibTeX