Contact

Dr. Vladislav Golyanik

Research Group Leader
4D and Quantum Vision
Max-Planck-Institut für Informatik
D6: Visual Computing and Artificial Intelligence
 office: Campus E1 4, Room 219
Saarland Informatics Campus
66123 Saarbrücken
Germany
 email: golyanik at mpi hyphen inf dot mpg dot de
 phone: +49 681 9325-4505
 fax: +49 681 9325-7505

NEWS

Open Positions

  • PhD positions, post-doc positions and internships are available. Check how to apply.
  • If you are interested in a bachelor/master thesis, an internship, a research immersion lab (RIL) or a HiWi position on one of the topics listed below, do not hesitate to reach out to me.

Research Profile

    I am currently leading the "4D and Quantum Vision" group at the Max Planck Institute for Informatics (D6 Department). Our team focuses on 3D reconstruction and analysis of general deformable scenes, 3D reconstruction of the human body, and matching problems on point sets and graphs. We are interested in neural approaches (both supervised and unsupervised), physics-based methods, and new hardware and sensors (e.g., quantum computers and event cameras).

    Many research questions at the intersection of computer graphics, computer vision and machine learning involve challenging search problems (e.g., graph matching) or the optimisation of non-convex objectives. For such problems, we develop new algorithmic formulations that can be solved on modern adiabatic quantum annealers or universal quantum computers and investigate which advantages these approaches offer compared to existing classical methods.

    Our research interests include (but are not limited to):
    • 3D Reconstruction and Neural Rendering of Rigid and Non-Rigid Scenes
    • Quantum Algorithms for Computer Vision and Graphics
    • Event-based Approaches in Vision and Graphics

Slides/Recordings of Recent Talks

Publications

Technical Reports

    Unbiased 4D: Monocular 4D Reconstruction with a Neural Deformation Model.
    E. C. M. Johnson, M. Habermann, S. Shimada, V. Golyanik and C. Theobalt.
    ArXiv, 2022.
    [paper] [project page] [source code] [data] [bibtex]

    Description: Our method, Ub4D, handles large deformations, performs shape completion in occluded regions, and can operate on monocular RGB videos directly by using differentiable volume rendering. The technique includes three components that are new in the context of non-rigid 3D reconstruction: 1) a coordinate-based and implicit neural representation for non-rigid scenes, which in conjunction with differentiable volume rendering enables an unbiased reconstruction of dynamic scenes, 2) a proof that extends the unbiased formulation of volume rendering to dynamic scenes, and 3) a novel dynamic scene flow loss, which enables the reconstruction of larger deformations by leveraging the coarse estimates of other methods.

    MoFusion: A Framework for Denoising-Diffusion-based Motion Synthesis.
    R. Dabral, M. H. Mughal, V. Golyanik and C. Theobalt.
    ArXiv, 2022.
    [paper] [project page] [bibtex]

    Description: We introduce MoFusion, a new denoising-diffusion-based framework for high-quality conditional human motion synthesis that can synthesise long, temporally plausible, and semantically accurate motions based on a range of conditioning contexts (such as music and text). We also present ways to introduce well-known kinematic losses for motion plausibility within the motion diffusion framework through our scheduled weighting strategy. The learned latent space can be used for several interactive motion-editing applications like in-betweening, seed-conditioning, and text-based editing, thus providing crucial abilities for virtual-character animation and robotics.

    EventNeRF: Neural Radiance Fields from a Single Colour Event Camera.
    V. Rudnev, M. Elgharib, C. Theobalt, V. Golyanik.
    ArXiv, 2022.
    [paper] [project page] [bibtex]

    Description: This pre-print proposes the first approach for 3D-consistent, dense and photorealistic novel view synthesis using just a single colour event stream as input. At the core of our method is a neural radiance field trained entirely in a self-supervised manner from events while preserving the original resolution of the colour event channels. In addition, our ray sampling strategy is tailored to events and allows for data-efficient training. At test time, our method produces results in the RGB space at unprecedented quality.

2023

    QuAnt: Quantum Annealing with Learnt Couplings.
    M. Seelbach Benkner, M. Krahn, E. Tretschk, Z. Lähner, M. Moeller and V. Golyanik.
    International Conference on Learning Representations (ICLR), 2023;
    Spotlight (Top 25%).
    [project page] [paper] [bibtex]

    Abstract: This paper proposes to learn QUBO forms from data through gradient backpropagation instead of deriving them. As a result, the solution encodings can be chosen flexibly and compactly. Furthermore, our methodology is general and virtually independent of the specifics of the target problem type. We demonstrate the advantages of learnt QUBOs on the diverse problem types of graph matching, 2D point cloud alignment and 3D rotation estimation.


    State of the Art in Dense Monocular Non-Rigid 3D Reconstruction.
    E. Tretschk*, N. Kairanda*, M. B R, R. Dabral, A. Kortylewski, B. Egger, M. Habermann, P. Fua, C. Theobalt and V. Golyanik.
    * equal contribution.
    Conditionally Accepted in Eurographics 2023 (Full STARs).
    [draft] [project page] [bibtex]

    Description: This survey focuses on state-of-the-art methods for dense non-rigid 3D reconstruction of various deformable objects and composite scenes from monocular videos or sets of monocular views. It reviews the fundamentals of 3D reconstruction and deformation modeling from 2D image observations. We then start from general methods—that handle arbitrary scenes and make only a few prior assumptions—and proceed towards techniques making stronger assumptions about the observed objects and types of deformations (e.g. human faces, bodies, hands, and animals). A significant part of this STAR is also devoted to classification and a high-level comparison of the methods, as well as an overview of the datasets for training and evaluation of the discussed techniques. We conclude by discussing open challenges in the field and the social aspects associated with the usage of the reviewed methods.

    Scene-Aware 3D Multi-Human Motion Capture from a Single Camera.
    D. Luvizon, M. Habermann, V. Golyanik, A. Kortylewski and C. Theobalt.
    Conditionally Accepted in Eurographics 2023.
    [paper] [project page]

    Description: We introduce the first non-linear optimization-based approach that jointly solves for the absolute 3D position of each human, their articulated pose, their individual shapes as well as the scale of the scene. Given the per-frame 3D estimates of the humans and scene point-cloud, we perform a space-time coherent optimization over the video to ensure temporal, spatial and physical plausibility. We consistently outperform previous methods and we qualitatively demonstrate that our method is robust to in-the-wild conditions including challenging scenes with people of different sizes.

    IMoS: Intent-Driven Full-Body Motion Synthesis for Human-Object Interactions.
    A. Ghosh, R. Dabral, V. Golyanik, C. Theobalt and P. Slusallek.
    Conditionally Accepted in Eurographics 2023.
    [project page] [paper] [bibtex]

    Description: We synthesize the full-body pose sequences along with the 3D object positions from textual inputs. Our method can synthesize single-handed as well as two-handed interactions depending on the intent and the type of the object used.

2022

    HiFECap: Monocular High-Fidelity and Expressive Capture of Human Performances.
    Y. Jiang, M. Habermann, V. Golyanik and C. Theobalt.
    British Machine Vision Conference (BMVC), 2022.
    [project page] [paper] [bibtex]

    Description: We propose HiFECap, a new neural human performance capture approach, which simultaneously captures human pose, clothing, facial expression, and hands just from a single RGB video. We demonstrate that our proposed network architecture, the carefully designed training strategy, and the tight integration of parametric face and hand models to a template mesh enable the capture of all these individual aspects. Importantly, our method also captures high-frequency details, such as deforming wrinkles on the clothes, better than the previous works.

    Generation of Truly Random Numbers on a Quantum Annealer.
    H. Bhatia, E. Tretschk, C. Theobalt and V. Golyanik.
    More details coming soon.
    [project page] [paper] [bibtex]

    Description: We discuss the observed qubits' properties and their influence on the random number generation and consider various physical factors that influence the performance of our generator, i.e., digital-to-analogue quantisation errors, flux errors, temperature errors and spin bath polarisation. The numbers generated by the proposed algorithm successfully pass various tests on randomness from the NIST test suite.

    Q-FW: A Hybrid Classical-Quantum Frank-Wolfe for Quadratic Binary Optimization.
    A. Yurtsever, T. Birdal and V. Golyanik.
    European Conference on Computer Vision (ECCV), 2022.
    [project page] [paper] [bibtex] [poster]

    Description: We present a hybrid classical-quantum framework based on the Frank-Wolfe algorithm, Q-FW, for solving quadratic, linearly-constrained, binary optimization problems on quantum annealers (QA). Q-FW first reformulates constrained-QBO as a copositive program (CP), then employs Frank-Wolfe iterations to solve CP while satisfying linear (in)equality constraints. This procedure unrolls the original constrained-QBO into a set of unconstrained QUBOs, all of which are solved in sequence on a QA. We use the D-Wave Advantage QA to conduct synthetic and real experiments on two important computer vision problems, graph matching and permutation synchronization, which demonstrate that our approach is effective in alleviating the need for an explicit regularization coefficient.

    UnrealEgo: A New Dataset for Robust Egocentric 3D Human Motion Capture.
    H. Akada, J. Wang, S. Shimada, M. Takahashi, C. Theobalt and V. Golyanik.
    European Conference on Computer Vision (ECCV), 2022.
    [project page] [paper] [bibtex]

    Description: UnrealEgo is a large-scale naturalistic dataset for egocentric 3D human pose estimation. It is based on an advanced concept of eyeglasses equipped with two fisheye cameras that can be used in unconstrained environments. The experiments show that our simple yet effective approach for egocentric 3D human motion capture outperforms the previous methods.

    Quantum Motion Segmentation.
    F. Arrigoni, W. Menapace, M. Seelbach Benkner, E. Ricci and V. Golyanik.
    European Conference on Computer Vision (ECCV), 2022.
    [project page] [paper] [bibtex]

    Abstract: Motion segmentation is a challenging problem that seeks to identify independent motions in two or several input images. This paper introduces the first algorithm for motion segmentation that relies on adiabatic quantum optimization of the objective function. The proposed method achieves on-par performance with the state of the art on problem instances which can be mapped to modern quantum annealers.

    HULC: 3D HUman Motion Capture with Pose Manifold Sampling and Dense Contact Guidance.
    S. Shimada, V. Golyanik, Z. Li, P. Pérez, W. Xu and C. Theobalt.
    European Conference on Computer Vision (ECCV), 2022.
    [paper] [project page]

    Neural Radiance Fields for Outdoor Scene Relighting.
    V. Rudnev, M. Elgharib, W. Smith, L. Liu, V. Golyanik and C. Theobalt.
    European Conference on Computer Vision (ECCV), 2022.
    [paper] [project page] [bibtex]

    MoCapDeform: Monocular 3D Human Motion Capture in Deformable Scenes.
    Z. Li, S. Shimada, B. Schiele, C. Theobalt and V. Golyanik.
    International Conference on 3D Vision (3DV), 2022; Oral.
    Best Student Paper Award.
    [project page] [paper] [bibtex]

    Description: Our MoCapDeform algorithm is the first that models non-rigid scene deformations and finds the accurate global 3D poses of the subject by human-deformable scene interaction constraints, achieving increased accuracy with significantly fewer penetrations.

    φ-SfT: Shape-from-Template with a Physics-Based Deformation Model.
    N. Kairanda, E. Tretschk, M. Elgharib, C. Theobalt and V. Golyanik.
    Computer Vision and Pattern Recognition (CVPR), 2022.
    [paper] [project page] [source code] [bibtex]


    Playable Environments: Video Manipulation in Space and Time.
    W. Menapace, S. Lathuilière*, A. Siarohin, C. Theobalt*, S. Tulyakov*, V. Golyanik*, and E. Ricci*.
    * equal senior contribution.
    Computer Vision and Pattern Recognition (CVPR), 2022.
    [paper] [project page] [github] [bibtex]

    Advances in Neural Rendering.
    A. Tewari*, J. Thies*, B. Mildenhall*, P. Srinivasan*, E. Tretschk, Y. Wang, C. Lassner, V. Sitzmann, R. Martin-Brualla, S. Lombardi, C. Theobalt, M. Niessner, J. T. Barron, G. Wetzstein, M. Zollhöfer and V. Golyanik.
    * equal contribution.
    State of the Art Report at Eurographics 2022.
    [paper] [project page] [bibtex]

2021

    Convex Joint Graph Matching and Clustering via Semidefinite Relaxations.
    M. Krahn, F. Bernard and V. Golyanik.
    International Conference on 3D Vision (3DV), 2021.
    [paper] [project page] [bibtex]

    HumanGAN: A Generative Model of Human Images.
    K. Sarkar, L. Liu, V. Golyanik, and C. Theobalt.
    International Conference on 3D Vision (3DV), 2021; Oral
    [paper] [project page] [bibtex]

    HandVoxNet++: 3D Hand Shape and Pose Estimation using Voxel-Based Neural Networks.
    J. Malik, S. Shimada, A. Elhayek, S. A. Ali, C. Theobalt, V. Golyanik and D. Stricker.
    Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2021.
    [IEEE Xplore] [arXiv.org] [project page] [bibtex]


    Gravity-Aware 3D Human-Object Reconstruction.
    R. Dabral, S. Shimada, A. Jain, C. Theobalt and V. Golyanik.
    International Conference on Computer Vision (ICCV), 2021.
    [paper] [project page] [bibtex]



    Q-Match: Iterative Shape Matching via Quantum Annealing.
    M. Seelbach Benkner, Z. Lähner, V. Golyanik, C. Wunderlich, C. Theobalt and M. Moeller.
    International Conference on Computer Vision (ICCV), 2021.
    [paper] [project page] [bibtex]


    Non-Rigid Neural Radiance Fields: Reconstruction and Novel View Synthesis of a Deforming Scene from Monocular Video.
    E. Tretschk, A. Tewari, V. Golyanik, M. Zollhöfer, C. Lassner and C. Theobalt.
    International Conference on Computer Vision (ICCV), 2021.
    [paper] [project page] [source code] [bibtex]

    Neural Monocular 3D Human Motion Capture with Physical Awareness ("Neural PhysCap").
    S. Shimada, V. Golyanik, W. Xu, P. Pérez and C. Theobalt.
    SIGGRAPH, 2021.
    [paper] [arXiv] [bibtex] [project page] [source code]


    High-Fidelity Neural Human Motion Transfer from Monocular Video.
    M. Kappel, V. Golyanik, M. Elgharib, J.-O. Henningson, H.-P. Seidel, S. Castillo, C. Theobalt and M. Magnor.
    Computer Vision and Pattern Recognition (CVPR), 2021; Oral.
    [paper] [project page] [bibtex] [source code]

    Pose-Guided Human Animation from a Single Image in the Wild.
    J. S. Yoon, L. Liu, V. Golyanik, K. Sarkar, H. S. Park, and C. Theobalt.
    Computer Vision and Pattern Recognition (CVPR), 2021.
    [paper] [project page] [video] [bibtex]


    Fast Gravitational Approach for Rigid Point Set Registration with Ordinary Differential Equations.
    S. A. Ali, K. Kahraman, C. Theobalt, D. Stricker and V. Golyanik.
    IEEE Access, 2021.
    [paper] [arXiv] [project page] [bibtex]

2020

    PhysCap: Physically Plausible Monocular 3D Motion Capture in Real Time.
    S. Shimada, V. Golyanik, W. Xu and C. Theobalt.
    SIGGRAPH Asia, 2020.
    [paper (arXiv.org)] [bibtex] [project page]

    Egocentric Videoconferencing.
    M. Elgharib*, M. Mendiratta*, J. Thies, M. Nießner, H.-P. Seidel, A. Tewari, V. Golyanik and C. Theobalt.
    * equal contribution.
    SIGGRAPH Asia, 2020.
    [draft] [supplement] [bibtex] [project page]

    Fast Simultaneous Gravitational Alignment of Multiple Point Sets.
    V. Golyanik, S. Shimada and C. Theobalt.
    3DV, 2020; Oral.
    [draft] [bibtex] [project page]

    Adiabatic Quantum Graph Matching with Permutation Matrix Constraints.
    M. Seelbach Benkner, V. Golyanik, C. Theobalt and M. Moeller.
    3DV, 2020.
    [draft] [supplement] [bibtex] [project page]



    HTML: A Parametric Hand Texture Model for 3D Hand Reconstruction and Personalization.
    N. Qian, J. Wang, F. Müller, F. Bernard, V. Golyanik and C. Theobalt.
    European Conference on Computer Vision (ECCV), 2020.
    [paper] [supplement] [video] [bibtex] [project page]




    A Quantum Computational Approach to Correspondence Problems on Point Sets.
    V. Golyanik and C. Theobalt.
    Computer Vision and Pattern Recognition (CVPR), 2020.
    [paper] [slides] [poster] [bibtex] [arXiv] [project page]

    EventCap: Monocular 3D Capture of High-Speed Human Motions using an Event Camera.
    L. Xu, W. Xu, V. Golyanik, M. Habermann, L. Fang and C. Theobalt.
    Computer Vision and Pattern Recognition (CVPR), 2020; Oral
    [paper] [supplement] [bibtex] [arXiv] [project page]


    HandVoxNet: Deep Voxel-Based Network for 3D Hand Shape and Pose Estimation from a Single Depth Map.
    J. Malik, I. Abdelaziz, A. Elhayek, S. Shimada, S. A. Ali, V. Golyanik, C. Theobalt and D. Stricker.
    Computer Vision and Pattern Recognition (CVPR), 2020.
    [paper] [supplement] [bibtex] [arXiv] [project page]



2019

    Structure from Articulated Motion: Accurate and Stable Monocular 3D Reconstruction without Training Data.
    O. Kovalenko, V. Golyanik, J. Malik, A. Elhayek and D. Stricker.
    Sensors (Volume 19, Issue 20), 2019.
    [paper] [project page]


    A Shape Completion Component for Monocular Non-Rigid SLAM.
    Y. Su, V. Golyanik, N. Minaskan, S. A. Ali and D. Stricker.
    International Symposium on Mixed and Augmented Reality (ISMAR), 2019.
    [paper] [supplement (video, 15 MB)] [MSCC Dataset] [bibtex]

    DispVoxNets: Non-Rigid Point Set Alignment with Supervised Learning Proxies.
    S. Shimada, V. Golyanik, E. Tretschk, D. Stricker and C. Theobalt.
    International Conference on 3D Vision (3DV), 2019; Oral
    [paper] [poster] [presentation] [project page] [arXiv] [bibtex]

    Optimising for Scale in Globally Multiply-Linked Gravitational Point Set Registration Leads to Singularities.
    V. Golyanik and C. Theobalt.
    International Conference on 3D Vision (3DV), 2019; Spotlight
    [paper] [supplement (pdf)] [poster] [video] [bibtex]

    FACE IT!: A Pipeline For Real-Time Performance-Driven Facial Animation.
    J. M. Díaz Barros, V. Golyanik, K. Varanasi and D. Stricker.
    International Conference on Image Processing (ICIP), 2019; Oral (Lecture)
    [paper] [bibtex]

    IsMo-GAN: Adversarial Learning for Monocular Non-Rigid 3D Reconstruction.
    S. Shimada, V. Golyanik, C. Theobalt and D. Stricker.
    Computer Vision and Pattern Recognition Workshops (Photogrammetric Computer Vision Workshop), 2019; Oral
    [paper] [bibtex] [arXiv] [project page]

    Consolidating Segmentwise Non-Rigid Structure from Motion.
    V. Golyanik, A. Jonas and D. Stricker.
    Machine Vision Applications (MVA), 2019; Oral
    [paper] [project page] [bibtex]

2018

    NRGA: Gravitational Approach for Non-Rigid Point Set Registration.
    S. A. Ali, V. Golyanik and D. Stricker.
    International Conference on 3D Vision (3DV), 2018; Oral
    [paper] [Supplementary Video (Download, YouTube)] [poster] [bibtex]


    HDM-Net: Monocular Non-Rigid 3D Reconstruction with Learned Deformation Model.
    V. Golyanik, S. Shimada, K. Varanasi and D. Stricker.
    EuroVR, 2018; Oral (Long Paper)
    [paper] [HDM-Net data set] [bibtex]

    Improving Time-Of-Flight Sensor for Specular Surfaces With Shape from Polarization.
    T. Yoshida, V. Golyanik, O. Wasenmüller and D. Stricker.
    International Conference on Image Processing (ICIP), 2018.
    [paper] [bibtex]

    Classification of LIDAR Sensor Contaminations with Deep Neural Networks.
    J. K. James, G. Puhlfürst, V. Golyanik and D. Stricker.
    ACM Chapters Computer Science in Cars Symposium (CSCS), 2018; Oral
    [paper] [bibtex]



2017

    Multiframe Scene Flow with Piecewise Rigid Motion.
    V. Golyanik, K. Kim, R. Maier, M. Nießner, D. Stricker and J. Kautz.
    International Conference on 3D Vision (3DV), 2017; Spotlight Oral
    [paper] [arXiv] [supplementary material] [poster] [bibtex]

    Scalable Dense Monocular Surface Reconstruction.
    M. D. Ansari, V. Golyanik and D. Stricker.
    International Conference on 3D Vision (3DV), 2017.
    [paper] [arXiv] [bibtex]

    High-Dimensional Model for Dense Monocular Surface Recovery.
    V. Golyanik and D. Stricker.
    International Conference on 3D Vision (3DV), 2017.
    [paper] [bibtex]

    Introduction to Coherent Depth Fields for Dense Monocular Surface Recovery.
    V. Golyanik, T. Fetzer and D. Stricker.
    British Machine Vision Conference (BMVC), 2017.
    [paper] [supplementary video] [bibtex]

    Towards Scheduling Hard Real-Time Image Processing Tasks on a Single GPU.
    V. Golyanik, M. Nasri and D. Stricker.
    International Conference on Image Processing (ICIP), 2017.
    [paper] [bibtex]

    A Framework for an Accurate Point Cloud Based Registration of Full 3D Human Body Scans.
    V. Golyanik, G. Reis, B. Taetz and D. Stricker.
    Machine Vision Applications (MVA), 2017.
    [paper] [bibtex]

    Dense Batch Non-Rigid Structure from Motion in a Second.
    V. Golyanik and D. Stricker.
    Winter Conference on Applications of Computer Vision (WACV), 2017.
    [paper] [supplementary video] [poster] [bibtex]

    Accurate 3D Reconstruction of Dynamic Scenes from Monocular Image Sequences with Severe Occlusions.
    V. Golyanik, T. Fetzer and D. Stricker.
    Winter Conference on Applications of Computer Vision (WACV), 2017.
    [paper] [supplementary material] [poster] [arXiv] [bibtex]

2016


    Joint Pre-Alignment and Robust Rigid Point Set Registration.
    V. Golyanik, B. Taetz and D. Stricker.
    International Conference on Image Processing (ICIP), 2016.
    [paper] [bibtex]

    Gravitational Approach for Point Set Registration.
    V. Golyanik, S. A. Ali and D. Stricker.
    Computer Vision and Pattern Recognition (CVPR), 2016.
    [paper] [supplementary material] [bibtex]

    Extended Coherent Point Drift Algorithm with Correspondence Priors and Optimal Subsampling.
    V. Golyanik, B. Taetz, G. Reis and D. Stricker.
    Winter Conference on Applications of Computer Vision (WACV), 2016.
    [paper] [poster] [bibtex] [WACV Talk]

    Occlusion-Aware Video Registration for Highly Non-Rigid Objects.
    B. Taetz, G. Bleser, V. Golyanik and D. Stricker.
    Winter Conference on Applications of Computer Vision (WACV), 2016.
    Best Paper Award.
    [paper] [supplementary material] [bibtex] [WACV Talk]

2015

    Precise and Automatic Anthropometric Measurement Extraction using Template Registration.
    O. Wasenmüller, J. C. Peters, V. Golyanik and D. Stricker.
    International Conference on 3D Body Scanning Technologies (3DBST), 2015.
    [paper] [bibtex]