Haoran Wang (PhD Student)

MSc Haoran Wang

Address: Max-Planck-Institut für Informatik
Saarland Informatics Campus
Campus E1 4
66123 Saarbrücken
Standort: E1 4 - 116
Telefon: +49 681 9325 2148
Fax: +49 681 9325 2099
E-mail: Get email via email

Personal Information

Github | Homepage | CV | Google Scholar | LinkedIn

About Me

I am currently a PhD student in Computer Vision at Max Planck Institute for Informatics. I am interested in the research of Computer Vision and Computer Graphics, especially Domain Adaptation and Domain Generalization.

Experience

Dec. 2020 - Dec. 2021: Computer Vision Researcher at Sensetime, China
Feb. 2020 - Sep. 2020: Master thesis at Computer Vision and Learning Group, ETH Zürich, Switzerland
Mar. 2019 - Dec. 2019: Research intern at JD AI Research, China

Education

Dec. 2021 - Present: PhD student in Computer Vision, Max Planck Institute for Informatics, Germany
Sep. 2017 - Nov. 2020: MSc in Computer Science, ETH Zürich, Switzerland
Sep. 2013 - Jun. 2017: BSc in Computer Science, Sichuan University, China

Publications

2025

1
Conference paper
D2
X. Hu, H. Wang, J. E. Lenssen, and B. Schiele
“PersonaHOI: Effortlessly Improving Personalized Face with Human-Object Interaction Generation,” in IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2025), Nashville, TN, USA.
mehr
Abstract
We introduce PersonaHOI, a training- and tuning-free framework that fuses a
general StableDiffusion model with a personalized face diffusion (PFD) model to
generate identity-consistent human-object interaction (HOI) images. While
existing PFD models have advanced significantly, they often overemphasize
facial features at the expense of full-body coherence, PersonaHOI introduces an
additional StableDiffusion (SD) branch guided by HOI-oriented text inputs. By
incorporating cross-attention constraints in the PFD branch and spatial merging
at both latent and residual levels, PersonaHOI preserves personalized facial
details while ensuring interactive non-facial regions. Experiments, validated
by a novel interaction alignment metric, demonstrate the superior realism and
scalability of PersonaHOI, establishing a new standard for practical
personalized face with HOI generation. Our code will be available at
github.com/JoyHuYY1412/PersonaHOI

2024

2
Paper
D2D6
H. Wang, M. Mendiratta, C. Theobalt, and A. Kortylewski
“FaceGPT: Self-supervised Learning to Chat about 3D Human Faces,” 2024. [Online]. Available: https://arxiv.org/abs/2406.07163.
mehr
Abstract
We introduce FaceGPT, a self-supervised learning framework for Large
Vision-Language Models (VLMs) to reason about 3D human faces from images and
text. Typical 3D face reconstruction methods are specialized algorithms that
lack semantic reasoning capabilities. FaceGPT overcomes this limitation by
embedding the parameters of a 3D morphable face model (3DMM) into the token
space of a VLM, enabling the generation of 3D faces from both textual and
visual inputs. FaceGPT is trained in a self-supervised manner as a model-based
autoencoder from in-the-wild images. In particular, the hidden state of LLM is
projected into 3DMM parameters and subsequently rendered as 2D face image to
guide the self-supervised learning process via image-based reconstruction.
Without relying on expensive 3D annotations of human faces, FaceGPT obtains a
detailed understanding about 3D human faces, while preserving the capacity to
understand general user instructions. Our experiments demonstrate that FaceGPT
not only achieves high-quality 3D face reconstructions but also retains the
ability for general-purpose visual instruction following. Furthermore, FaceGPT
learns fully self-supervised to generate 3D faces based on complex textual
inputs, which opens a new direction in human face analysis.

MSc Haoran Wang

About Me

Experience

Education

Abstract

Abstract