Abstract
We introduce PersonaHOI, a training- and tuning-free framework that fuses a
general StableDiffusion model with a personalized face diffusion (PFD) model to
generate identity-consistent human-object interaction (HOI) images. While
existing PFD models have advanced significantly, they often overemphasize
facial features at the expense of full-body coherence. To address this,
PersonaHOI introduces an additional StableDiffusion (SD) branch guided by
HOI-oriented text inputs. By
incorporating cross-attention constraints in the PFD branch and spatial merging
at both latent and residual levels, PersonaHOI preserves personalized facial
details while ensuring that non-facial regions reflect the intended
interaction. Experiments, evaluated with a novel interaction alignment metric,
demonstrate the superior realism and
scalability of PersonaHOI, establishing a new standard for practical
personalized face with HOI generation. Our code will be available at
github.com/JoyHuYY1412/PersonaHOI
BibTeX
@online{Hu_2501.05823,
  TITLE = {{PersonaHOI}: {E}ffortlessly Improving Personalized Face with Human-Object Interaction Generation},
  AUTHOR = {Hu, Xinting and Wang, Haoran and Lenssen, Jan Eric and Schiele, Bernt},
  LANGUAGE = {eng},
  URL = {https://arxiv.org/abs/2501.05823},
  EPRINT = {2501.05823},
  EPRINTTYPE = {arXiv},
  YEAR = {2025},
  MARGINALMARK = {$\bullet$},
  ABSTRACT = {We introduce PersonaHOI, a training- and tuning-free framework that fuses a
    general StableDiffusion model with a personalized face diffusion (PFD) model to
    generate identity-consistent human-object interaction (HOI) images. While
    existing PFD models have advanced significantly, they often overemphasize
    facial features at the expense of full-body coherence, PersonaHOI introduces an
    additional StableDiffusion (SD) branch guided by HOI-oriented text inputs. By
    incorporating cross-attention constraints in the PFD branch and spatial merging
    at both latent and residual levels, PersonaHOI preserves personalized facial
    details while ensuring interactive non-facial regions. Experiments, validated
    by a novel interaction alignment metric, demonstrate the superior realism and
    scalability of PersonaHOI, establishing a new standard for practical
    personalized face with HOI generation. Our code will be available at
    https://github.com/JoyHuYY1412/PersonaHOI},
}
Endnote
%0 Report
%A Hu, Xinting
%A Wang, Haoran
%A Lenssen, Jan Eric
%A Schiele, Bernt
%+ Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society
%T PersonaHOI: Effortlessly Improving Personalized Face with Human-Object Interaction Generation
%G eng
%U http://hdl.handle.net/21.11116/0000-0010-793D-3
%U https://arxiv.org/abs/2501.05823
%D 2025
%X We introduce PersonaHOI, a training- and tuning-free framework that fuses a general StableDiffusion model with a personalized face diffusion (PFD) model to generate identity-consistent human-object interaction (HOI) images. While existing PFD models have advanced significantly, they often overemphasize facial features at the expense of full-body coherence, PersonaHOI introduces an additional StableDiffusion (SD) branch guided by HOI-oriented text inputs. By incorporating cross-attention constraints in the PFD branch and spatial merging at both latent and residual levels, PersonaHOI preserves personalized facial details while ensuring interactive non-facial regions. Experiments, validated by a novel interaction alignment metric, demonstrate the superior realism and scalability of PersonaHOI, establishing a new standard for practical personalized face with HOI generation. Our code will be available at https://github.com/JoyHuYY1412/PersonaHOI
%K Computer Science, Computer Vision and Pattern Recognition, cs.CV