This paper presents UniPortrait, an innovative human image personalization framework that unifies single- and multi-ID customization with high face fidelity, extensive facial editability, free-form input descriptions, and diverse layout generation. UniPortrait consists of only two plug-and-play modules: an ID embedding module and an ID routing module. The ID embedding module extracts versatile, editable facial features with a decoupling strategy for each ID and embeds them into the context space of the diffusion model. The ID routing module then adaptively combines and distributes these embeddings to their respective regions within the synthesized image, achieving the customization of single and multiple IDs. With a carefully designed two-stage training scheme, UniPortrait achieves superior performance in both single- and multi-ID customization. Quantitative and qualitative experiments demonstrate the advantages of our method over existing approaches as well as its good scalability, e.g., its universal compatibility with existing generative control tools.
Our proposed UniPortrait consists of two plug-and-play modules: an ID embedding module and an ID routing module. The ID embedding module extracts versatile, editable facial features with a decoupling strategy for each ID, and the ID routing module adaptively combines and distributes these embeddings to their respective locations without requiring any intervention on prompts or layouts. The training of the framework is divided into two stages, i.e., a single-ID training stage and a multi-ID fine-tuning stage.
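To make the two-module design concrete, below is a minimal NumPy sketch of the data flow: per-ID face features are projected into the diffusion context space (standing in for the ID embedding module), and each spatial location of the latent feature map is assigned to its best-matching ID (standing in for the ID routing module). All shapes, variable names, and the hard-argmax routing rule here are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8         # context embedding dimension (illustrative)
num_ids = 2   # number of reference identities
hw = 16       # flattened spatial locations of the latent feature map

# --- ID embedding module (sketch): project per-ID face features into the
# --- diffusion model's context space with a learned matrix W_proj.
face_feats = rng.standard_normal((num_ids, 32))  # hypothetical face-encoder outputs
W_proj = rng.standard_normal((32, d)) / np.sqrt(32)
id_embeds = face_feats @ W_proj                  # (num_ids, d) context embeddings

# --- ID routing module (sketch): each spatial location is routed to the single
# --- best-matching ID, so different identities land in different image regions.
queries = rng.standard_normal((hw, d))           # per-location query features
logits = queries @ id_embeds.T                   # (hw, num_ids) location-ID affinities
route = np.argmax(logits, axis=-1)               # hard assignment per location
routed = id_embeds[route]                        # (hw, d) embedding delivered per location
```

The key property this sketch illustrates is exclusivity: every spatial location receives exactly one ID's embedding, which is what prevents identity blending in multi-ID generation.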
Scroll for more examples.
Qualitative comparison of different methods on single-ID image customization.
Qualitative comparison of different methods on multi-ID image customization. For compatibility with FastComposer, numerical plural expressions (e.g., “two men”) are converted into singular phrases linked by “and” (e.g., “a man and a man”).
Additional examples of multi-ID customization. UniPortrait is capable of customizing multi-ID images using free-form prompts and generating diverse layouts.
The superior performance of UniPortrait in aligning IDs, maintaining prompt consistency, as well as enhancing the diversity and quality of generated images, paves the way for a plethora of potential downstream applications.
Most of the face images used in our experiments come from the Pexels, Unsplash, Pixabay, and Wikipedia websites. We thank the owners of these images for sharing their valuable assets. We also thank the StyleGAN2 authors for sharing their high-quality synthesized face images, which constitute another important part of the ID images we used.
@article{he2024uniportrait,
title={UniPortrait: A Unified Framework for Identity-Preserving Single- and Multi-Human Image Personalization},
author={He, Junjie and Geng, Yifeng and Bo, Liefeng},
journal={arXiv preprint arXiv:2408.05939},
year={2024}
}