Authors & Institutions
Jingtao Zhou
School of Mathematical Sciences, University of Science and Technology of China
Department of Computer Science, City University of Hong Kong
Xuan Gao
School of Mathematical Sciences, University of Science and Technology of China
Dongyu Liu
School of Mathematical Sciences, University of Science and Technology of China
Junhui Hou
Department of Computer Science, City University of Hong Kong
Yudong Guo
School of Mathematical Sciences, University of Science and Technology of China
Juyong Zhang
School of Mathematical Sciences, University of Science and Technology of China
What Problem It Solves
GSwap aims to make video head swapping realistic and 3D-consistent by moving beyond purely 2D generative models and the limited geometry of 3D Morphable Models (3DMMs).
Key Result
The authors report improvements over prior head-swapping methods in visual quality, temporal coherence, identity preservation, and 3D consistency, positioning GSwap as evidence that 3D-aware swap pipelines are maturing quickly.
Abstract
We present GSwap, a novel consistent and realistic video head-swapping system empowered by dynamic neural Gaussian portrait priors, which significantly advances the state of the art in face and head replacement. Unlike previous methods that rely primarily on 2D generative models or 3D Morphable Face Models (3DMM), our approach overcomes their inherent limitations, including poor 3D consistency, unnatural facial expressions, and restricted synthesis quality. Moreover, existing techniques struggle with full head-swapping tasks due to insufficient holistic head modeling and ineffective background blending, often resulting in visible artifacts and misalignments. To address these challenges, GSwap introduces an intrinsic 3D Gaussian feature field embedded within a full-body SMPL-X surface, effectively elevating 2D portrait videos into a dynamic neural Gaussian field. This innovation ensures high-fidelity, 3D-consistent portrait rendering while preserving natural head-torso relationships and seamless motion dynamics. To facilitate training, we adapt a pretrained 2D portrait generative model to the source head domain using only a few reference images, enabling efficient domain adaptation. Furthermore, we propose a neural re-rendering strategy that harmoniously integrates the synthesized foreground with the original background, eliminating blending artifacts and enhancing realism. Extensive experiments demonstrate that GSwap surpasses existing methods in multiple aspects, including visual quality, temporal coherence, identity preservation, and 3D consistency.
Research Starting Point
Video face swapping has improved rapidly, but many systems still fail on the exact details that users notice first: 3D consistency, natural head motion, and seamless blending between the swapped head and the rest of the body. The authors are motivated by the limitations of 2D generators and 3DMM-based pipelines, which often produce artifacts once the task expands from face replacement to full head replacement. Their premise is that realistic commercial-quality swapping now depends on modeling a complete dynamic portrait rather than editing isolated facial texture.
Method
GSwap introduces a dynamic neural Gaussian portrait representation embedded in an SMPL-X body surface, allowing the method to model head, torso, and motion together instead of treating the face as an isolated 2D patch. The system adapts a pretrained portrait generator to the source identity using a few references, then performs neural re-rendering so the synthesized foreground integrates more naturally with the original background. This combination is designed to preserve identity, stabilize temporal motion, and avoid the detached or misaligned look common in earlier swapping systems.
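The core idea above, anchoring a Gaussian feature field to a deformable body surface so that head and torso move coherently, can be illustrated with a toy sketch. This is not the authors' code, and the variable names, dimensions, and the stand-in "surface" are all illustrative assumptions; it only demonstrates the binding mechanism: each Gaussian stores an offset in the local frame of a parent surface vertex, so when the surface (SMPL-X in the paper, a random point set here) deforms, the Gaussians ride along rigidly.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in "surface" (a crude proxy for an SMPL-X mesh): V vertices with
# world positions and per-vertex local tangent frames (identity to start).
V = 4
verts = rng.normal(size=(V, 3))
frames = np.stack([np.eye(3)] * V)

# G Gaussians, each bound to a parent vertex with a local-frame offset and
# a latent appearance feature vector (dimensions are arbitrary here).
G = 8
parent = rng.integers(0, V, size=G)          # which vertex each Gaussian rides on
local_offset = 0.05 * rng.normal(size=(G, 3))
features = rng.normal(size=(G, 16))

def gaussian_world_positions(verts, frames):
    """Map each Gaussian's local offset into world space via its parent vertex."""
    R = frames[parent]                       # (G, 3, 3) per-Gaussian rotations
    return verts[parent] + np.einsum("gij,gj->gi", R, local_offset)

p0 = gaussian_world_positions(verts, frames)

# Deform the surface: rotate and translate every vertex (a toy "pose change").
theta = 0.3
Rz = np.array([[np.cos(theta), -np.sin(theta), 0.0],
               [np.sin(theta),  np.cos(theta), 0.0],
               [0.0,            0.0,           1.0]])
verts2 = verts @ Rz.T + np.array([0.0, 0.0, 1.0])
frames2 = Rz[None] @ frames                  # rotate the local frames too

p1 = gaussian_world_positions(verts2, frames2)

# Because offsets live in the local frames, each Gaussian keeps its distance
# to its parent vertex under the deformation -- the surface carries the field.
d0 = np.linalg.norm(p0 - verts[parent], axis=1)
d1 = np.linalg.norm(p1 - verts2[parent], axis=1)
print(np.allclose(d0, d1))                   # -> True
```

In GSwap itself the features attached to each Gaussian are decoded into appearance by a neural renderer, and a separately adapted portrait generator plus neural re-rendering handle identity and background blending; the sketch only captures the geometric binding that gives the method its temporal and 3D consistency.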
Paper Summary
The paper is a strong signal that high-end face swapping is becoming a 3D video synthesis problem rather than a 2D image editing trick. By treating the head as part of a full dynamic portrait, GSwap improves realism in the places users care about most: motion, structure, and blending. For anyone tracking enterprise-grade face-swap technology, this is one of the clearest papers to watch from March 2026.