Research Radar | Face Detection | arXiv | May 2026

Monthly arXiv Radar

May 2026 Face Detection Papers: Presentation Attacks, Synthetic Face Gates, and One-Class Authenticity

Explicit face detector papers were sparse in May 2026, so this issue widens the lens to the front-end decisions that enterprise face systems increasingly bundle with detection: presentation attack checks, synthetic-face gates, and authenticity scoring before a face image is trusted downstream.

What This Month Signals

The front end of face systems is becoming more about trust than localization alone: real-time PAD, uncertainty-aware synthetic detection, and one-class authenticity are converging into a single acceptance layer.

Paper 01 | 2026-05-13 | cs.CV

Flow Augmentation and Knowledge Distillation for Lightweight Face Presentation Attack Detection

Authors & Institutions

Muhammad Shahid Jabbar

SDAIA-KFUPM Joint Research Center for Artificial Intelligence, King Fahd University of Petroleum & Minerals, Dhahran, Saudi Arabia

Muhammad Sohail Ibrahim

Interdisciplinary Research Center for Intelligent Secure Systems (IRC-ISS), King Fahd University of Petroleum & Minerals, Dhahran, Saudi Arabia

Taha Hasan Masood Siddique

College of Information Science & Electronic Engineering, Zhejiang University, Hangzhou, China

Kejie Huang

College of Information Science & Electronic Engineering, Zhejiang University, Hangzhou, China

Shujaat Khan

SDAIA-KFUPM Joint Research Center for Artificial Intelligence, King Fahd University of Petroleum & Minerals, Dhahran, Saudi Arabia

Department of Computer Engineering, College of Computing and Mathematics, King Fahd University of Petroleum & Minerals, Dhahran, Saudi Arabia

What Problem It Solves

The paper targets the inference-time cost of optical-flow-based FacePAD methods: explicit flow estimation adds enough computational overhead to limit real-time deployment.

Key Result

The distilled model reaches 0.0% HTER on Replay-Attack and Replay-Mobile, 0.94% HTER on ROSE-Youtu, 5.65% HTER on SiW-Mv2, 0.42% ACER on OULU-NPU, and 52 FPS on an NVIDIA Jetson Orin Nano.
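For readers less familiar with the metrics quoted above, here is a minimal sketch of how they are conventionally computed. HTER is the mean of the false acceptance rate and false rejection rate at a fixed threshold; ACER, as commonly used on OULU-NPU, is the mean of BPCER and the worst-case APCER across attack types. The function names and the counts are illustrative assumptions, not values from the paper.

```python
def hter(false_accepts, attack_total, false_rejects, bona_fide_total):
    """Half Total Error Rate: mean of FAR and FRR at one fixed threshold."""
    far = false_accepts / attack_total      # attacks accepted as bona fide
    frr = false_rejects / bona_fide_total   # bona fide samples rejected
    return (far + frr) / 2


def acer(apcer_per_attack_type, bpcer):
    """Average Classification Error Rate using the worst-case attack type."""
    apcer = max(apcer_per_attack_type.values())
    return (apcer + bpcer) / 2


# Made-up counts and rates, for illustration only
print(hter(false_accepts=3, attack_total=200, false_rejects=1, bona_fide_total=100))
print(acer({"print": 0.004, "replay": 0.010}, bpcer=0.002))
```

Both metrics depend on the decision threshold, so reported values are only comparable when the threshold-selection protocol is the same.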

Abstract

Face presentation attack detection (FacePAD) remains challenging under diverse spoofing representation, including 2D print and replay, 3D mask-based spoofing, makeup-induced appearance manipulation, and physical occlusions, as well as under varying capture conditions. Motion cues are highly discriminative for FacePAD but typically require explicit optical flow estimation, which introduces substantial computational overhead and limits real-time deployment. In this work, we leverage optical flow to enhance motion representation during training while eliminating the need for flow computation at inference. We propose a dual-branch teacher model that fuses appearance cues from RGB frames with motion cues derived from colorwheel-encoded optical flow, enabling effective modeling of micro-motions and temporal consistency. To enable efficient deployment, we introduce a knowledge distillation framework that transfers motion-aware knowledge from the flow-augmented teacher to a lightweight RGB-only student via logit distillation. As a result, the student implicitly learns motion-sensitive representations without requiring explicit flow estimation or additional feature extraction blocks at inference. Extensive experiments demonstrate strong performance across multiple benchmarks, achieving 0.0% HTER on Replay-Attack and Replay-Mobile, 0.94% HTER on ROSE-Youtu, 5.65% HTER on SiW-Mv2, and 0.42% ACER on OULU-NPU. The distilled student achieves performance comparable to or better than the teacher while significantly reducing parameters and FLOPs, achieving 52 FPS on an NVIDIA Jetson Orin Nano, indicating its suitability for real-time and resource-constrained FacePAD deployment.

Research Starting Point

Presentation attack detection has to recognize subtle motion cues while still running on embedded devices and camera-side hardware.

Method

A dual-branch teacher learns from RGB appearance and colorwheel-encoded optical flow, then a lightweight RGB-only student receives motion-aware knowledge through logit distillation.
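As a rough illustration of logit distillation, the sketch below uses the standard temperature-softened formulation: a KL term between teacher and student distributions plus a cross-entropy term on the hard label. The abstract does not give the paper's exact loss, temperature, or weighting, so `temperature`, `alpha`, and the function names are assumptions.

```python
import math


def softmax(logits, temperature=1.0):
    scaled = [z / temperature for z in logits]
    m = max(scaled)                      # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=4.0, alpha=0.7):
    """Weighted sum of a soft-target KL term and a hard-label cross-entropy term."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    # KL(teacher || student) on the temperature-softened distributions
    kl = sum(pt * math.log(pt / ps)
             for pt, ps in zip(p_teacher, p_student) if pt > 0)
    # Cross-entropy of the student against the ground-truth label at T = 1
    ce = -math.log(softmax(student_logits)[label])
    # T^2 keeps the soft-target gradient scale comparable across temperatures
    return alpha * temperature ** 2 * kl + (1 - alpha) * ce


# Hypothetical two-class logits: index 0 = bona fide, index 1 = attack
teacher = [3.2, -1.5]   # would come from the RGB + flow teacher
student = [1.1, -0.2]   # would come from the RGB-only student
print(distillation_loss(student, teacher, label=0))
```

Only the student runs at inference, which is why the flow branch adds no deployment cost in this kind of setup.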

Paper Summary

The practical contribution is that motion-aware presentation attack detection no longer has to pay the full inference cost of optical flow. A flow-augmented teacher transfers temporal liveness cues into a lightweight RGB student, making the approach more realistic for kiosks, mobile onboarding, and edge cameras that need fast spoof protection without server round trips.

Paper 02 | 2026-05-11 | cs.CV

Evidence-based Decision Modeling for Synthetic Face Detection with Uncertainty-driven Active Learning

Authors & Institutions

Qingchao Jiang

Key Laboratory of Smart Manufacturing in Energy Chemical Process, Ministry of Education, East China University of Science and Technology, Shanghai, China

Zhenxuan Hou

School of Information Science and Engineering, East China University of Science and Technology, Shanghai, China

Zhiying Zhu

School of Information Science and Engineering, East China University of Science and Technology, Shanghai, China

Zhenxing Qian

College of Computer Science and Artificial Intelligence, Fudan University, Shanghai, China

Xinpeng Zhang

College of Computer Science and Artificial Intelligence, Fudan University, Shanghai, China

Zaiwang Gu

Institute of Advanced Intelligence and Computing, Agency for Science, Technology and Research (A*STAR), Singapore

What Problem It Solves

The work addresses overconfident softmax predictions and heavy labeling requirements in synthetic face detection.

Key Result

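The authors report improved interpretability and generalization, including a 15% increase in accuracy over existing state-of-the-art (SOTA) baselines; the abstract does not name the benchmarks or say whether the gain is absolute or relative.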

Abstract

With the rapid development of deep generative models, forged facial images are massively exploited for illegal activities. Although existing synthetic face detection methods have achieved significant progress, they suffer from the inherent limitation of overconfidence due to their reliance on the Softmax activation function. Thus, these methods often lead to unreliable predictions when encountering unknown Out-of-Distribution (OOD) images, and cannot ascertain the model's uncertainty in its prediction. Meanwhile, most existing methods require massive high-quality annotated data, which greatly limits their practicability across diverse scenarios. To address these limitations, we propose EMSFD (Evidence-based decision Modeling for Synthetic Face Detection with uncertainty-driven active learning), an approach designed to enhance detection reliability and generalizability. Specifically, EMSFD models class evidence using the Dirichlet distribution and explicitly incorporates model uncertainty into the prediction process. Furthermore, during training, the estimated uncertainty is exploited to prioritize more informative samples from the unlabeled pool for annotation, thereby reducing labeling cost and improving model generalization. Extensive experimental evaluations demonstrate that our method enhances the interpretability of synthetic face detection. Meanwhile, our method yields a 15% increase in accuracy compared to existing state-of-the-art (SOTA) baselines, which demonstrates the superior detection performance and generalizability of our approach. Our code is available at: https://github.com/hzx111621/EMSFD.

Research Starting Point

Synthetic face detection is no longer a closed-set classification problem because new generators and out-of-distribution imagery keep arriving.

Method

EMSFD models class evidence with a Dirichlet distribution, exposes uncertainty in predictions, and uses uncertainty-driven active learning to select informative unlabeled samples for annotation.
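To make the Dirichlet idea concrete, the sketch below follows the standard evidential deep learning recipe: network outputs become non-negative evidence, evidence plus one gives the Dirichlet parameters, and uncertainty is the number of classes divided by the total Dirichlet strength. EMSFD's exact parameterization is not given in the abstract, so the softplus activation and the function name are assumptions.

```python
import math


def evidential_prediction(logits):
    """Map raw outputs to expected class probabilities and an uncertainty value."""
    # Non-negative evidence via a numerically stable softplus
    # (one common choice; ReLU or exp are also used in the literature)
    evidence = [max(z, 0.0) + math.log1p(math.exp(-abs(z))) for z in logits]
    alpha = [e + 1.0 for e in evidence]       # Dirichlet concentration parameters
    strength = sum(alpha)                     # total Dirichlet strength S
    probs = [a / strength for a in alpha]     # expected class probabilities
    uncertainty = len(alpha) / strength       # u = K / S, in (0, 1]
    return probs, uncertainty


# Hypothetical logits for [real, synthetic]
confident_probs, confident_u = evidential_prediction([6.0, -4.0])
unsure_probs, unsure_u = evidential_prediction([-5.0, -5.0])
print(confident_probs, confident_u)
print(unsure_probs, unsure_u)
```

Unlike a softmax output, the second input yields near-uniform probabilities together with an uncertainty close to 1, because almost no evidence was collected for either class.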

Paper Summary

EMSFD reframes synthetic face detection as an uncertainty-aware decision process rather than a simple binary label. That matters in moderation, onboarding, and identity-risk workflows because the system can expose low-confidence cases for review or active labeling instead of returning overconfident predictions on unfamiliar generators.
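The workflow described above can be sketched in a few lines: rank unlabeled samples by uncertainty to choose what to annotate, and route uncertain predictions to review instead of returning a label. The thresholds, sample ids, and function names are hypothetical; the paper's actual selection criterion may differ.

```python
def select_for_annotation(uncertainty_by_id, budget):
    """Return the ids of the `budget` most uncertain unlabeled samples."""
    ranked = sorted(uncertainty_by_id, key=uncertainty_by_id.get, reverse=True)
    return ranked[:budget]


def route(prob_synthetic, uncertainty,
          review_threshold=0.5, decision_threshold=0.5):
    """Send uncertain cases to human review; otherwise return a label."""
    if uncertainty >= review_threshold:
        return "review"
    return "synthetic" if prob_synthetic >= decision_threshold else "real"


# Made-up uncertainty scores for an unlabeled pool
pool = {"img_a": 0.10, "img_b": 0.90, "img_c": 0.50}
print(select_for_annotation(pool, budget=2))
print(route(prob_synthetic=0.9, uncertainty=0.8))
```

In practice the review threshold would be tuned against reviewer capacity, since it directly controls how many cases leave the automated path.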

Paper 03 | 2026-05-11 | cs.CV

Only Train Once: Uncertainty-Aware One-Class Learning for Face Authenticity Detection

Authors & Institutions

Qingchao Jiang

School of Information Science and Engineering, East China University of Science and Technology, Shanghai, China

Zhenxuan Hou

School of Information Science and Engineering, East China University of Science and Technology, Shanghai, China

Zhiying Zhu

School of Information Science and Engineering, East China University of Science and Technology, Shanghai, China

Zhenxing Qian

College of Computer Science and Artificial Intelligence, Fudan University, Shanghai, China

Xinpeng Zhang

College of Computer Science and Artificial Intelligence, Fudan University, Shanghai, China

Zaiwang Gu

Institute for Infocomm Research, Agency for Science, Technology and Research (A*STAR), Singapore

What Problem It Solves

The paper reframes face forgery detection as one-class learning from authentic faces, aiming to unify DeepFake and fully synthesized face detection.

Key Result

On DF40 and ASFD, the paper reports strong generalization with 96.63% average accuracy and 98.83% average precision.

Abstract

The rapid evolution of generative paradigms has enabled the creation of highly realistic imagery, escalating the risks of identity fraud and the dissemination of disinformation. Most existing approaches frame face forgery detection as a fully supervised binary classification problem. Consequently, these models typically exhibit significant performance decay when tasked with detecting forgeries from previously unseen generative paradigms. Furthermore, these methods focus exclusively on either DeepFakes or fully synthesized faces, thereby failing to provide a generalized framework for universal face forgery detection. In this paper, we address this challenge by introducing FADNet (Face Authenticity Detector Net), which reformulates face forgery detection as a one-class classification (OCC) task. By training exclusively on authentic facial data to capture their intrinsic representations, FADNet flags any image whose feature embedding deviates significantly from the learned distribution of real faces as a forgery. The framework incorporates Evidential Deep Learning (EDL) to quantify predictive uncertainty and utilizes a plug-and-play pseudo-forgery image generator (PFIG) to tighten decision boundaries around authentic data. Extensive experimental evaluations on the DF40 and ASFD benchmarks demonstrate that FADNet achieves superior performance and generalization capabilities. Specifically, FADNet substantially outperforms existing state-of-the-art (SOTA) methods, yielding a remarkable average accuracy of 96.63% and an average precision of 98.83%.

Research Starting Point

Face authenticity systems must handle unknown forgery generators without rebuilding a binary classifier every time a new method appears.

Method

FADNet learns the real-face distribution, adds Evidential Deep Learning for uncertainty, and uses a plug-and-play pseudo-forgery generator to tighten the decision boundary around authentic data.
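The one-class idea can be illustrated with a deliberately simple stand-in: fit statistics on embeddings of authentic faces only, score new embeddings by how far they deviate, and calibrate the threshold on real data alone. This is not FADNet's architecture; the EDL head and pseudo-forgery generator are omitted, and the embeddings, quantile, and function names are assumptions.

```python
import math


def fit_real_distribution(real_embeddings):
    """Estimate per-dimension mean and standard deviation from authentic faces only."""
    n = len(real_embeddings)
    dims = len(real_embeddings[0])
    mean = [sum(e[d] for e in real_embeddings) / n for d in range(dims)]
    std = [
        # Fall back to a tiny value if a dimension has zero variance
        math.sqrt(sum((e[d] - mean[d]) ** 2 for e in real_embeddings) / n) or 1e-8
        for d in range(dims)
    ]
    return mean, std


def deviation_score(embedding, mean, std):
    """Root-mean-square z-score: larger means further from the real-face distribution."""
    z2 = [((x - m) / s) ** 2 for x, m, s in zip(embedding, mean, std)]
    return math.sqrt(sum(z2) / len(z2))


def calibrate_threshold(real_embeddings, mean, std, quantile=0.95):
    """Pick a score near the given quantile of the authentic samples' scores."""
    scores = sorted(deviation_score(e, mean, std) for e in real_embeddings)
    index = min(len(scores) - 1, int(quantile * len(scores)))
    return scores[index]


# Toy 2-D embeddings standing in for features of authentic faces
real = [[0.9, 0.1], [1.1, -0.1], [1.0, 0.0], [0.95, 0.05], [1.05, -0.05]]
mean, std = fit_real_distribution(real)
threshold = calibrate_threshold(real, mean, std)
is_forgery = deviation_score([3.0, 2.0], mean, std) > threshold
print(is_forgery)
```

No forged samples appear anywhere in fitting or calibration, which is the property that lets a one-class detector avoid retraining when a new generator appears.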

Paper Summary

FADNet is valuable because it targets detector churn: instead of collecting examples from every new fake generator, it learns the distribution of authentic faces and treats strong deviations as suspicious. The uncertainty layer and pseudo-forgery boundary tightening make the idea more usable for teams that need a broad authenticity gate across deepfakes and fully synthetic faces.