Research Radar · Face Recognition · arXiv · April 2026

Monthly arXiv Radar

April 2026 Face Recognition Papers: Encrypted Matching, Event Cameras, and Mobile Inference

April 2026 face recognition research leaned toward deployment constraints rather than benchmark vanity. The strongest papers focused on protecting biometric templates during search, widening sensing options beyond RGB cameras, and pushing recognition quality onto mobile hardware without blowing latency budgets. That mix is especially relevant for buyers comparing privacy posture, hardware fit, and operating cost.

What This Month Signals

This month, the face recognition stack is getting more operational. Differentiation is moving toward secure deployment, non-RGB robustness, and practical latency-efficiency gains rather than only marginal benchmark improvements.

Paper 01 · 2026-04-01 · cs.CV

Lightweight, Practical Encrypted Face Recognition with GPU Support

Authors & Institutions

Gabrielle De Micheli

Advanced Security Team, LG Electronics, USA

Syed Mahbub Hafiz

Advanced Security Team, LG Electronics, USA

Geovandro Pereira

Advanced Security Team, LG Electronics, USA

Eduardo L. Cominetti

Advanced Security Team, LG Electronics, USA

Thales B. Paiva

Advanced Security Team, LG Electronics, USA

Universidade de São Paulo, São Paulo, Brazil

Jina Choi

Next-Generation Computing Research Lab, CTO Division, LG Electronics, South Korea

Marcos A. Simplicio Jr

Universidade de São Paulo, São Paulo, Brazil

Bahattin Yildiz

Advanced Security Team, LG Electronics, USA

What Problem It Solves

The paper tackles how to perform end-to-end encrypted face matching and identification without the huge memory and latency penalties that usually come with homomorphic search.

Key Result

The method reduces required rotation keys by about 91%, cuts client memory by roughly 14 GB, keeps server RAM below 10 GB for galleries up to 2^20 entries, and delivers GPU speedups of up to 17x over the CPU baseline. The authors report sub-second encrypted face recognition for galleries up to 2^15 entries.

Abstract

This paper studies encrypted similarity search for face recognition in client-server settings where embeddings are sensitive biometric data. It proposes a Baby-Step/Giant-Step diagonal algorithm and GPU-optimized CKKS kernels that cut memory overhead and speed up homomorphic matching enough to make private identification workflows more practical.

Research Starting Point

Enterprise face recognition increasingly runs as a client-server workflow where the client sends an embedding and the server performs gallery search. That architecture is operationally convenient but creates a serious privacy problem because face embeddings are persistent biometric identifiers. Prior fully homomorphic approaches showed the idea was possible, yet memory pressure and runtime still kept the design out of reach for many real deployments.
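To make the threat model concrete, the workflow being protected looks roughly like the following plaintext baseline: the client submits an embedding and the server scans an enrolled gallery for the best cosine match. This is an illustrative sketch, not the paper's code; the shapes, threshold, and helper names are assumptions.

```python
import numpy as np

def normalize(v):
    """L2-normalize embeddings so a dot product equals cosine similarity."""
    return v / np.linalg.norm(v, axis=-1, keepdims=True)

def gallery_search(probe, gallery, threshold=0.35):
    """Return (best_index, score), with best_index None if below threshold."""
    scores = normalize(gallery) @ normalize(probe)   # one dot product per identity
    best = int(np.argmax(scores))
    return (best if scores[best] >= threshold else None), float(scores[best])

rng = np.random.default_rng(0)
gallery = rng.normal(size=(1024, 512))               # 1024 enrolled identities, 512-d embeddings
probe = gallery[17] + 0.05 * rng.normal(size=512)    # noisy re-capture of identity 17
idx, score = gallery_search(probe, gallery)
```

Every quantity in this loop, including the probe, the gallery, and the scores, is a persistent biometric signal in the clear, which is exactly what the homomorphic version hides from the server.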

Method

The authors improve the earlier HyDia design with BSGS-Diagonal, which reuses precomputed rotations to evaluate consecutive matrix-vector products more efficiently and sharply reduces the rotation-key footprint. They then add GPU-aware similarity kernels on top of FIDESlib so ciphertext operations stay fused and avoid repeated CPU-GPU transfer overhead. Together, the algorithmic and systems changes move encrypted similarity search closer to something an infrastructure team could actually benchmark for production.
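The rotation-key savings come from the Baby-Step/Giant-Step (BSGS) structure of the diagonal matrix-vector product. The plaintext numpy sketch below shows the arithmetic: in CKKS each `rot()` would be a homomorphic rotation requiring its own key, and BSGS needs only the n1 baby-step rotations of the vector plus n2 giant-step rotations of partial sums, roughly n1 + n2 keys in place of n. Variable names are illustrative, not taken from the paper.

```python
import numpy as np

def rot(x, k):
    """Model a ciphertext rotation: rot(x, k)[i] == x[(i + k) % n]."""
    return np.roll(x, -k)

def bsgs_matvec(M, v, n1):
    """Diagonal-method matvec with baby-step/giant-step rotation reuse."""
    n = len(v)
    n2 = n // n1                                   # requires n == n1 * n2
    # k-th generalized diagonal: diags[k][i] = M[i, (i + k) % n]
    diags = np.array([[M[i, (i + k) % n] for i in range(n)] for k in range(n)])
    baby = [rot(v, b) for b in range(n1)]          # the only rotations of v
    out = np.zeros(n)
    for j in range(n2):
        inner = np.zeros(n)
        for b in range(n1):
            # diagonals can be pre-rotated offline, costing no online rotations
            inner += rot(diags[j * n1 + b], -j * n1) * baby[b]
        out += rot(inner, j * n1)                  # one giant-step rotation per j
    return out
```

Unrolling the indices confirms correctness: the (j, b) term contributes M[i, i + j*n1 + b] * v[i + j*n1 + b] to output slot i, and summing over all j, b recovers the full product M @ v.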

Paper Summary

The business value is straightforward: privacy-preserving face search is no longer only a compliance thought experiment. This paper shows that the infrastructure layer around secure biometric matching is becoming practical enough to matter in product and procurement conversations.

Paper 02 · 2026-04-08 · cs.CV

EventFace: Event-Based Face Recognition via Structure-Driven Spatiotemporal Modeling

Authors & Institutions

Qingguo Meng

State Key Laboratory of Opto-Electronic Information Acquisition and Protection Technology, Anhui University, Hefei, China

Anhui Provincial Key Laboratory of Secure Artificial Intelligence, Anhui University, Hefei, China

Anhui Provincial International Joint Research Center for Advanced Technology in Medical Imaging, Anhui University, Hefei, China

School of Artificial Intelligence, Anhui University, Hefei, China

Xingbo Dong

State Key Laboratory of Opto-Electronic Information Acquisition and Protection Technology, Anhui University, Hefei, China

Anhui Provincial Key Laboratory of Secure Artificial Intelligence, Anhui University, Hefei, China

Anhui Provincial International Joint Research Center for Advanced Technology in Medical Imaging, Anhui University, Hefei, China

School of Artificial Intelligence, Anhui University, Hefei, China

Zhe Jin

State Key Laboratory of Opto-Electronic Information Acquisition and Protection Technology, Anhui University, Hefei, China

Anhui Provincial Key Laboratory of Secure Artificial Intelligence, Anhui University, Hefei, China

Anhui Provincial International Joint Research Center for Advanced Technology in Medical Imaging, Anhui University, Hefei, China

School of Artificial Intelligence, Anhui University, Hefei, China

Massimo Tistarelli

Computer Vision Laboratory, University of Sassari, Sassari, Italy

What Problem It Solves

The work addresses how to recognize identity from event streams that lack stable texture and photometric detail, while still preserving enough structure for reliable matching.

Key Result

Across the reported experiments, EventFace outperforms baseline methods, stays stronger under poor lighting, and offers better privacy properties because the event representation is harder to reconstruct back into a conventional face image.

Abstract

EventFace explores identity recognition from event-camera streams, which are sparse, motion-centric signals that behave very differently from RGB images. The paper introduces the EFace dataset and a structure-driven spatiotemporal model that transfers knowledge from RGB face models while explicitly modeling motion prompts and temporal modulation.

Research Starting Point

RGB-based face recognition still weakens in precisely the environments where many deployments struggle most: low light, glare, and privacy-sensitive capture. Event cameras are appealing because they record sparse changes rather than full frames, but that same sensing model breaks the assumptions most face encoders rely on. The paper is motivated by the need to build a viable identity pipeline for this alternative sensor rather than forcing event data through an RGB recipe that does not fit.

Method

The authors create the EFace dataset and design a structure-driven spatiotemporal architecture around three pieces: LoRA-based transfer from RGB models, a Motion Prompt Encoder to capture motion cues, and a Spatiotemporal Modulator to fuse structural and temporal evidence. This lets the model treat event data as its own signal type instead of a degraded substitute for standard images.
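A common entry point for feeding event streams to a spatiotemporal encoder, and an assumption here rather than the paper's exact pipeline, is binning raw (x, y, t, polarity) events into a temporal voxel grid. The sketch below shows why the result behaves nothing like an RGB frame: it is a signed, sparse accumulation of change, with no texture or photometric detail to recover.

```python
import numpy as np

def events_to_voxel_grid(events, num_bins, height, width):
    """Accumulate signed polarity into a (num_bins, H, W) spatiotemporal grid."""
    grid = np.zeros((num_bins, height, width))
    x, y, t, p = events.T
    t_norm = (t - t.min()) / max(t.max() - t.min(), 1e-9)   # scale time to [0, 1]
    b = np.minimum((t_norm * num_bins).astype(int), num_bins - 1)
    # unbuffered scatter-add; repeated (bin, y, x) indices accumulate correctly
    np.add.at(grid, (b, y.astype(int), x.astype(int)), np.where(p > 0, 1.0, -1.0))
    return grid

# Toy stream: three events on a 4x4 sensor spread across two time bins.
events = np.array([[0, 0, 0.00, 1],     # columns: x, y, t, polarity
                   [1, 2, 0.40, -1],
                   [3, 3, 0.90, 1]])
grid = events_to_voxel_grid(events, num_bins=2, height=4, width=4)
```

The sparsity and signedness of this representation are also what make reconstruction back into a conventional face image difficult, which is the privacy property the paper highlights.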

Paper Summary

For teams evaluating next-generation sensing, this paper matters because it expands the definition of face recognition hardware. It suggests that low-light robustness and privacy-preserving capture may increasingly come from sensor choice as much as from a better loss function.

Paper 03 · 2026-04-11 · cs.CV

FaceLiVTv2: An Improved Hybrid Architecture for Efficient Mobile Face Recognition

Authors & Institutions

Novendra Setyawan

Department of Electro-Optics Engineering, National Formosa University, Taiwan

Department of Electrical Engineering, University of Muhammadiyah Malang, Indonesia

Chi-Chia Sun

Department of Electrical Engineering, National Taipei University, Taiwan

Mao-Hsiu Hsu

Department of Electro-Optics, National Formosa University, Taiwan

Wen-Kai Kuo

Department of Electro-Optics, National Formosa University, Taiwan

Jun-Wei Hsieh

College of Artificial Intelligence and Green Energy, National Yang Ming Chiao Tung University, Taiwan

What Problem It Solves

The problem is how to raise the accuracy-efficiency ceiling for mobile face recognition, so that product teams no longer have to choose between acceptable latency and acceptable matching quality.

Key Result

The paper reports a 22% latency reduction versus FaceLiVTv1, speedups of up to 30.8% over GhostFaceNets on mobile devices, and 20% to 41% latency gains over EdgeFace and KANFace while maintaining stronger recognition accuracy on standard benchmarks.

Abstract

FaceLiVTv2 targets mobile face recognition with a lighter global-local interaction design. It replaces heavier attention blocks with Lite MHLA and integrates that module inside a unified RepMix block, improving the latency-accuracy trade-off across common face benchmarks and mobile hardware tests.

Research Starting Point

A growing share of commercial face recognition deployment now happens on mobile devices, embedded terminals, and edge hardware where power and latency budgets are tight. Many recent hybrid CNN-transformer models improve global context modeling, but they still carry interaction overhead that shows up as slow real-world inference. The paper is motivated by the need for a design that keeps most of the recognition gains while becoming easier to ship on constrained hardware.

Method

FaceLiVTv2 introduces Lite MHLA, a simplified global token interaction module built around lightweight linear projections and affine rescaling, then folds it into a RepMix block that coordinates local and global features more cleanly. The architecture also uses global depthwise convolution in the embedding stage to keep spatial aggregation adaptive without making the model too heavy for phones or edge accelerators.
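The paper's exact Lite MHLA block is its own design; as a rough intuition for how lightweight linear projections with an affine rescale can replace a full attention matrix, here is a generic linear-attention sketch. The kernel choice (elu + 1) and the alpha/beta rescaling are assumptions for illustration, and the cost is O(N·d²) rather than the O(N²·d) of standard attention.

```python
import numpy as np

def linear_attention(x, Wq, Wk, Wv, alpha=1.0, beta=0.0):
    """Global token mixing over x: (N, d) tokens, without an N x N attention matrix."""
    phi = lambda z: np.where(z > 0, z + 1.0, np.exp(z))   # elu(z) + 1, a positive kernel
    q, k, v = phi(x @ Wq), phi(x @ Wk), x @ Wv
    kv = k.T @ v                      # (d, d) summary of all tokens, built once
    z = k.sum(axis=0)                 # (d,) normalizer shared across queries
    out = (q @ kv) / (q @ z)[:, None] # each token attends globally via the summary
    return alpha * out + beta         # affine rescaling of the mixed tokens

rng = np.random.default_rng(2)
N, d = 8, 16
x = rng.normal(size=(N, d))
Wq, Wk, Wv = (0.1 * rng.normal(size=(d, d)) for _ in range(3))
y = linear_attention(x, Wq, Wk, Wv)
```

Because the (d, d) summary is independent of sequence length, latency scales linearly with token count, which is the kind of structural saving that shows up directly in on-device inference numbers.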

Paper Summary

This is the kind of paper enterprise buyers should notice because it translates model design directly into deployment economics. Better mobile accuracy is useful, but the larger signal is that edge-ready face recognition is becoming a more disciplined systems problem.