
A team of researchers from Sapienza University of Rome has introduced WhoFi, a novel deep learning pipeline that leverages Wi-Fi signals for person re-identification (Re-ID), offering a robust and privacy-preserving alternative to traditional camera-based surveillance systems.
The published study proposes a deep neural network trained on Wi-Fi Channel State Information (CSI) to distinguish individuals based on how their bodies alter signal waveforms. The technique relies entirely on CSI-derived biometric features and demonstrates strong accuracy on the public NTU-Fi dataset, achieving a Rank-1 identification accuracy of 95.5% using a Transformer-based model.
Person Re-ID systems aim to match an individual's appearance across different cameras or time frames, a critical task in surveillance. Traditional methods predominantly use visual cues like clothing and body shape, but their effectiveness degrades under poor lighting, occlusions, or varying angles. WhoFi sidesteps these limitations by using the physical interaction between Wi-Fi signals and the human body, extracting unique signal distortions that effectively serve as radio-frequency biometric signatures.

The CSI data used in WhoFi comes from Wi-Fi transmissions between antennas, which record fine-grained amplitude and phase variations across multiple subcarriers. The researchers applied signal preprocessing, including amplitude filtering with the Hampel filter and phase sanitization techniques, to remove noise and standardize input. They further enhanced model robustness through data augmentation, simulating signal noise, strength fluctuations, and timing shifts.
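To make the amplitude-filtering step concrete, here is a minimal sketch of a Hampel filter in plain Python. It replaces any sample that deviates too far from the median of its neighbours with that median. The function name, the window size (`half_window`) and the threshold (`n_sigmas`) are illustrative assumptions, not values reported by the researchers.

```python
from statistics import median


def hampel_filter(values, half_window=3, n_sigmas=3.0):
    """Replace outliers with the local median (Hampel identifier).

    half_window and n_sigmas are assumed defaults for illustration only.
    """
    k = 1.4826  # scales the MAD so it estimates the std dev of Gaussian data
    filtered = list(values)
    n = len(values)
    for i in range(n):
        # Sliding window, truncated at the edges of the series
        lo = max(0, i - half_window)
        hi = min(n, i + half_window + 1)
        window = values[lo:hi]
        med = median(window)
        # Median absolute deviation around the local median
        mad = median(abs(v - med) for v in window)
        if abs(values[i] - med) > n_sigmas * k * mad:
            filtered[i] = med
    return filtered


# Toy amplitude series for one subcarrier, with a single spike at index 4
amplitudes = [1.0, 1.1, 0.9, 1.0, 9.0, 1.05, 0.95, 1.0, 1.1]
cleaned = hampel_filter(amplitudes)
print(cleaned)
```

In this toy series only the spike is replaced; the small fluctuations around 1.0 pass through unchanged.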
The heart of the WhoFi pipeline is a modular neural architecture comprising an encoder and a signature module. The team evaluated three encoder types: LSTM, Bi-LSTM, and Transformer. The Transformer encoder outperformed the rest, with its attention mechanism proving especially effective at modeling long-range temporal dependencies in the CSI signal. The encoder output is passed through a linear layer and L2 normalization to produce the final biometric signature vector.
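The final step described above, a linear layer followed by L2 normalization, can be sketched in a few lines of plain Python. This is only an illustration of the idea: the helper names, the toy weights and the three-value "encoder output" are all hypothetical, and a real implementation would use a deep learning framework with learned parameters.

```python
import math


def linear(x, weights, bias):
    """Apply a dense layer: one output per row of the weight matrix."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def l2_normalize(v, eps=1e-12):
    """Scale a vector to unit Euclidean length (eps guards against zero)."""
    norm = math.sqrt(sum(c * c for c in v))
    return [c / max(norm, eps) for c in v]


def signature(encoder_output, weights, bias):
    """Project the encoder output and normalize it into a signature vector."""
    return l2_normalize(linear(encoder_output, weights, bias))


# Hypothetical values: a 3-dimensional encoder output and a 2x3 projection
encoder_output = [3.0, 4.0, 0.0]
weights = [[1.0, 0.0, 0.0],
           [0.0, 1.0, 0.0]]
bias = [0.0, 0.0]

sig = signature(encoder_output, weights, bias)
print(sig)
```

Because every signature ends up with unit length, two signatures can later be compared by their direction alone.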
Training relied on an in-batch negative loss function, which treats all non-matching samples in a batch as negative examples. This formulation shapes the embedding space so that samples from the same individual cluster together while samples from different individuals are pushed apart. Because the signature vectors are L2-normalized, the cosine similarity between probe and gallery samples reduces to a plain dot product.
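One common way to write an in-batch negative loss is as a softmax cross-entropy over the batch similarity matrix, where row i of the queries should match row i of the gallery and every other row acts as a negative. The sketch below assumes that formulation and unit-length signatures; the function names are illustrative, and the exact loss used in WhoFi may differ in details such as scaling.

```python
import math


def dot(a, b):
    """Dot product; equals cosine similarity for unit-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def in_batch_negative_loss(queries, gallery):
    """Mean cross-entropy where gallery[i] is the positive for queries[i].

    Assumed formulation: all other gallery rows in the batch are negatives.
    """
    total = 0.0
    for i, q in enumerate(queries):
        sims = [dot(q, g) for g in gallery]
        # Numerically stable log-sum-exp over the similarity row
        m = max(sims)
        log_denom = m + math.log(sum(math.exp(s - m) for s in sims))
        total += log_denom - sims[i]
    return total / len(queries)


# Two people, one unit-length signature each (toy values)
queries = [[1.0, 0.0], [0.0, 1.0]]
aligned = in_batch_negative_loss(queries, [[1.0, 0.0], [0.0, 1.0]])
swapped = in_batch_negative_loss(queries, [[0.0, 1.0], [1.0, 0.0]])
print(aligned, swapped)
```

The loss is lower when each query is most similar to its own gallery entry (`aligned`) than when the pairs are mismatched (`swapped`), which is the behaviour the training objective rewards.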
The researchers used the NTU-Fi dataset for evaluation, which contains Wi-Fi CSI samples from 14 subjects under varying clothing and accessory conditions. Data was collected using TP-Link N750 routers in a controlled indoor environment. Transformer-based models achieved 88.4% mean Average Precision (mAP) and showed resilience across different sequence lengths and augmentation strategies.
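For readers unfamiliar with the two metrics, the sketch below shows how Rank-1 accuracy and mean Average Precision are typically computed in a Re-ID setting: each probe signature is compared against a gallery, the gallery is ranked by similarity, and the ranking is scored. The toy signatures, identity labels and function names are made up for illustration and are not drawn from NTU-Fi.

```python
def dot(a, b):
    """Dot product; equals cosine similarity for unit-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def rank1_and_map(probes, probe_ids, gallery, gallery_ids):
    """Return (Rank-1 accuracy, mean Average Precision) for a probe set."""
    hits = 0
    average_precisions = []
    for p, pid in zip(probes, probe_ids):
        # Rank gallery entries from most to least similar to the probe
        order = sorted(range(len(gallery)),
                       key=lambda j: dot(p, gallery[j]), reverse=True)
        ranked = [gallery_ids[j] == pid for j in order]
        hits += ranked[0]  # Rank-1: is the top match the right person?
        found, precisions = 0, []
        for rank, is_match in enumerate(ranked, start=1):
            if is_match:
                found += 1
                precisions.append(found / rank)  # precision at each hit
        ap = sum(precisions) / len(precisions) if precisions else 0.0
        average_precisions.append(ap)
    return hits / len(probes), sum(average_precisions) / len(average_precisions)


# Toy unit-length signatures for two identities, "a" and "b"
gallery = [[1.0, 0.0], [0.8, 0.6], [0.0, 1.0]]
gallery_ids = ["a", "b", "a"]
probes = [[1.0, 0.0], [0.6, 0.8]]
probe_ids = ["a", "b"]

rank1, mean_ap = rank1_and_map(probes, probe_ids, gallery, gallery_ids)
print(rank1, mean_ap)
```

Rank-1 only checks the single best match, while mAP also rewards placing every correct gallery entry near the top of the ranking, which is why the two figures reported for WhoFi differ.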
An ablation study revealed that data augmentation improved performance for LSTM and Bi-LSTM encoders but had negligible benefit for Transformers, which performed robustly even without it. Interestingly, amplitude filtering sometimes degraded performance by removing potentially discriminative variations in the signal. Transformer models also showed reduced accuracy when deeper network layers were introduced, likely due to overfitting.
WhoFi not only surpasses previous Wi-Fi-based person identification systems in performance but also underscores the feasibility of deep learning approaches for radio biometric sensing. Evaluating on a public benchmark like NTU-Fi supports reproducibility and makes comparisons easier in future research.
While WhoFi offers strong potential for privacy-conscious surveillance, any deployment should consider environmental calibration and legal implications of radio biometric profiling. Wi-Fi-based sensing should be confined to secure environments with clearly defined data governance policies.