Unsupervised Subject Detection via Remote-PPG

Subject detection is a crucial task for camera-based remote healthcare monitoring. Most existing subject detection methods rely on supervised learning of physical appearance features. However, their performance is limited by the pre-trained appearance model, and they still suffer from false detections of human-like objects. In this paper, we propose a novel unsupervised method that detects living subjects in a video using physiological features. The basic idea originates from the observation that only the living skin tissue of a human presents pulse signals, which can be exploited as a feature to distinguish human skin from non-human surfaces in videos. The proposed method, named Voxel-Pulse-Spectral (VPS), consists of three steps: it (1) creates hierarchical voxels across the video for temporally parallel pulse extraction; (2) builds a similarity matrix for the hierarchical pulse signals based on their intrinsic properties; and (3) uses incremental sparse matrix decomposition with hierarchical fusion to robustly identify and combine the voxels that correspond to a single subject or to multiple subjects. Numerous experiments demonstrate the superior performance of VPS over a state-of-the-art method. On average, VPS improves the precision of skin-region detection by 82.2%, the Pearson correlation of the instantaneous pulse rate by 595.5%, and its Bland-Altman agreement by 542.2%. ANOVA shows that these improvements are statistically significant across all evaluations. VPS is the first method that uses the pulse to robustly detect living subjects in realistic scenarios, which makes it well suited to healthcare monitoring.
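To make steps (2) and (3) concrete, the following is a minimal sketch on synthetic data, not the VPS implementation. It rests on several assumptions that the abstract does not state: the similarity is taken to be the normalized inner product of band-limited signals, the incremental sparse matrix decomposition is replaced by a plain eigendecomposition whose leading eigenvector is thresholded, and the hierarchical voxels and their fusion are omitted. The function names, the 0.7 to 4.0 Hz band, and the threshold are hypothetical choices made for illustration only.

```python
import numpy as np


def band_limit(signals, fs, band=(0.7, 4.0)):
    """Zero all frequency bins outside an assumed pulse band (in Hz)."""
    freqs = np.fft.rfftfreq(signals.shape[1], d=1.0 / fs)
    spectra = np.fft.rfft(signals, axis=1)
    spectra[:, (freqs < band[0]) | (freqs > band[1])] = 0.0
    return np.fft.irfft(spectra, n=signals.shape[1], axis=1)


def similarity_matrix(signals, fs):
    """Pairwise normalized inner product of band-limited voxel signals.

    This is a stand-in for the similarity described in step (2).
    """
    filtered = band_limit(signals, fs)
    norms = np.linalg.norm(filtered, axis=1, keepdims=True)
    unit = filtered / np.maximum(norms, 1e-12)
    return unit @ unit.T


def select_pulse_voxels(similarity, rel_threshold=0.5):
    """Threshold the leading eigenvector of the similarity matrix.

    Plain eigendecomposition is used here in place of the sparse
    decomposition of step (3); rel_threshold is a hypothetical parameter.
    """
    _, eigvecs = np.linalg.eigh(similarity)  # eigenvalues in ascending order
    leading = eigvecs[:, -1]
    if leading.sum() < 0:  # resolve the sign ambiguity of the eigenvector
        leading = -leading
    return np.flatnonzero(leading > rel_threshold * leading.max())


def demo():
    """Six synthetic 'skin' voxels share a 1.2 Hz pulse; six do not."""
    rng = np.random.default_rng(0)
    fs, seconds = 30.0, 20
    t = np.arange(int(fs * seconds)) / fs
    pulse = np.sin(2.0 * np.pi * 1.2 * t)
    skin = pulse + 0.5 * rng.standard_normal((6, t.size))
    background = rng.standard_normal((6, t.size))
    signals = np.vstack([skin, background])
    return select_pulse_voxels(similarity_matrix(signals, fs))


selected = demo()
print(selected)
```

In this toy setting the voxels that share a pulse form a dense block in the similarity matrix, so the leading eigenvector concentrates on them while the noise-only voxels receive weights near zero. A sparse decomposition, as named in step (3), would enforce that concentration explicitly rather than relying on a threshold.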