Demonstrates that minimizing distance correlation between raw data and intermediary representations reduces leakage of sensitive raw data patterns across client communications while maintaining model accuracy.
Leakage: invertibility/reconstruction of raw data from intermediary representations
The solution prevents such reconstruction of raw data while retaining the information needed to sustain good classification accuracy. The approach is based on minimizing a statistical dependency measure called distance correlation.
Distance Correlation: a powerful measure of non-linear (and linear) statistical dependence between random variables.
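As a concrete illustration, here is a minimal numpy sketch of the sample distance correlation (following the standard double-centering construction); the function and variable names are my own, not from the paper:

```python
import numpy as np

def distance_matrix(x):
    # Pairwise Euclidean distances between rows (samples) of x
    diff = x[:, None, :] - x[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def double_center(d):
    # Subtract row and column means of the distance matrix, add back the grand mean
    return d - d.mean(axis=0) - d.mean(axis=1, keepdims=True) + d.mean()

def distance_correlation(x, y):
    # Sample distance correlation: near 0 for independent samples,
    # exactly 1 for (affinely) linearly dependent samples.
    a = double_center(distance_matrix(x))
    b = double_center(distance_matrix(y))
    dcov2 = (a * b).mean()
    return np.sqrt(dcov2 / np.sqrt((a * a).mean() * (b * b).mean()))
```

Unlike Pearson correlation, this quantity is zero (in the population limit) only when the two variables are statistically independent, which is what makes it a meaningful leakage measure.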
In the worst-case reconstruction attack setting, the attacker has access to a leaked subset of training samples along with their corresponding activations at a chosen layer. These activations are by design always exposed to the other client/server, since sharing them is what makes distributed training of the deep network possible.
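The attack can be sketched in a few lines under toy assumptions: here the client-side network up to the shared layer is stood in for by a fixed linear map `W`, and the attacker fits a least-squares decoder on the leaked (raw sample, activation) pairs. All names are illustrative, not from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(8, 16))            # stand-in for the client model up to the cut layer
x_leaked = rng.normal(size=(100, 8))    # leaked subset of raw training samples
z_leaked = x_leaked @ W                 # their exposed activations

# Attacker fits a decoder z -> x by least squares, using only the leaked pairs
decoder, *_ = np.linalg.lstsq(z_leaked, x_leaked, rcond=None)

# The decoder then reconstructs *unseen* raw samples from their activations,
# which are exposed in every round of distributed training
x_new = rng.normal(size=(10, 8))
x_hat = (x_new @ W) @ decoder
rel_err = np.linalg.norm(x_hat - x_new) / np.linalg.norm(x_new)
```

In this linear toy case the reconstruction is essentially exact; in practice the attacker would train a neural decoder, but the threat model is the same.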
Before applying NoPeek

After applying NoPeek

Two popular distributed learning settings where this attack is highly relevant: split learning and federated learning.
Other relevant attack vectors include model extraction, model inversion, malicious training, adversarial examples (evasion attacks), and membership inference.
Existing Solutions
Deep learning, adversarial learning, and information-theoretic loss-based privacy
The proposed solution is not tied to a generative adversarial network (GAN)-style architecture in which two separate models must be trained in tandem. Instead, it is based on an easily implementable differentiable loss function between the intermediate activations and the raw data.
Homomorphic encryption and secure multi-party computation for computer vision
HE and MPC techniques, although highly secure, are not computationally scalable or communication-efficient for complex tasks such as training large deep learning models.
The proposed method, on the other hand, is communication-efficient and highly scalable to large deep learning architectures.
Differential privacy for computer vision
These methods typically take a stronger hit on the accuracy of deep learning models, although with the benefit of attempting to provide worst-case privacy guarantees against membership inference attacks.
Method

The key idea of the proposed method is to reduce information leakage by adding an additional loss term, the distance correlation between raw inputs and intermediate activations, to the commonly used categorical cross-entropy classification loss.
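A minimal numpy sketch of this combined objective, assuming a weighting coefficient `alpha` between the privacy and utility terms (the weight name and all function names are my own; this is a sketch of the loss composition, not the paper's implementation):

```python
import numpy as np

def dcor(x, z):
    # Sample distance correlation between a raw batch x and its activations z
    def cdm(m):
        d = np.sqrt(((m[:, None, :] - m[None, :, :]) ** 2).sum(axis=-1))
        return d - d.mean(axis=0) - d.mean(axis=1, keepdims=True) + d.mean()
    a, b = cdm(x), cdm(z)
    return np.sqrt((a * b).mean() / np.sqrt((a * a).mean() * (b * b).mean()))

def cross_entropy(p, y):
    # Mean categorical cross-entropy; p: predicted probabilities, y: one-hot labels
    return -(y * np.log(p + 1e-12)).sum(axis=1).mean()

def nopeek_loss(x, z, p, y, alpha=0.1):
    # alpha trades off decorrelation (privacy) against classification (utility)
    return alpha * dcor(x, z) + cross_entropy(p, y)
```

Because both terms are differentiable, the whole objective can be minimized end-to-end with standard gradient descent, pushing the shared activations toward statistical independence from the raw inputs while preserving label information.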
Reconstruction Attack Testbed

Privacy-Utility Tradeoff on UTKFace

We show the l2 reconstruction error of a baseline strategy that adds uniform noise (in red) to the activations of the layer being protected. This yields a model with no classification utility (it performs at chance accuracy), albeit while preventing reconstruction. Our NoPeek approach (in blue) attains much higher classification accuracy on the downstream task (~0.82) than adding uniform noise (chance accuracy) while still preventing reconstruction of the raw data. Both are compared to regular training (in green), which does not prevent reconstruction.

