Gradient Inversion Attacks in Federated Learning: Evaluating Privacy Risks and Differential Privacy Defenses in Cross-Silo Settings
Noor, Faiza (2025-06-18)
This publication is subject to copyright regulations. The work may be read and printed for personal use; commercial use is prohibited.
Open access
The permanent address of this publication is:
https://urn.fi/URN:NBN:fi-fe2025062473494
Abstract
This thesis investigates the vulnerability of cross-silo Federated Learning (FL) systems to gradient-based privacy attacks, focusing in particular on the reconstruction of private image data from shared gradients. In cross-silo FL, a small number of relatively reliable and stable entities, such as hospitals, banks, or research institutions, collaborate to train a shared machine learning model without exchanging raw data. Each institution (or silo) computes local model updates based on its private dataset and periodically shares only gradients or model parameters with a central server, which aggregates them to improve the global model. While this setup is designed to protect data privacy, recent research has shown that gradients themselves can unintentionally leak information about the underlying data.
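To make the aggregation step concrete, the following minimal sketch (an illustrative assumption in Python/PyTorch, not the implementation used in this work; the function and variable names are hypothetical) shows how a server could average the parameter updates received from the participating silos:

# Minimal FedAvg-style aggregation sketch (illustrative, assumed PyTorch setup).
# Each silo sends its locally trained parameters; the server computes a
# data-size-weighted average to update the global model.
import copy

def aggregate(client_states, client_sizes):
    """Weighted average of client parameter dictionaries (state_dicts)."""
    total = sum(client_sizes)
    global_state = copy.deepcopy(client_states[0])
    for key in global_state:
        global_state[key] = sum(
            state[key] * (size / total)
            for state, size in zip(client_states, client_sizes)
        )
    return global_state

# Hypothetical usage: global_model.load_state_dict(aggregate(states, sizes))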
To study this risk, we implement a gradient inversion pipeline based on the Deep Leakage from Gradients (DLG) method, using a simplified linear reconstruction approach. Our aim is to determine whether private images can be recovered from aggregated gradients in a cross-silo setting. Uniquely, this work isolates Differential Privacy (DP) as the only defense mechanism under evaluation, allowing for a focused analysis of its protective effect. We assess the quality of reconstructed images using the Learned Perceptual Image Patch Similarity (LPIPS) score, which captures perceptual similarity as judged by deep neural networks.
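As a rough illustration of such an attack, the sketch below (assumed PyTorch code; the function name, step count, and learning rate are illustrative and do not reproduce the thesis pipeline) optimizes dummy inputs and soft labels so that the gradients they induce match the observed gradients, in the spirit of DLG:

# Illustrative DLG-style gradient inversion sketch (assumed setup, not the
# exact pipeline of this work). Dummy data and soft labels are optimized so
# that the gradients they produce match the gradients shared by a client.
import torch
import torch.nn.functional as F

def invert_gradients(model, target_grads, input_shape, num_classes,
                     steps=200, lr=0.1):
    dummy_x = torch.randn(1, *input_shape, requires_grad=True)
    dummy_y = torch.randn(1, num_classes, requires_grad=True)
    optimizer = torch.optim.LBFGS([dummy_x, dummy_y], lr=lr)

    for _ in range(steps):
        def closure():
            optimizer.zero_grad()
            loss = F.cross_entropy(model(dummy_x), F.softmax(dummy_y, dim=-1))
            grads = torch.autograd.grad(loss, model.parameters(),
                                        create_graph=True)
            # Distance between the dummy gradients and the observed gradients.
            grad_diff = sum(((g - t) ** 2).sum()
                            for g, t in zip(grads, target_grads))
            grad_diff.backward()
            return grad_diff
        optimizer.step(closure)
    return dummy_x.detach(), dummy_y.detach()

The recovered image can then be compared against the ground truth with an LPIPS metric (for example via the lpips Python package), where higher scores indicate greater perceptual distance and therefore less visible leakage.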
To analyze the effectiveness of DP, we systematically vary the gradient clipping threshold and the Gaussian noise multiplier applied during Differentially Private Stochastic Gradient Descent (DP-SGD), thereby controlling and reporting the privacy budget (ϵ) over multiple training rounds. The reconstructed images are evaluated using LPIPS, allowing us to quantify the extent of perceptual leakage under different privacy settings. In addition to evaluating leakage, we monitor classification accuracy across individual clients throughout federated training and testing, both with and without attacks, thereby highlighting the inherent trade-off between model utility and privacy preservation.
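The clipping and noising at the heart of DP-SGD can be sketched as follows (an assumed, simplified per-sample implementation; in practice a library such as Opacus would perform this step and track the resulting ϵ across rounds with a privacy accountant):

# Illustrative DP-SGD gradient sanitization sketch (assumed setup).
# Each per-example gradient is clipped to an L2 norm of at most clip_norm,
# then Gaussian noise scaled by noise_multiplier * clip_norm is added.
import torch

def sanitize_gradients(per_sample_grads, clip_norm, noise_multiplier):
    clipped = []
    for g in per_sample_grads:        # one flattened gradient per example
        scale = (clip_norm / (g.norm(2) + 1e-12)).clamp(max=1.0)
        clipped.append(g * scale)
    summed = torch.stack(clipped).sum(dim=0)
    noise = torch.normal(0.0, noise_multiplier * clip_norm, size=summed.shape)
    return (summed + noise) / len(per_sample_grads)

A larger clipping threshold and a smaller noise multiplier preserve more utility but weaken the guarantee; the reported ϵ then follows from the noise multiplier, the sampling rate, and the number of training rounds.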
Our results reveal that effective visual degradation (i.e., high LPIPS) begins to occur only at extremely high noise levels, often resulting in ϵ values well below standard deployment thresholds. This suggests that although DP can mitigate perceptual reconstruction under aggressive noise conditions, achieving meaningful formal privacy guarantees remains difficult in practice without compromising model performance. By isolating DP as the defense mechanism and LPIPS as the evaluation tool, this study provides a focused empirical exploration of privacy–utility dynamics in FL under linear reconstruction attacks.