Privacy in machine learning is critical, especially when models are trained on sensitive data. Differential privacy (DP) offers a framework to protect individual privacy by ensuring that the inclusion or exclusion of any single data point does not significantly change the distribution of a model’s outputs. A key technique for integrating DP into machine learning is Differentially Private Stochastic Gradient Descent (DP-SGD).
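For reference, this guarantee is usually formalized as (ε, δ)-differential privacy. The statement below is the standard textbook definition, not anything specific to the paper discussed here; the symbols M (the randomized training mechanism), D and D′ (two datasets that differ in a single record), S (any set of possible outputs), and δ are notation assumed for illustration.

$$
\Pr[M(D) \in S] \;\le\; e^{\varepsilon}\,\Pr[M(D') \in S] + \delta
$$

The inequality must hold for every pair of neighboring datasets and every output set S. A smaller ε forces the two output distributions closer together, which means a stronger privacy guarantee.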
DP-SGD modifies standard SGD in two ways: it clips each per-example gradient to a maximum norm, then adds Gaussian noise to the sum of the clipped gradients. This ensures privacy but often degrades model performance. Recent work has aimed to reduce this loss with methods such as adaptive noise injection and optimized clipping strategies. Balancing privacy and accuracy nonetheless remains an open problem, especially in large-scale models, where the impact of the noise is greater. Tuning for robustness, ensuring transferability, and maintaining performance across tasks are likewise persistent challenges for DP-SGD.
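The clip-then-noise update can be sketched in a few lines of NumPy. This is a minimal illustration of a generic DP-SGD step on a flat parameter vector, not the code used in the paper; the function name `dp_sgd_step` and the parameters `clip_norm`, `noise_multiplier`, and `lr` are assumptions chosen for this example, and the sketch does no privacy accounting.

```python
import numpy as np

def dp_sgd_step(params, per_example_grads, clip_norm, noise_multiplier, lr, rng):
    """One illustrative DP-SGD update.

    per_example_grads has shape (batch_size, num_params): one gradient per example.
    """
    batch_size = per_example_grads.shape[0]
    # L2 norm of each example's gradient.
    norms = np.linalg.norm(per_example_grads, axis=1, keepdims=True)
    # Shrink gradients whose norm exceeds clip_norm; leave smaller ones unchanged.
    scale = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    clipped = per_example_grads * scale
    # Gaussian noise is added to the SUM of clipped gradients, then we average.
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=params.shape)
    noisy_mean = (clipped.sum(axis=0) + noise) / batch_size
    return params - lr * noisy_mean

rng = np.random.default_rng(0)
params = np.zeros(3)
grads = np.array([[3.0, 4.0, 0.0],   # norm 5.0: clipped down to norm 1.0
                  [0.3, 0.0, 0.4]])  # norm 0.5: left unchanged
# noise_multiplier=0.0 switches the noise off so the clipping effect is visible.
new_params = dp_sgd_step(params, grads, clip_norm=1.0,
                         noise_multiplier=0.0, lr=1.0, rng=rng)
```

With a non-zero `noise_multiplier`, the noise standard deviation scales with `clip_norm`, since the clipping bound is what limits how much any one example can change the sum.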
To address these challenges, researchers recently introduced DPAdapter, a technique designed to enhance parameter robustness in differentially private machine learning (DPML). The method uses two batches, one for accurate perturbation estimates and one for effective gradient descent, and mitigates the adverse effects of DP noise on model utility. More robust parameters in turn lead to better performance in privacy-preserving models. The accompanying theoretical analysis connects parameter robustness, transferability, and the impact of DPML on performance, offering insights into the design and fine-tuning of pre-trained models.
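To make the two-batch idea concrete, here is a rough sketch of a Sharpness-Aware Minimization (SAM) style step that takes its perturbation from one batch and its descent gradient from another. It is an interpretation of the description above on a toy linear regression problem, not the authors' algorithm: the function `two_batch_sam_step`, the radius `rho`, the batch sizes (32 and 8), and the choice of which batch is larger are all assumptions made for illustration.

```python
import numpy as np

def mse_grad(w, X, y):
    """Gradient of 0.5 * mean((X @ w - y)**2) with respect to w."""
    residual = X @ w - y
    return X.T @ residual / len(y)

def two_batch_sam_step(w, perturb_batch, update_batch, rho, lr):
    """SAM-style step using separate batches (illustrative, hypothetical)."""
    Xp, yp = perturb_batch
    Xu, yu = update_batch
    # Step 1: estimate the ascent direction on the perturbation batch.
    g = mse_grad(w, Xp, yp)
    eps = rho * g / (np.linalg.norm(g) + 1e-12)
    # Step 2: compute the descent gradient at the perturbed weights,
    # using the second batch.
    g_sam = mse_grad(w + eps, Xu, yu)
    # Step 3: apply that gradient to the original (unperturbed) weights.
    return w - lr * g_sam

rng = np.random.default_rng(0)
X = rng.normal(size=(64, 5))
w_true = rng.normal(size=5)
y = X @ w_true

def loss(w):
    return 0.5 * np.mean((X @ w - y) ** 2)

w = np.zeros(5)
initial_loss = loss(w)
for _ in range(200):
    p_idx = rng.choice(64, size=32, replace=False)  # assumed perturbation batch
    u_idx = rng.choice(64, size=8, replace=False)   # assumed update batch
    w = two_batch_sam_step(w, (X[p_idx], y[p_idx]), (X[u_idx], y[u_idx]),
                           rho=0.05, lr=0.1)
final_loss = loss(w)
```

The intuition behind SAM-style training is that parameters sitting in a flat region of the loss are less sensitive to perturbation, which is the kind of robustness that would help when DP noise is later injected during fine-tuning.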
The study evaluates different DPML algorithms on three private downstream tasks (CIFAR-10, SVHN, and STL-10) across four pre-training settings: training from scratch, standard pre-training, Vanilla SAM, and the proposed method, DPAdapter. In the first stage, pre-training is conducted on the CIFAR-100 dataset. A ResNet20 model is trained for 1,000 epochs with a learning rate decay schedule and momentum.
In the second stage, the pre-trained models are fine-tuned on the private downstream datasets with different privacy budgets (ε = 1 and ε = 4) using DP-SGD and three additional DP algorithms: GEP, AdpAlloc, and AdpClip. The fine-tuning process involves:
- Setting a clipping threshold.
- Using a batch size of 256.
- Applying the DP-SGD optimizer with momentum.
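The evaluation setup described above can be laid out as a small experiment grid. The sketch below assumes every combination of dataset, pre-training setting, DP algorithm, and privacy budget is run, which the article implies but does not state outright; the class name `FineTuneRun` is hypothetical. The clipping threshold and momentum values are not given in the article, so they are deliberately left out rather than guessed.

```python
from dataclasses import dataclass
from itertools import product

DATASETS = ("CIFAR-10", "SVHN", "STL-10")
PRETRAINING = ("from scratch", "standard pre-training", "Vanilla SAM", "DPAdapter")
ALGORITHMS = ("DP-SGD", "GEP", "AdpAlloc", "AdpClip")
EPSILONS = (1.0, 4.0)

@dataclass(frozen=True)
class FineTuneRun:
    """One fine-tuning configuration (hypothetical container for illustration)."""
    dataset: str
    pretraining: str
    algorithm: str
    epsilon: float
    batch_size: int = 256  # batch size stated in the article
    # Clipping threshold and momentum are unspecified in the article and omitted.

# Assumed full cross product of the four experimental factors.
runs = [
    FineTuneRun(dataset, pretraining, algorithm, epsilon)
    for dataset, pretraining, algorithm, epsilon
    in product(DATASETS, PRETRAINING, ALGORITHMS, EPSILONS)
]

print(len(runs))
```

Under that full-grid assumption there are 3 × 4 × 4 × 2 = 96 fine-tuning runs.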
The results show that DPAdapter consistently improves downstream accuracy across all settings compared to the other pre-training methods. For instance, with ε = 1 and DP-SGD, DPAdapter raises the average accuracy to 61.42%, compared to 56.95% with standard pre-training, a gain of 4.47 percentage points. Similarly, with AdpClip, DPAdapter achieves a 10% improvement in accuracy, highlighting its effectiveness under privacy constraints.
In summary, the authors introduce DPAdapter, a technique that enhances parameter robustness to address the often conflicting relationship between differential privacy noise and model utility in deep learning. DPAdapter does this by reallocating batch sizes between the perturbation and gradient calculations and by refining Sharpness-Aware Minimization (SAM) to improve parameter robustness and reduce the impact of DP noise. Evaluations across multiple datasets show that DPAdapter improves the accuracy of DPML algorithms on various downstream tasks, underscoring its potential for future privacy-preserving machine learning applications.
All credit for this research goes to the researchers of this project.
Mahmoud is a PhD researcher in machine learning. He also holds a
bachelor’s degree in physical science and a master’s degree in
telecommunications and networking systems. His current areas of
research concern computer vision, stock market prediction and deep
learning. He produced several scientific articles about person re-
identification and the study of the robustness and stability of deep
networks.