Pablo Piantanida (Full Professor, Laboratoire des Signaux et Systèmes – CentraleSupélec, CNRS, Université Paris-Saclay) gave an online Seminar@SystemX on the topic “Information-Theoretic Methods for Secure Machine Learning” on December 15, 2021.
This event was part of the Confiance.ai scientific seminar series.
Deep learning models are known to be poor at signalling failure: they tend to make predictions with high confidence even when they are wrong. This is problematic in real-world applications to critical systems such as healthcare and self-driving cars, where there are considerable safety implications, or where there are discrepancies between the training data and the data the model is asked to make predictions on at test time. There is a pressing need to understand when model predictions should (or should not) be trusted, to detect out-of-distribution examples and adversarial attacks, and to improve model robustness to natural changes in the data.
Another difficulty arises in the many applications whose training data include potentially sensitive information, e.g., lifestyle choices, medical diagnoses, purchase logs of sensitive items, genetic markers, or biometric features, which must be kept confidential and not revealed to clients or customers. However, once trained, the software is typically made available to third parties, either directly, by selling the software itself, or indirectly, by allowing it to be queried. This access can be used to extract sensitive information about the training data, which remains present, hidden in the large number of parameters defining the trained model. This raises a fundamental question: how much information can be extracted, at least partially, from trained software?
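The kind of leakage described above can be made concrete with a toy membership-inference heuristic (a hypothetical illustration, not a method from the talk): because models are often noticeably more confident on examples they were trained on, query access alone can leak which examples were in the training set. The threshold and toy data below are assumptions for the sketch.

```python
import numpy as np

def confidence(model_probs):
    # The model's confidence in its predicted class for each queried example.
    return model_probs.max(axis=-1)

def guess_membership(model_probs, threshold=0.9):
    # Naive membership-inference heuristic (illustrative only): treat high
    # confidence as evidence that the example was part of the training set.
    return confidence(model_probs) >= threshold

# Hypothetical query results: the model is very confident on the first
# example and noticeably less confident on the second.
probs = np.array([[0.97, 0.02, 0.01],
                  [0.55, 0.30, 0.15]])
print(guess_membership(probs))  # → [ True False]
```

Even this crude attack shows why query access must be treated as a potential information channel about the training data; stronger attacks refine the same idea.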
In this lecture, we will give an overview of these fundamental problems and key tasks. Namely, we examine model uncertainty and calibration, simple but still effective methods for detecting misclassification errors, detection of out-of-distribution and adversarial examples, improving robustness in deep learning, and novel tools to understand and quantify the information leakage of trained software. We will describe information-theoretic methods from fundamentals to state-of-the-art approaches, taking a deep dive into promising avenues, and will close by highlighting open challenges in the field.
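As one concrete instance of the “simple but still effective” detection methods mentioned above, a widely used baseline scores each input by its maximum softmax probability and flags low-confidence inputs as potential misclassifications or out-of-distribution examples. The sketch below is an assumed NumPy illustration of that baseline, not code from the lecture; the threshold value is arbitrary.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the last axis.
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def msp_score(logits):
    # Confidence score: maximum softmax probability (higher = more confident).
    return softmax(logits).max(axis=-1)

def flag_suspicious(logits, threshold=0.7):
    # Flag inputs whose confidence falls below the threshold as potential
    # misclassifications or out-of-distribution examples.
    return msp_score(logits) < threshold

# Toy logits: the first input is confidently classified, the second is ambiguous.
logits = np.array([[6.0, 0.5, 0.1],
                   [1.1, 1.0, 0.9]])
print(flag_suspicious(logits))  # → [False  True]
```

In practice the threshold is chosen on held-out data, and more refined scores (e.g., temperature-scaled or information-theoretic ones) build on the same pattern of scoring and thresholding.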
Pablo Piantanida received the B.Sc. and M.Sc. degrees in electrical engineering and mathematics from the University of Buenos Aires, Argentina, and the Ph.D. degree from Université Paris-Sud, Orsay, France, in 2007. He is currently a Full Professor with the Laboratoire des Signaux et Systèmes (L2S) at CentraleSupélec, jointly with CNRS and Université Paris-Saclay. He is also an associate member of the Comète Inria research team (LIX, École Polytechnique). Pablo Piantanida has co-authored more than 45 indexed journal papers and more than 150 papers in international conference proceedings. He served as General Co-Chair of the 2019 IEEE International Symposium on Information Theory (ISIT), as an Associate Editor for the IEEE Transactions on Information Forensics and Security, on the Editorial Board of the section “Information Theory, Probability and Statistics” of Entropy, and as an Area Chair for several conferences in the fields of information theory and machine learning. His research interests include information theory, machine learning, security of learning systems (AI safety, privacy, fairness), and applications to computer vision, health, and natural language processing, among others.