Environment and context of the research
It is undeniable that artificial intelligence is now critical to the competitiveness of French industry, contributing to innovation-based growth. In this context, the integration and safe use of artificial-intelligence-based technologies is essential to support engineering, industrial production and the development of innovative products and services. « Industrialization of artificial intelligence for mission-critical systems » is one of the major objectives of the national Grand Défi Trust IA. This industrialization imperative requires an environment that supports design, validation and testing, with a focus on reinforcing confidence and explainability, and ultimately enabling the certification of artificial intelligence. A group of major industrial players in the fields of Defense, Transportation and Energy has been formed to define the roadmap of this program, confiance.ai, with the support of leading academic partners. The SystemX Technological Research Institute coordinates the program.
The IRT SystemX, located at the heart of the world-class Paris-Saclay scientific campus, aims to be a world-class technological research center in the field of digital systems engineering. Its mission is to generate new knowledge and technological solutions based on breakthroughs in digital engineering and to disseminate its skills across all economic sectors.
The subject of the thesis has been defined by the consortium gathered in the framework of the confiance.ai program, more precisely in the EC3 project. The thesis will be directed by Goran Frehse of the Computer Science and Systems Engineering laboratory (U2IS) at ENSTA Paris, and will be registered at the doctoral school IP Paris of Institut Polytechnique de Paris (ED 626).
The U2IS laboratory, led by David Filliat, conducts research on the design and reliability of systems integrating autonomous decision-making processes, with applications in intelligent transport, robotics, defense and energy. The laboratory brings together the research activities of ENSTA Paris in computer science, robotics, vision, embedded systems, signal and image processing, and hybrid system design and analysis.
In addition, the doctoral student will benefit from scientific supervision in the confiance.ai program by Johanna Baro, the referent supervisor in the EC3 project. Within the IRT SystemX, the doctoral student will be hierarchically attached to the scientific axis « Sciences des données & Interaction », whose manager is Georges Hébrail.
The position is based in Palaiseau. The PhD student may be required to travel to the laboratory.
This PhD subject concerns the online monitoring of AI models: detecting at runtime any deviation of an AI component deployed in operation from its specified expected behavior or from safe-operation properties.
To illustrate the scope of this research, consider an aeronautical or automotive product whose tolerated malfunction rate has been specified at the system level at 10^(-3) failures per hour of operation, i.e. one failure of the product every 1000 hours of operation. If this product has been developed using AI, it must be demonstrated that the AI model can perform its prediction over its entire usage domain with an accuracy of 99.9%, and that this accuracy is maintained over time. If, after a full training phase, the model’s performance does not exceed 99% correct predictions, 10 failures may statistically occur over the reference period (1000 hours) where only one would have been tolerated. This situation is unacceptable from a product-safety point of view. Deploying a monitoring device that operates in parallel with the AI model (online monitoring) is a concrete way of managing this type of residual risk, induced by a model for which it is not possible or practical to formally demonstrate the achievement of the performance/safety objectives resulting from the system analyses. Such a monitor is an architectural building block well known to safety engineers, but it must be adapted to AI technologies, which is the focus of this PhD.
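The back-of-the-envelope arithmetic behind this example can be made explicit. The sketch below is purely illustrative (the function name and the one-safety-relevant-prediction-per-hour assumption are ours, not part of the program):

```python
# Illustrative residual-risk arithmetic for the example above.
# Assumption (ours): one safety-relevant prediction per hour of operation,
# so each wrong prediction counts as one failure.

def expected_failures(accuracy: float, hours: float) -> float:
    """Expected number of erroneous predictions over a reference period."""
    return (1.0 - accuracy) * hours

# Specified tolerance: 10^-3 failures/hour over 1000 h -> 1 failure.
tolerated = 1e-3 * 1000

achieved_999 = expected_failures(0.999, 1000)  # about 1: meets the target
achieved_99 = expected_failures(0.99, 1000)    # about 10: ten times over budget

print(tolerated, achieved_999, achieved_99)
```

This makes concrete why a 99%-accurate model, although close to the 99.9% target, exceeds the failure budget by an order of magnitude.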
State of the art
Control systems that integrate AI components pose a challenge when it comes to guaranteeing, or even certifying, their safety. AI components can be made safer through monitoring and supervision based on internal behavioral models that are dynamic, i.e., capture how the system and its environment evolve with time.
We consider envelope-based models, which bound future possible states based on past observations, possibly associated with probabilities. Envelope-based models can be used for monitoring, by intervening when the predicted envelopes overlap with critical states. To improve confidence in AI responses, envelopes can be used to check for consistency. To take two examples from the domain of autonomous vehicles, both the Responsibility Sensitive Safety (RSS) model proposed by Intel/Mobileye and the Instantaneous Safety Metric (ISM) from the National Highway Traffic Safety Administration are simple instances of this class.
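The core mechanism can be sketched in a few lines. The toy monitor below (entirely illustrative: the 1D dynamics, the coefficient `a` and the disturbance bound `w_max` are arbitrary choices of ours) propagates an interval envelope one step ahead and raises an alarm if it intersects a critical region:

```python
# Toy envelope-based monitor for a scalar system x' = a*x + w, |w| <= w_max.
# All numbers are illustrative; a real monitor would use identified dynamics.

def step_envelope(lo: float, hi: float, a: float = 0.9, w_max: float = 0.5):
    """Propagate an interval [lo, hi] one step through x' = a*x + w (a > 0)."""
    return a * lo - w_max, a * hi + w_max

def overlaps_critical(lo: float, hi: float, crit_lo: float, crit_hi: float) -> bool:
    """True if the predicted envelope intersects the critical region."""
    return hi >= crit_lo and lo <= crit_hi

lo, hi = 1.0, 2.0                        # current state bounds from observation
lo, hi = step_envelope(lo, hi)           # one-step prediction envelope
alarm = overlaps_critical(lo, hi, 3.0, 10.0)
print(lo, hi, alarm)
```

RSS and ISM follow the same pattern at a higher level of sophistication: they bound the reachable states of surrounding traffic and trigger an intervention when the ego vehicle's envelope can reach an unsafe set.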
The evolution of envelopes over time can be predicted with models constructed using classical model-based approaches from control theory. One challenge with such approaches is that they are not always possible or convenient to use on real-world nonlinear systems, notably because they are computationally expensive. The construction of dynamic models purely based on data, or mixing data-driven and model-based approaches, is a topic of very active research. Diverse methods, from recent finite-data LTI identification techniques to Gaussian processes or kernel-based estimators, are available to address the problem by incorporating knowledge extracted from data. When it comes to using envelope-based models to monitor AI components, data-driven approaches derived from machine learning offer promising solutions for monitoring systems in real time, notably with applications in robotics and cyber-physical systems.
Proposed methodological approach
The challenge addressed in this thesis is to introduce machine learning techniques into a hybrid approach mixing data with models from control theory, in order to monitor the state of the system in real time. Beforehand, different types of anomaly profiles need to be formalized in order to capture the desired properties and trustworthiness guarantees. The goal is to develop a hybrid data-driven and model-based approach using envelope-based models to detect abnormal behavior, based on extrapolation, in a runtime monitoring system.
Experiments, use cases
We will investigate the use of such envelope-based models with data-driven approaches for single- and multi-step predictions. The goal will be to propose a methodological approach applicable in a runtime monitoring system.
It is planned to apply the results of this research to use cases proposed by the national Grand Défi Trust IA. One example among others is the monitoring of a demand-forecasting model based on multivariate time series, proposed by Air Liquide. A further direction for the thesis would be to subsequently apply the proposed approach to monitoring an AI component based on video streams.
- Conduct a comprehensive review of the state of the art of the relevant data-driven, rule-driven and hybrid (data-driven + rule-driven) methods that could be applied to monitoring applications.
- Characterize the different anomaly profiles of interest in the national Grand Défi Trust IA and formalize the associated properties to be monitored (define guarantees for trustworthiness).
- Benchmark the methods identified in the state of the art, notably approaches combining control theory and machine learning techniques, on at least one use case provided by the industrial partners in the Grand Défi Trust IA.
- Identify the strengths and limits of the benchmarked methods, and propose enhancements to existing methods or new hybrid methods to reach better monitoring results in terms of completeness of detection and performance criteria (response time, memory footprint, CPU usage, etc.). Special attention will be given to envelope-based models, which bound future possible states based on past observations, possibly associated with probabilities.
- Develop a hybrid monitor in relation to the use cases proposed by the national Grand Défi Trust IA.
-  B. Gassmann et al., ‘Towards Standardization of AV Safety: C++ Library for Responsibility Sensitive Safety’, in 2019 IEEE Intelligent Vehicles Symposium (IV), Jun. 2019, pp. 2265–2271. doi: 10.1109/IVS.2019.8813885.
-  J. L. Every, F. Barickman, J. Martin, S. Rao, S. Schnelle, and B. Weng, ‘A Novel Method to Evaluate the Safety of Highly Automated Vehicles’, presented at the 25th International Technical Conference on the Enhanced Safety of Vehicles (ESV), National Highway Traffic Safety Administration, 2017. [Online]. Available: https://trid.trb.org/view/1485370
-  A. Devonport, F. Yang, L. E. Ghaoui, and M. Arcak, ‘Data-Driven Reachability Analysis with Christoffel Functions’, arXiv:2104.13902 [cs, eess], Apr. 2021, [Online]. Available: http://arxiv.org/abs/2104.13902
-  A. Devonport and M. Arcak, ‘Data-Driven Reachable Set Computation using Adaptive Gaussian Process Classification and Monte Carlo Methods’, arXiv:1910.02500 [cs, eess], Oct. 2019, [Online]. Available: http://arxiv.org/abs/1910.02500
-  S. Haesaert, P. M. J. Van den Hof, and A. Abate, ‘Data-driven and model-based verification via Bayesian identification and reachability analysis’, Automatica, vol. 79, pp. 115–126, May 2017, doi: 10.1016/j.automatica.2017.01.037.
-  A. Alanwar, A. Koch, F. Allgöwer, and K. H. Johansson, ‘Data-Driven Reachability Analysis from Noisy Data’, arXiv:2105.07229 [cs, eess], May 2021, [Online]. Available: http://arxiv.org/abs/2105.07229
-  J. F. Fisac, A. K. Akametalu, M. N. Zeilinger, S. Kaynama, J. Gillula, and C. J. Tomlin, ‘A General Safety Framework for Learning-Based Control in Uncertain Robotic Systems’, arXiv:1705.01292 [cs], Feb. 2018, Accessed: Jun. 15, 2021. [Online]. Available: http://arxiv.org/abs/1705.01292
-  J. Coulson, J. Lygeros, and F. Dörfler, ‘Data-Enabled Predictive Control: In the Shallows of the DeePC’, arXiv:1811.05890 [math], Mar. 2019, Accessed: Jun. 16, 2021. [Online]. Available: http://arxiv.org/abs/1811.05890
-  K. Polymenakos et al., ‘Safety Guarantees for Planning Based on Iterative Gaussian Processes’, arXiv:1912.00071 [cs, stat], Sep. 2020, [Online]. Available: http://arxiv.org/abs/1912.00071
-  B. Beckermann, M. Putinar, E. B. Saff, and N. Stylianopoulos, ‘Perturbations of Christoffel-Darboux kernels. I: detection of outliers’, arXiv:1812.06560 [math], Apr. 2019, [Online]. Available: http://arxiv.org/abs/1812.06560
-  D. Nguyen-Tuong, J. Peters, and M. Seeger, ‘Local Gaussian Process Regression for Real Time Online Model Learning’, in Advances in Neural Information Processing Systems 21 (NIPS 2008).
-  R. Grbic, D. Sliskovic, and P. Kadlec, ‘Adaptive soft sensor for online prediction based on moving window Gaussian process regression’, in 2012 11th International Conference on Machine Learning and Applications, Boca Raton, FL, Dec. 2012, pp. 428–433. doi: 10.1109/ICMLA.2012.160.
-  R. E. Allen, A. A. Clark, J. A. Starek, and M. Pavone, ‘A machine learning approach for real-time reachability analysis’, in 2014 IEEE/RSJ International Conference on Intelligent Robots and Systems, Chicago, IL, USA, Sep. 2014, pp. 2202–2208. doi: 10.1109/IROS.2014.6942859.
Candidates must hold a master’s or engineering degree with a strong academic background in either control theory or machine learning, and should be ready to dive deep into the other domain.
Knowledge and know-how:
- Fundamentals of feedback control (Kalman filters, linear systems)
- Basic knowledge of statistics and probability theory
- Basics in any of the programming languages Python, C/C++, or Matlab
- Collaboration and teamwork
- Curiosity and proactivity
- Willingness to learn