“Prediction is very difficult, especially if it’s about the future!”
Niels Bohr, Nobel laureate
Modern condition-based maintenance (CBM) relies heavily on knowledge of the future state of an asset (prognostics and health management, PHM) [1]. This knowledge can reduce the risk of (serious) injury and operational disruption, save resources and enable predictive maintenance: the accurate planning and scheduling of maintenance that maximizes safety and asset availability while cutting costs. It is in view of this that airline carriers, MROs (Maintenance, Repair and Overhaul providers) and engine OEMs (Original Equipment Manufacturers) invest time and resources into transforming operational data into analytics [1,2,3,4], all in order to answer a simple question: how long will my asset continue to operate normally and safely? Or, put differently, what is its remaining useful life (RUL)? However simple this question may be, the answer has a lot of moving parts.
One important aspect of this problem is to define what “useful” means and what constitutes a failure. The next step is choosing an appropriate method to estimate the RUL, which is not a straightforward task: it depends on many factors, such as the type and quantity of the available data, the computational resources and other end-user constraints. But even after these aspects have been settled, an important question remains. Estimating the RUL alone is not enough; we also need to know how sure we are of our prediction. For example, how sure are we that our car will need a new alternator in 6 weeks? After all, when dealing with safety-critical and operation-critical questions we would like to be as sure as one can be about the future.
In general, there are two main sources of uncertainty that should be taken into account: epistemic uncertainty (also called knowledge, model or systematic uncertainty) and aleatoric uncertainty (also called data or statistical uncertainty) [5,6]. The former refers to uncertainty caused by a lack of knowledge, or ignorance. This can be due to a model that cannot incorporate the available information, or due to inadequate data and knowledge. In principle, this type of uncertainty can be reduced with more data, especially in regions where observations are sparse, provided adequate models are used. Aleatoric uncertainty, on the other hand, refers to the natural stochasticity of the observations. It is not a property of the modeling process but is inherent in the data-collection process (e.g., measurements). For example, a sensor has by construction a finite accuracy, which introduces uncertainty into the observations it produces; likewise, in coin flipping the data-generating process has a stochastic component that cannot be reduced by more information. Thus, epistemic uncertainty is the reducible part of the total uncertainty, and aleatoric uncertainty is the irreducible part. In Figure 1 we show a schematic representation that makes this distinction clear.
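One common way to make this split concrete for a probabilistic model with parameters θ (a standard decomposition in the uncertainty-quantification literature, in the spirit of [6], not something specific to our pipeline) is the law of total variance:

```latex
% Total predictive variance split into its two parts:
\operatorname{Var}(y \mid x)
  = \underbrace{\mathbb{E}_{\theta}\!\big[\operatorname{Var}(y \mid x, \theta)\big]}_{\text{aleatoric (irreducible)}}
  + \underbrace{\operatorname{Var}_{\theta}\!\big(\mathbb{E}[y \mid x, \theta]\big)}_{\text{epistemic (reducible)}}
```

As more data arrive, the distribution over θ concentrates and the second (epistemic) term shrinks, while the first (aleatoric) term remains.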

In CIMPLO we develop machine learning (ML) and artificial intelligence (AI) methods for predictive maintenance (data-driven methods). In this context, we research methods to account for uncertainty and incorporate them into our models. When we develop a predictive maintenance pipeline, we would like to know how confident we are in our predictions. For example, if we build a model to anticipate the failure of a gas turbofan engine, it is important to know whether the model is confident in its decision. It could be the case that the model has only been trained on specific operating conditions. In that case, if we pass an instance through the model that is “far away” from what it has “seen before”, we expect the model to give a prediction with a high degree of uncertainty. This contrasts with “feeding” the model a familiar instance, for which we expect a prediction with a very small degree of uncertainty. Given a prediction with high uncertainty, the end-user can, for instance, investigate that specific case further through alternative methods. There are various ways to account for uncertainty in ML and AI methods: Markov chain Monte Carlo (MCMC), variational inference (VI), Monte Carlo (MC) dropout, Bayesian deep learning (BDL)/Bayesian neural networks (BNNs), Bayesian active learning (BAL), Bayes by backpropagation (BBB), variational autoencoders (VAEs) and deep Gaussian processes (DGPs), to name a few [5]. The sketch below illustrates one of the simplest of these, MC dropout.
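As an illustration, here is a minimal sketch of MC dropout for a RUL-style regression, assuming PyTorch is available; the architecture, layer sizes, dropout rate and the 14 input features are illustrative assumptions, not a reference implementation from CIMPLO. The idea is to keep dropout active at prediction time and treat the spread of repeated stochastic forward passes as an uncertainty estimate:

```python
# A minimal sketch of Monte Carlo (MC) dropout, one of the methods listed above.
# Assumes PyTorch; the architecture, dropout rate and 14 input features are
# illustrative assumptions, not a reference implementation from CIMPLO.
import torch
import torch.nn as nn


class RULRegressor(nn.Module):
    """Toy RUL regressor whose dropout layers stay active at prediction time."""

    def __init__(self, n_features: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, 64),
            nn.ReLU(),
            nn.Dropout(p=0.2),  # kept active at prediction time for MC dropout
            nn.Linear(64, 64),
            nn.ReLU(),
            nn.Dropout(p=0.2),
            nn.Linear(64, 1),   # predicted remaining useful life
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)


@torch.no_grad()
def mc_dropout_predict(model: nn.Module, x: torch.Tensor, n_samples: int = 100):
    """Run n_samples stochastic forward passes with dropout enabled and return
    the predictive mean and standard deviation (an epistemic-uncertainty proxy)."""
    model.train()  # keeps dropout sampling; in real use, freeze batch-norm stats
    preds = torch.stack([model(x) for _ in range(n_samples)])  # (n_samples, batch, 1)
    return preds.mean(dim=0), preds.std(dim=0)


# Illustrative usage on random data (hypothetical sensor snapshots):
model = RULRegressor(n_features=14)
x = torch.randn(8, 14)  # a batch of 8 operating-condition snapshots
mean_rul, std_rul = mc_dropout_predict(model, x)
print(mean_rul.squeeze(), std_rul.squeeze())
```

A large standard deviation flags inputs the model has rarely “seen before”, which is exactly the case where the end-user may want to fall back on alternative methods.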
In conclusion, a trustworthy representation of uncertainty is pivotal and should be considered a key feature of any ML or AI method. This is even more the case in safety-critical and operation-critical domains such as aerospace, automotive, energy and medicine.
By: Marios Kefalas
References:
[1]: Nguyen, V. D., Kefalas, M., Yang, K., Apostolidis, A., Olhofer, M., and Limmer, S., “A Review: Prognostics and Health Management in Automotive and Aerospace”, International Journal of Prognostics and Health Management, 2019, p. 35.
[2]: “Predix Platform | Industrial Cloud Based Platform (PaaS)”, GE Digital, 2021. URL: https://www.ge.com/digital/iiot-platform.
[3]: “IntelligentEngine”, Rolls-Royce, 2021. URL: https://www.rolls-royce.com/products-and-services/civil-aerospace/intelligentengine.aspx.
[4]: “The MRO Lab – PROGNOS® Predictive Maintenance”, AFI KLM E&M, 2021. URL: https://www.afiklmem.com/en/solutions/about-prognos.
[5]: Abdar, M., Pourpanah, F., Hussain, S., Rezazadegan, D., Liu, L., Ghavamzadeh, M., Fieguth, P., Khosravi, A., Acharya, U. R., Makarenkov, V., et al., “A review of uncertainty quantification in deep learning: Techniques, applications and challenges,” arXiv preprint arXiv:2011.06225, 2020.
[6]: Hüllermeier, E., and Waegeman, W., “Aleatoric and epistemic uncertainty in machine learning: An introduction to concepts and methods,” arXiv preprint arXiv:1910.09457, 2020.