Terminology
Explainability or Interpretability?
There is no standard, generally accepted definition of either term, and the two are sometimes used interchangeably. Nonetheless, we believe there is a difference between them.
The Royal Society defines the two terms as follows:
Interpretability: implies some sense of understanding how the technology works
Explainability: implies that a wider range of users can understand why or how a conclusion was reached
In more general terms, when we speak about interpretability, we aim to understand exactly why and how the model generates its predictions by observing the inner mechanics of the AI/ML method. This usually implies that we are dealing with a glass-box model. When we speak about explainability, on the other hand, we focus on the decision-making process of a black-box model and try to explain its behaviour in human-understandable terms.
In this course, we will not cover intrinsic methods, that is, models that are interpretable by construction. Instead, we will focus on explainability, looking at some of the most popular post hoc methods.
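To make the distinction concrete, the following is a minimal sketch in plain Python, not taken from the course material. It contrasts a glass-box linear model, whose weights can be read directly, with a stand-in black-box function that we only allow ourselves to call. The black box is then probed post hoc with a simple permutation importance, chosen here purely for illustration. The names `GLASS_BOX_WEIGHTS`, `glass_box`, `black_box` and `permutation_importance`, as well as the toy data, are hypothetical.

```python
import random

# Glass box: the weights are visible, so the model explains itself.
# These values are made up for illustration.
GLASS_BOX_WEIGHTS = [2.0, 0.0, -1.0]


def glass_box(x):
    """Linear model: each weight is the effect of one feature."""
    return sum(w * xi for w, xi in zip(GLASS_BOX_WEIGHTS, x))


def black_box(x):
    """Stand-in for an opaque model; we pretend we can only call it.

    Internally it ignores feature 1, but a post hoc method has to
    discover that from inputs and outputs alone.
    """
    return 3.0 * x[0] * x[0] + 0.5 * x[2]


def permutation_importance(model, X, y, feature, rng):
    """Increase in mean squared error after shuffling one feature column."""

    def mse(rows):
        return sum((model(r) - t) ** 2 for r, t in zip(rows, y)) / len(y)

    baseline = mse(X)
    column = [row[feature] for row in X]
    rng.shuffle(column)
    X_perm = [
        row[:feature] + [value] + row[feature + 1:]
        for row, value in zip(X, column)
    ]
    return mse(X_perm) - baseline


rng = random.Random(0)  # fixed seed so the sketch is reproducible
X = [[rng.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(200)]
y = [black_box(row) for row in X]  # targets equal the model's own outputs

importances = [
    permutation_importance(black_box, X, y, j, rng) for j in range(3)
]
print("glass-box weights:", GLASS_BOX_WEIGHTS)
print("black-box permutation importances:", importances)
```

For the glass box, interpretation amounts to reading `GLASS_BOX_WEIGHTS`. For the black box, the explanation is built after the fact from model queries only: shuffling feature 1 leaves the error unchanged, which suggests the model does not use it, while shuffling feature 0 degrades the predictions the most.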
References
The Royal Society. Explainable AI: The basics. Policy Briefing. 2019.