"As machine learning becomes ubiquitous and is used for applications with more serious consequences, there's a need for people to understand how it's making predictions so they'll trust it when it's doing more than serving up an advertisement," says Jonathan Su, a member of the technical staff in MIT Lincoln Laboratory's Informatics and Decision Support Group.
Currently, researchers use either post hoc techniques or an interpretable model such as a decision tree to explain how a black-box model reaches its conclusion. With post hoc techniques, researchers observe an algorithm's inputs and outputs and then try to construct an approximate explanation for what happened inside the black box. The issue with this method is that researchers can only guess at the inner workings, and the explanations can often be wrong. Decision trees, which map choices and their potential consequences in a tree-like construction, work nicely for categorical data whose features are meaningful, but these trees are not interpretable in important domains, such as computer vision and other complex data problems.
A neural network is a computing system composed of many interconnected processing elements. These networks are typically used for image analysis and object recognition. For instance, an algorithm can be taught to recognize whether a photograph includes a dog by first being shown photos of dogs. Researchers say the problem with these neural networks is that their functions are nonlinear and recursive, as well as complicated and confusing to humans, and the end result is that it is difficult to pinpoint what exactly the network has defined as "dogness" within the photos and what led it to that conclusion.
To address this problem, the team is developing what it calls "prototype neural networks." These are different from traditional neural networks in that they naturally encode explanations for each of their predictions by creating prototypes, which are particularly representative parts of an input image. These networks make their predictions based on the similarity of parts of the input image to each prototype.
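The idea of predicting from similarity to prototypes can be illustrated with a small sketch. This is not the team's implementation: the function name `prototype_predict`, the exponential similarity measure, and the final linear scoring step are all assumptions chosen for illustration, and the patch features and prototypes are taken as already computed rather than learned.

```python
import numpy as np


def prototype_predict(patch_features, prototypes, class_weights):
    """Illustrative prototype-based prediction (hypothetical helper).

    patch_features: (n_patches, d) array, one feature vector per image part
    prototypes:     (n_prototypes, d) array of prototype vectors
    class_weights:  (n_classes, n_prototypes) array linking prototypes to classes
    """
    # Squared Euclidean distance between every patch and every prototype.
    diffs = patch_features[:, None, :] - prototypes[None, :, :]
    sq_dists = (diffs ** 2).sum(axis=2)  # shape: (n_patches, n_prototypes)

    # Assumed similarity measure: close patches score near 1, distant ones near 0.
    similarities = np.exp(-sq_dists)

    # For each prototype, keep the best-matching patch and its similarity.
    best_patch = similarities.argmax(axis=0)
    proto_scores = similarities.max(axis=0)

    # Class evidence is a weighted sum of prototype similarities.
    class_scores = class_weights @ proto_scores
    return int(class_scores.argmax()), proto_scores, best_patch
```

Because the function returns the best-matching patch for each prototype along with the predicted class, a user could in principle see which part of the image resembled which prototype, which is the sort of built-in explanation the paragraph above describes.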
The other area the research team is investigating is Bayesian rule lists (BRLs), which are less-complicated, one-sided decision trees that are suitable for tabular data and often as accurate as other models. BRLs are made of a sequence of conditional statements that naturally form an interpretable model. For example, if blood pressure is high, then risk of heart disease is high. Su and colleagues are using properties of BRLs to enable users to indicate which features are important for a prediction. They are also developing interactive BRLs, which can be adapted immediately when new data arrive rather than recalibrated from scratch on an ever-growing dataset.
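A rule list's sequence of conditional statements can be pictured as an ordered chain of if-then checks in which the first matching rule decides the outcome. The sketch below shows only that structure, not how the rules are learned or adapted. The field names (`systolic_bp`, `age`, `smoker`), the thresholds, and the second rule are hypothetical and are not drawn from the team's models or from clinical guidance.

```python
from typing import Callable, Dict, List, Tuple

# Each rule: (human-readable description, condition, predicted risk).
Rule = Tuple[str, Callable[[Dict[str, float]], bool], str]

# Hypothetical rules and thresholds, for illustration only.
RULES: List[Rule] = [
    ("blood pressure is high", lambda p: p["systolic_bp"] >= 140, "high"),
    ("over 60 and smokes", lambda p: p["age"] > 60 and p["smoker"] == 1, "high"),
]
DEFAULT_RISK = "low"


def predict_risk(patient: Dict[str, float]) -> Tuple[str, str]:
    """Return (risk, reason) from the first rule whose condition holds."""
    for description, condition, risk in RULES:
        if condition(patient):
            return risk, description
    # No rule fired, so fall through to the default at the end of the list.
    return DEFAULT_RISK, "no rule matched (default)"
```

Every prediction comes with the rule that produced it, so the explanation is the model itself rather than an after-the-fact approximation.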
Su explains: "We're hoping to build a new strategic capability for the laboratory—machine learning algorithms that people trust because they understand them."