Interpretable deep learning for liver tumor diagnosis

Paper (European Radiology) | Paper (SPIE Medical Imaging) | Code

Hepatic lesion classification is a challenging task that requires identifying subtle radiological features. A convolutional neural network (CNN) can achieve performance comparable to that of radiologists on this task, but how can we interpret what it looks at?

The difference between malignant and benign lesions on liver MRI often comes down to fairly subtle image features.

By asking radiologists or scraping reports, we can find a collection of radiological features that are commonly used to support a particular diagnosis.

We determine whether our CNN pays attention to each of these radiological features by examining the distribution of CNN activations. We can use the correlation between CNN activations and each radiological feature as a proxy for the degree to which the network is able to detect that feature. The influence of those activations on the predicted class tells us how much weight the CNN gives that feature in making its prediction.
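As a rough illustration of the two quantities described above, the sketch below correlates each unit's activation with a binary feature annotation, then estimates how much the most correlated units shift one class logit. It is a simplified stand-in, not the method from the papers: the data is synthetic, the function names are made up for this example, and it assumes the chosen layer feeds a linear classifier with weights `class_weights`.

```python
import numpy as np

def feature_activation_correlation(activations, has_feature):
    """Pearson correlation of every unit's activation with one binary feature label.

    activations: (n_lesions, n_units) array taken from an intermediate CNN layer.
    has_feature: (n_lesions,) array of 0/1 radiologist annotations.
    """
    a = activations - activations.mean(axis=0)
    f = has_feature - has_feature.mean()
    denom = np.sqrt((a ** 2).sum(axis=0) * (f ** 2).sum())
    return (a * f[:, None]).sum(axis=0) / np.maximum(denom, 1e-12)

def feature_influence(activations, has_feature, class_weights, top_k=3):
    """Approximate shift in one class logit explained by the feature's top units.

    Assumes (hypothetically) that the layer feeds a linear classifier whose
    weights for the class of interest are class_weights, shape (n_units,).
    """
    corr = feature_activation_correlation(activations, has_feature)
    top_units = np.argsort(-np.abs(corr))[:top_k]
    present = activations[has_feature == 1][:, top_units].mean(axis=0)
    absent = activations[has_feature == 0][:, top_units].mean(axis=0)
    return float(((present - absent) * class_weights[top_units]).sum())

# Synthetic stand-in data: unit 3 is built to respond to the feature.
rng = np.random.default_rng(0)
has_feature = rng.integers(0, 2, size=200)
activations = rng.normal(size=(200, 16))
activations[:, 3] += 2.0 * has_feature
class_weights = np.zeros(16)
class_weights[3] = 1.0  # hypothetical final-layer weights for one class

corr = feature_activation_correlation(activations, has_feature)
influence = feature_influence(activations, has_feature, class_weights)
print("most correlated unit:", int(np.argmax(np.abs(corr))))
print("estimated logit shift:", round(influence, 2))
```

A high correlation with a near-zero influence would suggest the network detects the feature but does not rely on it for that class.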

Fitting a logistic regression on the CNN activations to predict each radiological feature tells us which features the CNN detects well and which it struggles to detect.

Predictive power for radiological features reveals a model's strengths and weaknesses.
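The logistic regression step can be sketched as a simple probe: for each radiological feature, fit a classifier on the activations and measure held-out accuracy. This is a minimal NumPy version under stated assumptions. The gradient-descent fit, the 70/30 split, and the names `feature_a` and `feature_b` are illustrative choices, and the data is synthetic.

```python
import numpy as np

def fit_logistic(X, y, lr=0.1, n_steps=500, l2=1e-2):
    """Fit L2-regularized logistic regression by batch gradient descent."""
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(n_steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
        w -= lr * (X.T @ (p - y) / n + l2 * w)
        b -= lr * (p - y).mean()
    return w, b

def held_out_accuracy(X, y, train_frac=0.7, seed=0):
    """Train on a random split of the lesions and score on the remainder."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(y))
    n_train = int(train_frac * len(y))
    train, test = order[:n_train], order[n_train:]
    # Standardize with training statistics so gradient descent behaves well.
    mu, sd = X[train].mean(axis=0), X[train].std(axis=0) + 1e-8
    w, b = fit_logistic((X[train] - mu) / sd, y[train])
    predicted = ((X[test] - mu) / sd) @ w + b > 0
    return float((predicted == (y[test] == 1)).mean())

# Synthetic stand-in for CNN activations and feature annotations.
rng = np.random.default_rng(1)
n = 300
activations = rng.normal(size=(n, 8))
feature_labels = {
    "feature_a": rng.integers(0, 2, size=n),  # encoded in unit 0 below
    "feature_b": rng.integers(0, 2, size=n),  # not encoded in any unit
}
activations[:, 0] += 3.0 * feature_labels["feature_a"]

scores = {name: held_out_accuracy(activations, y)
          for name, y in feature_labels.items()}
for name, acc in scores.items():
    print(f"{name}: held-out accuracy {acc:.2f}")
```

Features whose probe accuracy stays near chance are the ones the activations do not encode, which is one way to read a "weakness" of the model.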

This method also helps flag mistakes in the model. When the CNN activations point to radiological features that are consistent with the model’s prediction, we find that the prediction is more likely to be correct.

Bad explanations often accompany wrong diagnoses.
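One way to check the link between explanations and errors is to split cases by whether the detected features match what the predicted class is expected to show, then compare accuracy across the two groups. The sketch below assumes a lookup table of expected features per class. The class names, feature names, and the four toy cases are placeholders for illustration, not data or results from the papers.

```python
# Hypothetical lookup: which radiological features support which diagnosis.
EXPECTED_FEATURES = {
    "lesion_type_A": {"feature_1", "feature_2"},
    "lesion_type_B": {"feature_3", "feature_4"},
}

def explanation_is_consistent(predicted_class, detected_features, min_overlap=1):
    """True if enough detected features support the predicted class."""
    expected = EXPECTED_FEATURES[predicted_class]
    return len(expected & set(detected_features)) >= min_overlap

def accuracy_by_consistency(cases):
    """Return accuracy for consistent (True) and inconsistent (False) cases."""
    counts = {True: [0, 0], False: [0, 0]}  # key -> [n_correct, n_total]
    for case in cases:
        key = explanation_is_consistent(case["predicted"], case["detected"])
        counts[key][0] += int(case["predicted"] == case["true"])
        counts[key][1] += 1
    return {k: (c / n if n else None) for k, (c, n) in counts.items()}

# Toy cases, hand-written to exercise both branches.
cases = [
    {"predicted": "lesion_type_A", "true": "lesion_type_A", "detected": ["feature_1"]},
    {"predicted": "lesion_type_A", "true": "lesion_type_A", "detected": ["feature_2", "feature_3"]},
    {"predicted": "lesion_type_B", "true": "lesion_type_A", "detected": ["feature_1"]},
    {"predicted": "lesion_type_B", "true": "lesion_type_B", "detected": ["feature_2"]},
]

result = accuracy_by_consistency(cases)
print("accuracy when explanation is consistent:", result[True])
print("accuracy when explanation is inconsistent:", result[False])
```

In practice, an inconsistent explanation could be used as a cue to route a case for closer human review.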

We can combine this method with other interpretability techniques such as saliency maps to give a holistic picture of what the CNN pays attention to.

Read our papers in European Radiology and SPIE Medical Imaging for more details, or check out our code.


  title={A probabilistic approach for interpretable deep learning in liver cancer diagnosis},
  author={Wang, Clinton J and Hamm, Charlie A and Letzen, Brian S and Duncan, James S},
  booktitle={Medical Imaging 2019: Computer-Aided Diagnosis},