Machine Learning in Medical Imaging and Analysis


By Pawel Godula, Director of Customer Analytics,

Machine learning is useful in many medical disciplines that rely heavily on imaging, including radiology, oncology and radiation therapy.

According to IBM estimates, images currently account for up to 90% of all medical data. Thanks to recent advances, image recognition, especially when combined with transfer learning from networks pre-trained on the ImageNet dataset, offers interesting possibilities for supporting medical procedures and treatment.

Artificial intelligence startups are being acquired at an increasing rate, while the value of AI healthcare-related equipment is also growing rapidly. As Accenture estimates show, the market is set to register an astonishing compound annual growth rate (CAGR) of 40% through 2021. Meanwhile, the market value of AI in healthcare is projected to skyrocket from $600M in 2014 to $6.6B in 2021.
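As a quick sanity check, the growth rate implied by the two market values quoted above can be computed directly. This is a minimal sketch: the function name `cagr` is our own, and the inputs are simply the figures from the paragraph.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate as a fraction (0.40 means 40%)."""
    return (end_value / start_value) ** (1 / years) - 1


# Figures quoted above: $600M in 2014 growing to $6.6B in 2021 (7 years).
implied = cagr(start_value=0.6e9, end_value=6.6e9, years=2021 - 2014)
print(f"Implied CAGR: {implied:.1%}")
```

The result comes out at roughly 41% per year, which is in line with the 40% CAGR quoted.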

Automated image diagnosis in healthcare is estimated to bring in up to $3B. Unlike many improvements that have been made in healthcare, AI promises to help hold down healthcare costs. It can tackle common image-related challenges and automate data-heavy techniques, which are usually both time-consuming and expensive.

Data labelling and a skills gap

One of the most significant challenges in image recognition is the labor-intensive data labelling that precedes the building of any new image recognition model. See our recent blog post concerning transfer learning.

Fortunately, some medical image data is spared this burden. Radiological descriptions, for example, are standardized, which makes them a golden format for applying machine learning algorithms: the data is already labeled and order is enforced within the dataset. A challenge in modern radiology is to use machine learning to automatically interpret medical images and describe what they show. However, as the history of ImageNet shows, providing a properly labeled dataset is the first step in building modern image recognition solutions.

According to the American Journal of Roentgenology, if machine learning is to be applied successfully in radiology, radiologists will have to extend their knowledge of statistics and data science, including common algorithms, supervised and unsupervised techniques and statistical pitfalls, to supervise and correctly interpret ML-derived results. Companies that can handle the data science side of the equation, including teaching it, will be among the best placed to close this skills gap.

The rise of radiogenomics

Combining different types of imaging data with genetic data could bring about better diagnostics and therapy – and potentially be used to uncover the biology of cancer. The new discipline of radiogenomics connects images with gene expression patterns and provides methods to map between modalities. The paper entitled "Decoding tumour phenotype by noninvasive imaging using a quantitative radiomics approach" describes an example of the process.

Interestingly, both image recognition (IR) and natural language processing (NLP) techniques can be used to analyze genetic data. Image recognition can be applied when the genomic data is presented as a one-dimensional picture in which a color represents each gene. The algorithms used are similar to those in any other image recognition approach. As machine learning models treat size, among other factors, as irrelevant, the models may turn out to be similar to the one described in our recent blog post. NLP is used when the genes are represented by letters. While it is inferior to image recognition at looking for patterns and general analysis, NLP is better at seeing “the bigger picture” and finding longer patterns present in larger sequences of genes.
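The two representations described above can be sketched on a toy sequence. Everything here is an illustrative assumption rather than something taken from a specific study: the four-letter alphabet `ACGT`, the function names `one_hot_encode` and `kmer_tokens`, and the choice of three-letter "words".

```python
from typing import List

ALPHABET = "ACGT"  # assumed alphabet for this toy example


def one_hot_encode(sequence: str) -> List[List[int]]:
    """Image-style view: one row per position, one 'color channel' per letter."""
    index = {letter: i for i, letter in enumerate(ALPHABET)}
    encoded = []
    for letter in sequence:
        row = [0] * len(ALPHABET)
        row[index[letter]] = 1
        encoded.append(row)
    return encoded


def kmer_tokens(sequence: str, k: int = 3) -> List[str]:
    """Text-style view: overlapping k-letter 'words' an NLP model could consume."""
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


sequence = "GATTACA"
pixels = one_hot_encode(sequence)   # 7 positions x 4 channels
words = kmer_tokens(sequence, k=3)  # ['GAT', 'ATT', 'TTA', 'TAC', 'ACA']
```

The first view can be fed to a one-dimensional convolutional model in the same way as a row of pixels, while the second produces a token stream of the kind text models expect.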

Machine learning in precision radiation oncology

Radiogenomics is also an emerging discipline in precision radiation oncology. Machine learning approaches can be used to study the impact of genomic variations on the sensitivity of normal and tumor tissue to radiation.

Radiation oncology is particularly well suited for applying machine learning approaches due to the enormous amount of standardized data gathered in time series. Radiotherapy involves several stages encompassing the entire oncological treatment:

  1. patient assessment,
  2. simulation, planning,
  3. quality assurance,
  4. treatment delivery,
  5. follow-up

All these stages can be supported and enhanced with machine learning. Tumors may have subregions that differ in biology, genetics and response to treatment. Thus, it is crucial to identify regions on the images that can be irradiated with lower doses, making the therapy more precise and less toxic.

Building medical image databases – a challenge to overcome

Having access to proper datasets is a challenge to be tackled in medical image analysis. To gain insight into the mechanism and biology of a disease, and to build diagnostic and therapeutic strategies with machine learning, datasets that include both imaging data and related genetic data are needed.

According to Advances in Radiation Oncology, there are numerous databases and datasets containing healthcare data, yet they are not interconnected. Obtaining high-quality datasets containing medical data is quite a challenge, and very few such datasets are available. A collection containing images from 89 non-small cell lung cancer (NSCLC) patients who were treated with surgery is one of the few examples. For those patients, pretreatment CT scans, gene expression, and clinical data are available.

Efforts to build proper databases to support analysis of imaging data are being made. ePAD is a freely available quantitative imaging informatics platform, developed at the Stanford Medicine Radiology Department. Thanks to its plug-in architecture, ePAD can be used to support a wide range of imaging-based projects. Also, The Cancer Imaging Archive (TCIA) is a service that hosts a large number of publicly available medical images of cancer. The data are organized as collections defined by:

  • patients related by a common disease,
  • image modality (MRI, CT, etc.),
  • research focus.

Advances have already been made in histological image analysis and its clinical interpretation. work has proved that it is possible to accurately analyze and interpret the medical images in diabetic retinopathy diagnosis. built its model in cooperation with California Healthcare Foundation and a dataset consisting of 35,000 images provided by EyePACS.

Techniques of this kind are becoming more common. For example, a machine learning approach has revealed latent vascular phenotypes predictive of renal cancer outcome, based on analysis of vessels in histological images. Vascular phenotype is related to the biology of cancer: the formation of new vessels is a kind of predictive biomarker of a cancer's potential to develop.

Modern equipment

The effectiveness of machine learning in medical image analysis is hampered by two challenges:

  • heterogeneous raw data
  • relatively small sample size

For prostate cancer diagnosis, these two challenges can be conquered by using a tailored deep CNN architecture and performing an end-to-end training on 3D multiparametric MRI images with proper data preprocessing and data augmentation.
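To make the data augmentation step concrete, here is a minimal sketch of one of the simplest augmentations, random mirroring of a 3D volume. It is not the pipeline from any particular study: the names `flip_left_right` and `augment`, the nested-list representation and the 0.5 flip probability are all assumptions made for illustration, and whether mirroring is anatomically appropriate depends on the task.

```python
import random
from typing import List

Volume = List[List[List[float]]]  # indexed as volume[slice][row][column]


def flip_left_right(volume: Volume) -> Volume:
    """Mirror every slice horizontally by reversing each row."""
    return [[row[::-1] for row in image] for image in volume]


def augment(volume: Volume, rng: random.Random, flip_probability: float = 0.5) -> Volume:
    """Return either the original volume or a mirrored copy, chosen at random."""
    if rng.random() < flip_probability:
        return flip_left_right(volume)
    return volume


# A tiny 2 x 2 x 3 stand-in for an MRI volume.
volume = [
    [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]],
    [[7.0, 8.0, 9.0], [10.0, 11.0, 12.0]],
]
rng = random.Random(0)  # fixed seed so the example is reproducible
augmented = [augment(volume, rng) for _ in range(4)]  # four training variants
```

In practice an array library would replace the nested lists, and flips would be combined with other transformations such as small rotations or intensity shifts, but the idea is the same: each pass over a small dataset sees slightly different versions of the same scans.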

Attempts have been made to apply machine learning image analysis in clinical practice, and studies show that numerous clinical use cases could be supported with machine learning. For example, on the basis of the MURA dataset from the Stanford ML Group, it has been shown that a baseline model's performance in detecting abnormalities on finger and wrist studies is on a par with the performance of radiologists. However, the baseline performance of convolutional networks is lower than that of the best radiologists in detecting abnormalities on the elbow, forearm, hand, humerus, and shoulder.
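One common way to put a model and a radiologist on the same scale is a chance-corrected agreement statistic such as Cohen's kappa, computed for each against the reference labels. The sketch below assumes that choice of metric; the function name `cohens_kappa` and the toy labels are made up for illustration and are not results from the dataset.

```python
from typing import Sequence


def cohens_kappa(labels_a: Sequence[int], labels_b: Sequence[int]) -> float:
    """Agreement between two label sequences, corrected for chance agreement."""
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    categories = set(labels_a) | set(labels_b)
    # Agreement expected if both raters labeled at random with their own base rates.
    expected = sum(
        (list(labels_a).count(c) / n) * (list(labels_b).count(c) / n)
        for c in categories
    )
    if expected == 1:
        return 1.0  # only one category in use: agreement is trivially perfect
    return (observed - expected) / (1 - expected)


# Toy labels: 1 = abnormal study, 0 = normal study.
reference = [1, 1, 0, 0, 1, 0, 1, 0]
model = [1, 0, 0, 0, 1, 0, 1, 1]
kappa = cohens_kappa(model, reference)  # 0.5 for these toy labels
```

A kappa of 1 means perfect agreement with the reference and 0 means agreement no better than chance, so the same number can be computed for the model and for each radiologist and compared directly.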

Numerous cases, including’s right whale recognition system, show that it is possible to tune a model enough to perform well on a limited dataset. Thus, the prospects for building models that outperform human doctors in detecting abnormalities are tantalizing.

An interesting practical example comes from the paper "A deep convolutional neural network-based automatic delineation strategy for multiple brain metastases stereotactic radiosurgery". Precise delineation of brain metastasis targets is a key step in efficient stereotactic radiosurgery treatment planning. In the paper, an algorithm was used to segment brain metastases on contrast-enhanced magnetic resonance imaging datasets. Developing tools to support the delineation of critical organs could save medical doctors a lot of time.
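Automatic delineations are typically judged by how well they overlap with a contour drawn by an expert. A widely used overlap measure is the Dice coefficient, sketched below on a tiny 2D mask. The choice of metric, the name `dice_coefficient` and the toy masks are our assumptions for illustration, not details taken from the paper.

```python
from typing import List

Mask = List[List[int]]  # 1 = metastasis, 0 = background


def dice_coefficient(predicted: Mask, reference: Mask) -> float:
    """Dice overlap: 2 * |A and B| / (|A| + |B|), from 0 (disjoint) to 1 (identical)."""
    intersection = 0
    predicted_total = 0
    reference_total = 0
    for predicted_row, reference_row in zip(predicted, reference):
        for p, r in zip(predicted_row, reference_row):
            intersection += p * r
            predicted_total += p
            reference_total += r
    if predicted_total + reference_total == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    return 2 * intersection / (predicted_total + reference_total)


# Expert contour versus a model output that is shifted by one pixel in one row.
reference = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
predicted = [
    [0, 1, 1, 0],
    [0, 0, 1, 1],
    [0, 0, 0, 0],
]
score = dice_coefficient(predicted, reference)  # 0.75 for these masks
```

Here three of the four marked pixels coincide, giving 2 * 3 / (4 + 4) = 0.75. The same calculation extends to 3D volumes by summing over slices as well.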

Summary – future savings with AI

Applying AI in medical image analysis brings significant advantages, including lower costs and further steps towards automating the diagnosis process. According to The Lancet, global healthcare spending is predicted to increase from $9.21 trillion in 2014 to $24.24 trillion in 2040. The spending is predicted to increase both in developing countries due to improving access to medical treatment, and in developed countries facing the challenge of providing care for their aging populations.

As a business, healthcare is unique because its provision is not measured solely by income. Potential savings and the ability to provide treatment for larger groups of people are better measures of the importance of AI to healthcare. According to Healthcare Global, AI is predicted to bring up to $52 billion in savings by 2021, enabling care providers to manage their resources better. A significant part will come from leveraging image recognition, as earlier diagnosis translates into lower treatment costs and greater patient well-being, as was clearly shown in this WHO study.

In sum, advances in image recognition combined with advances in AI should help the healthcare system achieve greater efficiencies moving forward.
