Drug Discovery Accelerated By Using Deep Neural Networks
A recent Intel collaboration with Novartis on using deep neural networks to accelerate high content screening, a key element of early drug discovery, has cut the time to train image analysis models from 11 hours to 31 minutes, an improvement of more than 20 times.
In the medical field, one hope for deep learning is that the image features that distinguish one treatment from another can be learned "automatically" from the data. By applying deep neural network acceleration, biologists and data scientists at Intel and Novartis hope to speed up the analysis of high content imaging screens.
The work has been published in the journal Nature Methods.
In this joint work, the team is focusing on whole microscopy images as opposed to using a separate process to identify each cell in an image first. Whole microscopy images can be much larger and higher resolution than those typically found in deep learning datasets.
The images used in this evaluation are more than 26 times larger than those typically used from the well-known ImageNet dataset of animals, objects and scenes.
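As a rough check on that ratio, the arithmetic can be sketched as follows. The image shapes are assumptions, not figures from the article: a 1024 x 1280 microscopy image with three channels (which would match the 3.9 million values per image quoted for this dataset) and the common 224 x 224 three-channel ImageNet training crop.

```python
# Assumed image shapes (height, width, channels); neither is stated in the article.
MICROSCOPY_SHAPE = (1024, 1280, 3)  # assumption for one microscopy image
IMAGENET_SHAPE = (224, 224, 3)      # assumption: common ImageNet training crop


def num_values(shape):
    """Total number of pixel values: height * width * channels."""
    height, width, channels = shape
    return height * width * channels


microscopy_values = num_values(MICROSCOPY_SHAPE)
imagenet_values = num_values(IMAGENET_SHAPE)
ratio = microscopy_values / imagenet_values

print(f"{microscopy_values:,} vs {imagenet_values:,} values per image")
print(f"ratio: {ratio:.1f}x")
```

Under these assumed shapes the ratio comes out slightly above 26, in line with the figure quoted above.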
Deep convolutional neural network (DNN) models for analyzing microscopy images typically work with millions of pixels per image, millions of model parameters and possibly thousands of training images at a time. That constitutes a high computational load. Even with advanced computational capabilities on existing computing infrastructure, deeper exploration of DNN models can be prohibitively time-consuming.
To solve these challenges, the collaboration is applying deep neural network acceleration techniques to process multiple images in significantly less time while extracting greater insight from image features that the model ultimately learns.
The collaboration team, with representatives from Novartis and Intel, has shown a more than 20-fold improvement in the time to process a dataset of 10K images for training. Using the Broad Bioimage Benchmark Collection 021 (BBBC-021) dataset, the team has achieved a total processing time of 31 minutes with over 99 percent accuracy.
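The speedup follows directly from the two training times the article quotes (11 hours before, 31 minutes after). A minimal sketch of the arithmetic, with variable names chosen purely for illustration:

```python
# Training times quoted in the article.
baseline_minutes = 11 * 60   # 11 hours
accelerated_minutes = 31

speedup = baseline_minutes / accelerated_minutes
print(f"speedup: {speedup:.1f}x")
```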
For this result, the team used eight CPU-based servers, a high-speed fabric interconnect, and optimized TensorFlow. By exploiting the fundamental principle of data parallelism in deep learning training and fully utilizing the large memory support of the server platform, the team was able to scale to more than 120 images per second, each 3.9 megapixels, with 32 TensorFlow workers.
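The data parallelism principle mentioned above can be illustrated with a toy example. This is a sketch only, not the team's TensorFlow setup: the "workers" are plain loop iterations, the model is a hypothetical one-parameter linear fit, and `all_reduce_mean` is an illustrative stand-in for the gradient averaging that a real distributed framework performs across servers. Each worker computes a gradient on its own shard of the batch, the gradients are averaged, and every worker applies the same update.

```python
# Synchronous data-parallel SGD on a toy model y = w * x.
# Pure-Python stand-in: "workers" are loop iterations, not real processes.

NUM_WORKERS = 32  # the article reports 32 TensorFlow workers


def shard(data, num_workers):
    """Split data into num_workers equal-sized contiguous shards."""
    size = len(data) // num_workers
    return [data[i * size:(i + 1) * size] for i in range(num_workers)]


def local_gradient(w, samples):
    """Mean gradient of 0.5 * (w*x - y)**2 over one worker's shard."""
    return sum((w * x - y) * x for x, y in samples) / len(samples)


def all_reduce_mean(values):
    """Hypothetical stand-in for averaging gradients across workers."""
    return sum(values) / len(values)


def train_step(w, batch, lr=1.0):
    """One synchronous step: local gradients, average, shared update."""
    grads = [local_gradient(w, s) for s in shard(batch, NUM_WORKERS)]
    return w - lr * all_reduce_mean(grads)


# Toy data generated from y = 3x: 128 samples, 4 per worker.
batch = [(k / 128, 3 * k / 128) for k in range(1, 129)]

w = 0.0
for _ in range(50):
    w = train_step(w, batch)
print(f"learned w = {w:.6f}")
```

Because the shards are equal in size, the averaged gradient equals the gradient over the full batch, so adding workers changes how the work is divided rather than what the model learns at each step.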
Supervised deep learning methods are essential to accelerating image classification and speeding time to insight. However, current deep learning methods depend on large expert-labeled datasets to train the models. The time and manual effort necessary to create such datasets is often prohibitive. Unsupervised deep learning methods, which may be applied to unlabeled microscopy images, may hold the promise of revealing novel insights for cellular biology and ultimately drug discovery. This will be the focus of continuing efforts in the future.