by Aparna Nathan

Hospitals are churning out medical data at an unprecedented rate. In 2013, 153 billion gigabytes of health care data were produced, and we’re expected to reach 2,300 billion gigabytes per year by 2020. That’s almost 9 billion MacBooks’ worth of storage each year, not even counting the hundreds of thousands of genomes sequenced each year. It’s more than a human can process, but not too much for computers.

Computational biomedicine is a new player in the world of medicine. Broadly speaking, it is the application of computational methods to aid in the diagnosis and treatment of disease. Take, for example, a patient in a hospital. At every step, from diagnosis to treatment to follow-up, there is potential for computers to provide information that a health care provider cannot. But don’t worry, humans aren’t replaceable yet.

Step 1: A doctor-less diagnosis?

Let’s say you wake up one morning and notice a new mole on your arm. Your first thought? It might be cancer. You’d likely hurry into your doctor’s office seeking a professional opinion.

As a patient, you can only observe your symptoms, but a more definitive diagnosis requires someone with access to specialized knowledge. Not just textbook-level medical information either — you can get that on WebMD. A doctor’s unique asset is prior experience. As a patient, you hope that the doctor can recognize patterns from previous patients and make a prediction for your own health. Using past observations to predict future outcomes is just the kind of task that machine learning was made to perform.

Machine learning is a branch of artificial intelligence, the use of computers to carry out tasks that humans usually do. In many machine learning applications, including diagnosis, the task at hand is classification (Figure 1). Computers classify new data through a process of pattern recognition, just like doctors making diagnoses based on having seen the same set of symptoms before.

Figure 1: Machine learning for diagnosis. Machine learning is an iterative process. First, the algorithm uses a training dataset with known categories to learn which features can be used to predict classifications. Then, this algorithm is used to classify a set of test data and measure the method’s accuracy. Finally, the trained and tested algorithm is ready to be applied to real data.
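For readers who want to see the train, test, and apply steps in action, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the lesion data are randomly generated, the two features (diameter and border irregularity) are hypothetical, and the "nearest centroid" rule is a deliberately simple stand-in, not the method used in any study mentioned in this article.

```python
import random

def make_lesions(n, seed):
    """Generate n synthetic (features, label) pairs; purely illustrative data."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        label = rng.choice(["benign", "malignant"])
        # Hypothetical features: lesion diameter (mm) and border irregularity (0-1).
        if label == "malignant":
            features = (rng.gauss(9.0, 1.5), rng.gauss(0.7, 0.1))
        else:
            features = (rng.gauss(3.0, 1.5), rng.gauss(0.3, 0.1))
        data.append((features, label))
    return data

def train(examples):
    """Step 1: learn one centroid (mean feature vector) per known category."""
    sums, counts = {}, {}
    for features, label in examples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: tuple(s / counts[label] for s in sums[label]) for label in sums}

def classify(model, features):
    """Assign the category whose centroid is closest to the new example."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(model, key=lambda label: sq_dist(model[label]))

def accuracy(model, examples):
    """Step 2: measure the fraction of held-out examples classified correctly."""
    correct = sum(classify(model, f) == y for f, y in examples)
    return correct / len(examples)

training_set = make_lesions(200, seed=0)   # examples with known categories
test_set = make_lesions(100, seed=1)       # separate examples, used only for scoring

model = train(training_set)
test_accuracy = accuracy(model, test_set)

# Step 3: apply the trained and tested model to a new, unlabeled example.
new_prediction = classify(model, (9.0, 0.8))

print(f"test accuracy: {test_accuracy:.2f}")
print("new lesion classified as:", new_prediction)
```

The key design point is that the test set is kept apart from the training set, so the accuracy reflects how the model handles examples it has never seen.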

One major pattern-recognition task is the interpretation of medical imaging, like X-rays or photos of abnormalities. For example, researchers from Stanford have used a machine-learning algorithm to identify cases of melanoma. They used a training set of images of skin lesions known to be either cancerous or benign. The algorithm can “learn” the features that distinguish the two categories and use these patterns to classify new images, almost taking on the role of dermatologist. Similar studies have shown that algorithms can identify damage in images of retinas as accurately as a doctor can. As these algorithms continue to develop and improve, they may help standardize the diagnostic process, particularly in medically underserved areas where patients cannot easily access doctors. Eventually, these machine learning algorithms may also be able to surpass the accuracy of doctors by seeing and remembering patterns too complex for the human mind, thereby avoiding diagnostic errors that can lead to medical complications and patient deaths.

Step 2: Hitting the biological bullseye

Once you receive your diagnosis, your next questions will likely be about treatment. The new paradigm of precision medicine suggests that not every patient will respond to the same treatment. Your doctor can use information about your disease, medical history, and genome to tailor a treatment specific to you.

This approach has found especially firm footing in cancer treatment, where certain molecular markers influence tumors’ susceptibility to certain therapies. One hallmark example of precision medicine at work is the case of the BRAF gene. Normally carefully controlled, BRAF instructs cells to grow and divide when appropriate. When BRAF is mutated, skin cells can grow uncontrollably, leading to melanoma. The advantage of knowing just how BRAF causes cancer is that we suddenly have a target. If we hit it, then bullseye: the tumor’s growth can be curbed. In 2011, the FDA approved vemurafenib, a drug that specifically inhibits the mutated form of BRAF.

It’s not always so simple, though. Often, we don’t know the exact root cause of the disease, as sequencing a patient’s tumor can reveal hundreds of mutations. Unfortunately, this tells us nothing about which mutations are actually important for driving the disease or which mutations should be targeted by precision medicine. That’s where computational methods come in handy. Algorithms can “see” patterns that are too complex for humans to manually interpret. A recent study in Cell Reports developed a machine learning algorithm that identified patterns in the DNA and RNA of tumors known to respond to certain drugs and used these patterns to predict which other tumors would respond to the same drugs. As such, computational biomedicine could be invaluable for choosing the best possible treatment for each individual patient.

Step 3: Medical data with life of its own

Diagnosed and treated, you may be ready to leave the hospital, but your data stays behind. The medical record rooms of yesteryear have been largely replaced by electronic health records, or EHRs. These are a streamlined, easily transferable, and far more space-efficient alternative. They serve as a treasure trove of patient data, ranging from test results to doctors’ notes and prescriptions, that helps doctors better serve the patient in the future (Figure 2).

Figure 2: Medical data in its various forms. Medical data can range from quantitative measurements (like blood tests) to qualitative observations (like features identified in X-rays).

But when researchers look at EHRs, they see a pre-assembled cohort of potential study subjects with detailed medical histories. Much of the promise of EHRs in research isn’t just in the diagnosis of disease. It’s in prediction. We know that people develop diseases, but we don’t always know why. To answer that question, we have to look back at what happens before a person gets sick. With decades of data on millions of people’s health, EHRs can help us do just that.

In a study carried out at Stanford a few years ago, researchers used EHR data to identify various cardiovascular risk factors for atrial fibrillation, a heart condition that can be fatal. Other studies have used EHRs to identify unexpected early markers that precede alcoholism and acute lymphoblastic leukemia, which could help catch these potentially fatal diseases early.
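To make the idea of a "risk factor" concrete, here is a small Python sketch of one common way to quantify it: comparing how often a disease appears in patients with and without a candidate factor (the relative risk). The records below are made up, the field names are hypothetical, and hypertension is used only as an example factor. This is not the analysis performed in the studies described above, which would also need to account for missing data and confounding.

```python
def risk(records, outcome):
    """Fraction of patients in a group who developed the outcome."""
    return sum(1 for r in records if r[outcome]) / len(records)

def relative_risk(records, factor, outcome):
    """Compare outcome rates in patients with and without a candidate factor.

    Assumes both groups are non-empty and that at least one patient
    without the factor developed the outcome (otherwise this divides by zero).
    """
    exposed = [r for r in records if r[factor]]
    unexposed = [r for r in records if not r[factor]]
    return risk(exposed, outcome) / risk(unexposed, outcome)

# Made-up records: 20 patients with hypertension (8 later developed atrial
# fibrillation) and 20 without hypertension (2 later developed it).
records = (
    [{"hypertension": True, "afib": i < 8} for i in range(20)]
    + [{"hypertension": False, "afib": i < 2} for i in range(20)]
)

rr = relative_risk(records, "hypertension", "afib")
print(f"relative risk of atrial fibrillation with hypertension: {rr:.1f}")
```

A relative risk well above 1 suggests the factor is associated with the disease in this cohort; on its own, it does not show that the factor causes the disease.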

EHRs aren’t perfect, though; there can be a lot of missing or uninterpretable data, and records from different institutions may not even resemble each other. Privacy measures — like de-identification of data and consent requirements — protect patients, but these measures are forced to evolve rapidly alongside the technology, obscuring the data for both patients and researchers.

Digital horizons of medicine

What might be different if you return to the hospital in 10 years? As technology gets less expensive, it will become easier to collect and interpret genomic and molecular data in a clinical setting. In their paper, the Stanford team envisions a future where you can snap a photo of a suspicious mole on your phone, use the machine-learning algorithm to identify whether or not it’s cancerous, and get medical attention if needed. But medical professionals have countered that machine learning can never fully replace a doctor’s diagnosis. As physician and writer Sid Mukherjee wrote in the New Yorker, “The most powerful element in [these] clinical encounters, I realized, was not knowing that or knowing how. It lay in yet a third realm of knowledge: knowing why.”

Maybe robo-doctors aren’t the answer. At least at this moment, it seems like there will still be a place for humans in computational biomedicine. After all, computers see health and sickness in terms of perfect rules and patterns. In reality, though, we experience these conditions in an imperfect, inherently human way.

Aparna Nathan is a first-year graduate student in the Bioinformatics and Integrative Genomics PhD program at Harvard University.

For more information:

  • To learn more about artificial intelligence in a medical setting, check out Sid Mukherjee’s New Yorker article
  • For information on deep-learning algorithms, check out this Nature Technology Feature
  • To learn about All of Us, a program that aims to amass patient data to improve health care, check out this description from the National Institutes of Health
  • For more information on electronic health records, see this article from the New York Times

This article is part of the 2018 Special Edition — Tomorrow’s Technology: Silicon Valley and Beyond 
