Variations in specific genes, or regions of DNA, give rise to countless characteristics of an individual. While certain traits, like blood type, result from single genes, more complex traits such as height or predisposition to disease result from interactions among thousands of genes. Because not all of these genes, or the variants between gene copies, have been identified or had their effects isolated, predicting these complex traits is challenging. With the recent successes in machine learning (the process of getting computers and algorithms to learn and improve on their own), however, analyzing thousands of genes from thousands of individuals, and then predicting traits from a person’s genome, is achievable.

Researchers at Michigan State University have applied machine learning to this problem by training an algorithm to predict height based on variations in 100,000 specific regions of DNA, using data from roughly 500,000 individuals (known as a ‘training data set’ or ‘training group’). The algorithm predicted the height of individuals, both from the training group and outside of it, to within approximately one inch, based solely on their genes. The model was further validated using data from a separate survey.
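To make the idea of training on one group and predicting for another concrete, here is a minimal sketch in Python. It is not the researchers' code: the summary above does not say which algorithm they used, so the sketch assumes a simple linear model with an L1 penalty (which pushes the weights of uninformative variants toward zero), fitted by coordinate descent. The data are simulated, and every name and number in it (`CAUSAL_EFFECTS`, the 170 cm baseline, 200 people, 30 variants) is hypothetical.

```python
import random

# Hypothetical ground truth: variant index -> height change (cm) per copy carried.
CAUSAL_EFFECTS = {0: 2.0, 3: -1.5, 7: 1.0, 12: 2.5, 20: -2.0}

def simulate(n_people, n_variants, rng):
    """Simulate genotypes (0, 1 or 2 copies of each variant) and noisy heights."""
    genotypes, heights = [], []
    for _ in range(n_people):
        g = [rng.choice((0, 1, 2)) for _ in range(n_variants)]
        genetic = sum(effect * g[j] for j, effect in CAUSAL_EFFECTS.items())
        genotypes.append(g)
        heights.append(170.0 + genetic + rng.gauss(0, 1.0))  # non-genetic noise
    return genotypes, heights

def soft_threshold(value, penalty):
    """Shrink value toward zero by penalty; small values become exactly zero."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0

def fit_sparse_linear(X, y, penalty=0.1, sweeps=50):
    """L1-penalized least squares fitted by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    col_means = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - col_means[j] for j in range(p)] for row in X]  # centered
    col_sq = [sum(Xc[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    residual = [yi - y_mean for yi in y]
    weights = [0.0] * p
    for _ in range(sweeps):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # variant never varies in the training group
            # Covariance of variant j with the residual, ignoring its own term.
            rho = sum(Xc[i][j] * residual[i] for i in range(n)) / n
            rho += col_sq[j] * weights[j]
            new_w = soft_threshold(rho, penalty) / col_sq[j]
            delta = new_w - weights[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= delta * Xc[i][j]
                weights[j] = new_w
    intercept = y_mean - sum(w * m for w, m in zip(weights, col_means))
    return intercept, weights

def predict(intercept, weights, genotype):
    return intercept + sum(w * g for w, g in zip(weights, genotype))

def mean_abs_error(predicted, actual):
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)

rng = random.Random(0)
X, y = simulate(n_people=200, n_variants=30, rng=rng)
X_train, y_train = X[:150], y[:150]  # the 'training group'
X_test, y_test = X[150:], y[150:]    # individuals the model never saw

intercept, weights = fit_sparse_linear(X_train, y_train)
predictions = [predict(intercept, weights, g) for g in X_test]
model_mae = mean_abs_error(predictions, y_test)

# Baseline for comparison: guess the average training height for everyone.
train_mean = sum(y_train) / len(y_train)
baseline_mae = mean_abs_error([train_mean] * len(y_test), y_test)

print(f"held-out error using genotypes: {model_mae:.2f} cm")
print(f"held-out error guessing the mean: {baseline_mae:.2f} cm")
```

Measuring error on the held-out individuals is what separates memorizing the training group from learning something that transfers to new people, which is the distinction the study's results rest on.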

Despite small errors between predicted and actual heights, this study demonstrates the potential of machine learning algorithms to predict complex traits, such as risks of disease. The algorithm could also identify potential genetic targets for treatment. As the algorithm was only trained and tested on a relatively homogeneous group (individuals aged 40-69 of European descent), further validation is needed to show that it generalizes to more genetically diverse populations. Additionally, standardized methods for reporting genetic data will be required to integrate data from multiple sources, as independent sources have their own data collection procedures.

Managing Correspondent: Andrew T. Sullivan

Press Articles: New DNA tool can predict people’s height and potentially assess risk for serious illnesses

A New DNA Tool Can Predict Height

New DNA tool predicts height, shows promise for serious illness assessment

Original Journal Article: Accurate Genomic Prediction of Human Height, Genetics

Image Credit: Pixabay

