We cannot predict how long each of us will live, but can our genes? Longevity has always been prized, yet it has never been equally distributed across humanity, not even within families. The role of heritable traits in longevity is still debated. Previous genomic studies have reported low heritability for longevity. However, their sample sizes were too small to examine other influences, such as environmental factors, so their conclusions are often incomplete or inconclusive.

A new study led by computer scientists at Columbia University takes a novel approach to this question: crowdsourced data. Starting from 86 million profiles on the genealogy-driven social media site Geni.com, the researchers constructed a family tree of 13 million individuals, spanning 11 generations on average. A data set of this scale is large enough to support robust statistical methods and to model the effects of other factors, such as environment and war, on longevity. With these factors accounted for, the study finds that heredity explains only 16% of the differences in longevity between individuals, well below the literature value of 25%. By closing the loopholes in previous work, the new study suggests that human lifespans are even less genetically predictable than earlier estimates implied.

As our ability to manipulate big data sets and extract meaningful insights grows, the sources of those data sets will matter more in future scientific pursuits. Crowdsourcing is a fast way to gather vast amounts of information at low cost, but it is difficult to know how much we can trust information available on the web. Before crowdsourcing becomes a mainstream way of gathering data, we will need to establish and enforce standards for how and where this information is collected.


Managing Correspondent:

Hechen Ren

Original Research Article:

Quantitative analysis of population-scale family trees with millions of relatives – Science

Media Coverage:

Colossal family tree reveals environment’s influence on lifespan – Nature

When Did Americans Stop Marrying Their Cousins? Ask the World’s Largest Family Tree – New York Times


One thought on “Crowdsourced Data Helps Scientists Construct the World’s Largest Family Tree”

  1. I don’t know whether this is a foolish comment or an interesting off shoot from your blog post. Your correlation with DNA and longevity seems to be a part of the popular new series ‘Altered carbon’. There the humans have their DNA and life essence captured in a kind of chip called a sleeve that enables them to leave a body that has grown too old or got damaged or become lifeless, and get fitted (resleeved as they say in the series) into a new one. Thanks for the interesting post…
