by Gemma Johnson
figures by Olivia Foster
You might remember biology classes in which molecular biology was represented by brightly colored cartoons on the pages of a textbook. What’s happening inside our cells, however, is much more complicated than what those caricatures depict. The DNA molecules that constitute our genetic code look like a twisted ladder (the formal name is a “double helix”), and the DNA in one human cell (the genome) would be about two yards long if stretched out. Yet all of it is contained in a nucleus that is about two ten-thousandths of an inch across. For all two yards of information to fit inside such a small space, it must be tightly wound and folded around special proteins in the nucleus (Figure 1A). DNA adopts a complex 3D structure, like a ball of lint or a tumbleweed, but scientists have only recently begun to determine what these structures look like using new DNA sequencing technology.
DNA molecules are composed primarily of four chemicals: adenine (A), thymine (T), cytosine (C), and guanine (G). These four chemicals spell out the “words” in DNA that make up genes. DNA sequencing reveals the order of A, T, C, and G in a strand of DNA and has led to a deluge of information about the 1D sequence of DNA (i.e., the order of the letters). Recently developed technologies now allow scientists to use sequencing to go beyond the 1D sequence and see the 3D structure of DNA as it actually exists within a nucleus. Research into the 3D structure of DNA has led not only to a new understanding of how genes organize themselves in 3D space, but also to the development of new medical applications like non-invasive prenatal testing for genetic diseases.
How can we see the 3D structure of DNA?
In 2002, an international research team published a method called 3C that explores the 3D structure of yeast chromosomes by identifying stretches of DNA that are close to one another. Even at the time, it was clear that this technology would apply to any species with a sequenced genome. Scientists have now sequenced the genomes of hundreds of organisms, such as bacteria, yeast, flies, humans, and zebrafish. Since the original publication, 3C has been modified and expanded into new methods such as 4C, 5C, and Capture-C that can identify and assign connections over larger segments of DNA than ever before. One method called Hi-C is capable of mapping the connections of an entire genome. These connections affect which genes are turned on or off in a cell.
Despite these recent advances, the basic principles of all 3C-derived methods have remained the same (Figure 1B). First, the DNA inside cells is cross-linked, meaning pieces of DNA that are close to each other are chemically connected. After a few more steps of processing and purification, these linked pieces of DNA are sequenced. Finally, computational analysis reveals the pieces of DNA that contact each other more frequently than would be expected by chance.
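To make the last step more concrete, here is a minimal Python sketch of one common way to ask whether two pieces of DNA touch "more often than expected." It is an illustration under stated assumptions, not the analysis used in any particular 3C study: the genome is split into numbered bins, the counts are invented toy data, and the function names (`observed_over_expected`, `enriched_pairs`) are hypothetical. The "expected" value for a pair is taken to be the average count of all pairs separated by the same number of bins, since pieces of DNA that sit near each other along the strand bump into each other often regardless of folding.

```python
from collections import defaultdict


def observed_over_expected(contacts):
    """Return observed/expected ratios for each pair of bins.

    contacts maps (bin_i, bin_j), with bin_i < bin_j, to the number of
    sequenced cross-linked fragments joining those two bins.
    Expected = mean count over all pairs at the same separation.
    """
    totals = defaultdict(float)
    n_pairs = defaultdict(int)
    for (i, j), count in contacts.items():
        separation = j - i
        totals[separation] += count
        n_pairs[separation] += 1

    expected = {d: totals[d] / n_pairs[d] for d in totals}

    ratios = {}
    for (i, j), count in contacts.items():
        exp = expected[j - i]
        if exp > 0:  # skip separations where nothing was observed at all
            ratios[(i, j)] = count / exp
    return ratios


def enriched_pairs(contacts, min_ratio=1.5):
    """Pairs whose contact count is at least min_ratio times the expectation."""
    ratios = observed_over_expected(contacts)
    return sorted(pair for pair, r in ratios.items() if r >= min_ratio)


# Invented toy data for a stretch of DNA divided into five bins (0..4).
toy_contacts = {
    (0, 1): 10, (1, 2): 10, (2, 3): 10, (3, 4): 10,  # neighbors: many contacts
    (0, 2): 4, (1, 3): 4, (2, 4): 4,                 # two bins apart: fewer
    (0, 3): 9, (1, 4): 1,                            # three bins apart
}

toy_ratios = observed_over_expected(toy_contacts)
toy_hits = enriched_pairs(toy_contacts)
print(toy_hits)
```

In this toy data, bins 0 and 3 are far apart along the strand yet are linked far more often than the other pair at the same separation, which is the kind of signal that suggests the DNA is folded so that the two regions sit side by side. Real analyses also correct for technical biases and use statistical tests rather than a fixed cutoff.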
What does the 3D structure of DNA look like in different types of cells?
Every cell in a given organism has the same DNA, but different cell types use the information encoded by the DNA in different ways. The specific genes that our cells turn on and off are highly coordinated to make sure that cells are doing what they are supposed to do. Imagine an instruction manual for building either a chair or a table from the same materials: if you want to make a chair, then you will use one part of the instruction manual, and if your friend wants to build a table, then she will use the other set of instructions. If you somehow end up reading the wrong instructions, you’ll end up with the wrong result. The same general concept applies to cells—DNA is the universal instruction manual, and different combinations of genes define different types of cells.
Intuitively, the DNA interactions in an embryonic stem cell and those in an adult brain cell should be very different, and the 3D structure of their genomes should reflect this disparity. Surprisingly, researchers using Hi-C found that different mouse cell types have similar 3D structures. This result challenges the view that the interactions between DNA sequences always cause some change in the cell’s biology, like turning on a gene. Instead, it seems as if some connections are always present.
Even though the overall DNA structure is largely the same across cell types, a handful of DNA connections are specific to certain types of cells. These differences usually involve genes that govern functions unique to that cell. For example, genes that are important in neurons have different DNA interactions in neurons than they do in other cells. In general, DNA sequencing is revealing that most cells have similar landscapes of DNA connections, but genes that are important for the functions of specific cell types can have varying connections (Figure 2).
3C technology leads to promising medical applications
The amount of information available to patients regarding genetic diseases has expanded rapidly since the first sequencing of the human genome in 2003. Today, parents are often faced with the difficult decision of whether to test their unborn child for these diseases. One drawback to prenatal testing procedures, such as amniocentesis, is that they are often invasive and can increase the risk of miscarriage, which makes the testing decision more difficult.
In 2017, however, researchers published a new method based on 3C called monogenic non-invasive prenatal diagnosis (MG-NIPD) that can test for specific genetic diseases using a simple blood draw from both biological parents. During pregnancy, a woman’s blood contains some cell-free DNA from the fetus, but standard DNA sequencing techniques are not powerful enough to distinguish the small amount of fetal DNA from the mother’s DNA. The new MG-NIPD method harnesses advanced sequencing technologies to diagnose genetic diseases from very small amounts of fetal DNA. To achieve this, researchers cross-link and sequence the DNA in the blood samples to find which pieces of DNA belong to each of the two parents. They then identify the fetal DNA sequence by seeing which combinations are overrepresented in the maternal blood sample.
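The idea of "overrepresentation" can be illustrated with a small Python sketch. This is a simplified toy, not the published MG-NIPD analysis: it assumes we already know the mother's two versions of a DNA region (labeled "A" and "B" here), that we have counted how many DNA fragments in her blood match each version, and that a simple z-score with an arbitrary cutoff is good enough to decide. The function name and the counts are hypothetical. The mother's own DNA contributes both versions equally, so any consistent excess of one version hints that it is the one the fetus inherited.

```python
import math


def infer_inherited_version(count_a, count_b, z_cutoff=3.0):
    """Guess which maternal version ("A" or "B") the fetus inherited.

    count_a, count_b: number of fragments in the maternal blood sample
    matching each of the mother's two versions of a region.
    If the fetus added nothing, the two counts should be roughly equal,
    so we measure the imbalance as a z-score (normal approximation to a
    50/50 binomial) and only make a call when it exceeds z_cutoff.
    Returns a tuple (call, z).
    """
    total = count_a + count_b
    if total == 0:
        return "uncertain", 0.0
    z = (count_a - count_b) / math.sqrt(total)
    if z >= z_cutoff:
        return "A", z
    if z <= -z_cutoff:
        return "B", z
    return "uncertain", z


# Invented counts: a clear excess of version A, and a near tie.
clear_call = infer_inherited_version(5300, 4700)
near_tie = infer_inherited_version(5010, 4990)
print(clear_call, near_tie)
```

The near-tie case shows why the small fetal fraction matters: with too few fragments, or too little fetal DNA, the excess is indistinguishable from chance and no call should be made.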
While MG-NIPD is far less invasive than amniocentesis and carries virtually no risks, it has been tested in only a few people and for only three genetic diseases. Whether this test is sufficient to replace amniocentesis remains to be seen, and additional clinical trials are needed to assess its sensitivity and accuracy. The use of MG-NIPD also requires access to genetic sequencing equipment, but the experiments should be inexpensive because only small regions of the genome need to be sequenced.
MG-NIPD is just one way to use 3C for medical purposes, and additional applications will only become more advanced as we learn more about the 3D structure of DNA. Applications that help create targeted cancer treatments are among the most highly anticipated. 3C-based technologies could map the structures of DNA within cancer cells, which would provide information about how aggressive a patient’s cancer is and allow doctors to create personalized treatments.
Since the structure of DNA was discovered in 1953, we have learned how to manipulate DNA in cells, sequence it, and interpret some of the resulting data. DNA sequencing is already revolutionizing biology and healthcare, and the pace of research has accelerated to such an extent that it’s hard to predict when or how it will affect our lives in the future. The 3D structure of DNA is now a fast-growing and fascinating area of research that will lead to advances we can’t imagine.
Gemma Johnson is a fourth-year Ph.D. student in Systems Biology at Harvard University.