by Amir Bitran
figures by Daniel Utter

Proteins, the molecules that sustain all life, are similar to cars and other machines in an important way: they require a specific, well-defined structure to function. Obviously, a random pile of car parts cannot be driven. And similarly, a protein that is not assembled correctly cannot perform crucial tasks like producing energy, supporting cell structure, and generating electric signals. In fact, proteins that fail to properly “fold,” or assemble themselves into three-dimensional structures, may cause disease. Scientists have recently uncovered a new clue that helps to explain how some proteins acquire their functional structures within healthy cells: certain information in our genes, previously believed to be meaningless, may actually help proteins fold reliably in the cell. This finding could improve our understanding of both normal protein folding and how improper folding may be involved in Alzheimer’s disease, Parkinson’s disease, cystic fibrosis, and other ailments.

The Protein Folding Problem

Proteins are made up of chemical “building blocks,” known as amino acids, that are connected like beads on a string. Each amino acid has its own physical properties: some are large and bulky, for instance, while others carry an electric charge. The chain of amino acids that constitutes a protein must fold itself into a specific arrangement for the protein to function as intended. But just as there are many ways to bend or tie a piece of yarn, a given protein can take on many different structures, only one of which functions properly. In fact, if a protein were to fold by randomly trying each arrangement and stopping when it found the correct one, the process would take far longer than the age of the universe (about 14 billion years)!
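To get a feel for the scale of this random search, here is a rough back-of-the-envelope sketch in Python. The numbers are illustrative assumptions, not values from this article: a 100-amino-acid chain, three possible arrangements per amino acid, and ten trillion arrangements tried per second. Estimates of this kind are often discussed under the name Levinthal's paradox.

```python
# Back-of-the-envelope estimate of a purely random folding search.
# All parameter values below are illustrative assumptions.

SECONDS_PER_YEAR = 365.25 * 24 * 3600
AGE_OF_UNIVERSE_YEARS = 1.4e10  # about 14 billion years


def random_search_years(n_amino_acids,
                        arrangements_per_amino_acid=3,
                        arrangements_tried_per_second=1e13):
    """Years needed to try every arrangement of the chain once."""
    total_arrangements = arrangements_per_amino_acid ** n_amino_acids
    seconds = total_arrangements / arrangements_tried_per_second
    return seconds / SECONDS_PER_YEAR


years = random_search_years(100)
ratio = years / AGE_OF_UNIVERSE_YEARS
print(f"Random search: about {years:.1e} years")
print(f"That is about {ratio:.1e} times the age of the universe")
```

Even with these generous assumptions, the search time comes out many orders of magnitude longer than the age of the universe, and it grows exponentially with the length of the chain.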

Since most proteins take less than a second to fold correctly, folding must not be random. Rather, protein folding is guided by different forces that push amino acids into their desired arrangement. For instance, if a protein contains amino acids with positive and negative charges, those amino acids will often be close to each other in the final structure (since opposite charges attract). But larger, more complex proteins can adopt multiple stable configurations (e.g., if there are multiple ways of bringing opposite charges together). Given this dilemma, could cells have evolved a strategy to help proteins reliably reach their correct structure over other stable (but dysfunctional) arrangements?

Synonymous Codons and Protein Folding

Ribosome-mediated protein synthesis
Figure 1: Genes encode instructions for building proteins. Sequences of DNA that encode a protein, known as genes, are copied into a molecule known as messenger RNA (mRNA). The information in mRNA is then read by a machine called the ribosome, which builds or “translates” a protein one amino acid at a time. Every triplet of “letters” in a gene (known as a codon) specifies one amino acid in the protein sequence—an “adaptor” molecule known as a transfer RNA helps convert between codons and amino acids.

The answer appears, at least partially, to involve a puzzling aspect of genes, the DNA sequences that dictate a protein’s amino acid building blocks (Figure 1). Within a gene, every three consecutive “letters” of DNA, known as a codon, specify a particular amino acid. Certain changes in DNA, called mutations, cause disease by altering the corresponding protein’s constituent amino acids and destabilizing the protein’s functional structure. But other mutations do not change a protein’s amino acids; this is possible because most amino acids are encoded by multiple codons. These different encodings of the same message are analogous to synonyms in language (e.g., “happy” and “joyful”) and are called synonymous codons. Accordingly, mutations that simply replace one codon with a synonymous one are called synonymous mutations.
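The idea of a synonymous mutation can be made concrete with a short Python sketch. The codon table below is only a small excerpt of the standard genetic code (written with DNA letters), and the example sequences and function names are made up for illustration.

```python
# Excerpt of the standard genetic code (DNA codons -> amino acids).
# Only a few amino acids are included to keep the example short.
CODON_TABLE = {
    "CTT": "Leu", "CTC": "Leu", "CTA": "Leu", "CTG": "Leu",
    "TTA": "Leu", "TTG": "Leu",
    "GAA": "Glu", "GAG": "Glu",
    "GTT": "Val", "GTC": "Val", "GTA": "Val", "GTG": "Val",
    "ATG": "Met",
    "TGG": "Trp",
}


def translate(dna):
    """Read a DNA sequence three letters at a time and list its amino acids."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return [CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3)]


def is_synonymous(before, after):
    """True if the DNA changed but the encoded amino acids did not."""
    return before != after and translate(before) == translate(after)


original = "ATGCTGGAG"   # Met-Leu-Glu
silent = "ATGCTCGAG"     # CTG -> CTC, still Leu
altering = "ATGCTGGTG"   # GAG -> GTG, Glu becomes Val

print(translate(original))
print(is_synonymous(original, silent))
print(is_synonymous(original, altering))
```

Here the change from CTG to CTC leaves the protein’s amino acids untouched, while the change from GAG to GTG swaps one amino acid for another.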

As synonymous mutations in a protein-coding gene do not change a protein’s amino acids, they were initially believed to leave the protein’s structure and function intact. But various studies suggest otherwise: synonymous mutations can affect how the protein folds, thus affecting its function. These findings suggest that evolution favored particular codons that ensure proteins fold properly, rather than into other possible structures.

How can synonymous mutations affect protein folding? The answer involves a molecular factory called the ribosome that’s responsible for building proteins (Figure 1). The ribosome is a large, complex molecule that uses the information in genes to build proteins one amino acid at a time in a process known as protein synthesis, or translation. Certain synonymous mutations affect how easily the ribosome “reads” the information encoding a protein, which affects how quickly the protein is synthesized. Some proteins fold most effectively if their genes contain codons that the ribosome can read efficiently. If the gene contains inefficient synonymous codons, however, the ribosome may slow down or pause while trying to read them. This may cause the protein to fold into a stable but undesired and non-functional structure.

On the other hand, in some surprising cases, folding is optimized if the protein is built inefficiently, in bursts and pauses (Figure 2). Certain proteins can start folding correctly during a pause in synthesis even though they aren’t yet fully assembled. Such partially built proteins may, in fact, fold more accurately than fully synthesized proteins because their smaller size leaves fewer ways for them to fold incorrectly. As an analogy, a shorter string can be folded in fewer ways than a longer one, so there are fewer ways of making a mistake while tying it. In these cases, codons that are read inefficiently produce a beneficial pause that allows the partially built protein to fold correctly. Later, when synthesis is complete, the rest of the protein falls into place. A synonymous mutation could eliminate a beneficial pause and prompt incorrect folding.
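The effect of a pause can be illustrated with a toy Python model. Everything here is a simplifying assumption for illustration: the reading times per codon are invented numbers in arbitrary units, the codon chosen as “slow” is hypothetical, and real ribosomes and real folding are far more complicated. The sketch only measures how long a partially built protein has to itself before the rest of the chain is finished.

```python
# Toy model: time the ribosome spends reading each codon.
# Values are hypothetical and in arbitrary units, not measurements.
READ_TIME = {
    "ATG": 1.0,
    "GAA": 1.0,
    "GAG": 1.0,
    "CTG": 1.0,  # assumed to be read efficiently
    "CTA": 8.0,  # assumed to be read inefficiently (causes a pause)
}


def folding_window(codons, partial_length):
    """Time between finishing the first `partial_length` codons
    and finishing the whole protein."""
    return sum(READ_TIME[codon] for codon in codons[partial_length:])


# Same amino acids in both genes: CTA and CTG both encode leucine.
gene_with_pause = ["ATG", "CTG", "GAA", "CTA", "GAG", "CTG"]
gene_without_pause = ["ATG", "CTG", "GAA", "CTG", "GAG", "CTG"]

print(folding_window(gene_with_pause, 3))
print(folding_window(gene_without_pause, 3))
```

In this sketch, swapping the slow codon for its fast synonym leaves the protein’s amino acids unchanged but shortens the time available for the first part of the chain to fold on its own.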

Synonymous codons and protein folding
Figure 2: Inefficiently read synonymous codons may facilitate proper protein folding. Certain synonymous codons are read more efficiently by the ribosome than others. For some proteins, folding is optimized if the protein-coding sequence contains inefficiently read codons (case 1, top), as these produce a pause in synthesis, which gives a partially built protein time to start folding correctly. Replacing an inefficiently read codon with an efficient one (bottom) eliminates this pause, potentially leading to improper folding.

Outlook: Synonymous Codons, Evolution, and Disease

Various studies have linked synonymous mutations to changes in protein folding that are implicated in disease. A 2004 experiment showed that a synonymous mutation affects the folding and function of a protein that helps cancer cells resist chemotherapy. A follow-up clinical study suggests this mutation may affect both the efficacy and side effects of a particular breast-cancer drug. In addition, recent experiments have identified a synonymous mutation that eliminates a beneficial pause in the synthesis of a protein whose improper functioning causes cystic fibrosis. This mutation leads the protein to fold incorrectly, potentially worsening cystic fibrosis symptoms. Most recently, a 2017 study found a strong correlation between the extent to which synonymous mutations affect the efficiency of protein synthesis (and likely protein folding) and the severity of 22 different diseases.

Together, these results suggest that evolution favors codons that best help proteins fold correctly. This finding sheds new light on the protein folding problem, that is, how proteins faithfully attain the structures required for them to perform their diverse, life-sustaining tasks, and promises to improve our understanding of evolution and disease. Future research is likely to reveal new ways in which these mutations affect healthy and sick organisms alike.

Amir Bitran is a Ph.D. student in Harvard’s biophysics program. He is interested in understanding how proteins evolve to optimally fold into their functional structures.

For more information:

An accessible article that describes the importance of protein folding and the role of improper folding in disease.

A comprehensive and in-depth review detailing the various effects of synonymous mutations, including on protein folding.

An early set of experiments which shows that synonymous codons affect the folding and function of a protein involved in cancer resistance to chemotherapy.

Cover image credit: ‘Molecule display’ by allispossible.org.uk [CC BY 2.0]
