Advancements In Genomic Research Reveal Alternative Transcription Initiation Sites In Thousands Of Soybean Genes
LINDSEY BEREBITSKY
WEST LAFAYETTE, INDIANA
Rosalind Franklin, James Watson and Francis Crick discovered the structure of DNA – that molecular blueprint for life – over 70 years ago. Today, scientists are still uncovering new ways to read it.
In 2010, Jianxin Ma, a professor of agronomy, and his collaborators built the first reference genome for soybeans on the widely studied Williams 82 variety. Thousands of scientists and plant breeders have since used that genome in their own research on the genetic makeup underlying various characteristics, such as seed protein and oil content, plant architecture and productivity, and disease resistance and abiotic stress tolerance in soybeans.
Over the last decade, Ma, who is the Indiana Soybean Alliance Inc. Endowed Chair in Soybean Improvement, has been recognized internationally for his contribution to the soybean genome as well as for his continued research and innovation in the field. His most recent work, published in The Plant Cell, used advancements in genomic research to fill in gaps in the original soybean reference genome.
“The reference genome was like a dictionary when we announced it,” Ma said. “Each gene was like a single word. However, there was a piece of critical information lacking: transcription initiation sites for individual genes.”
Transcription initiation sites are locations in the DNA where the cell’s transcription machinery, guided by specialized transcription-factor proteins, begins building an mRNA copy of the gene downstream. That mRNA is then read and translated at the cell’s ribosomes to create proteins, which are important for the chemical and physical function of every organism.
Knowing where mRNA synthesis begins on the DNA strand is a significant part of understanding how genes are expressed. These initiation sites contain regulatory elements and provide information to the cell about when and where to transcribe each gene to make protein, and how frequently to do so at any point in time.
In genetics, it has generally been accepted that each gene has one transcription initiation site, located downstream of a core promoter region and typically around a TATA box – a DNA sequence rich in thymine and adenine repeats. But Ma and his colleagues no longer think this is the case.
“There is a set of predicted transcription start sites for over 50,000 genes in soy, but based on our new study, less than 3% of those predicted transcription initiation sites actually are correct,” Ma said.
In 2020, the development of the Survey of TRanscription Initiation at Promoter Elements Sequencing (STRIPE-seq) technique offered Ma’s lab a more effective, faster and more affordable way to identify transcription initiation sites across the entire soybean genome. It also provided information about the relative abundance of each mRNA, which gives clues as to how strongly a gene is expressed in different tissues and at different times.
With funding from the United States Department of Agriculture’s National Institute of Food and Agriculture (USDA-NIFA) and the National Science Foundation, Ma and his lab performed STRIPE-seq analyses on eight different tissues in soybean: leaves, stems, stem tips, roots, nodules, flowers, pods and developing seeds. Even though the plant’s DNA is consistent across these tissues, the expression of genes differs.