Transposable Elements – Genes to Genomes https://genestogenomes.org A blog from the Genetics Society of America

Lineage specific retrotransposons shaped the genome evolution of domesticated rice https://genestogenomes.org/lineage-specific-retrotransposons-shaped-the-genome-evolution-of-domesticated-rice/ Wed, 09 Aug 2017 21:35:01 +0000

Rice is one of the most important food crops on earth. Like many other plants, the genome of this critical global species is dominated by transposable elements—selfish genes that multiply themselves to the detriment of their host. In the June issue of G3, Zhang and Gao analyze the genomic long terminal repeat (LTR) retrotransposon content of the two species of cultivated rice and their six closely related wild relatives. They find that these LTRs have multiplied in a lineage specific manner and suggest that the unique activity of retrotransposons in these rice species has contributed to their diversification and isolation.

The eight rice species included in this study present an excellent opportunity to investigate the dynamics of LTR retrotransposons during speciation. The common ancestor of this group lived only 5 million years ago, and each member has whole genome sequence data available. Zhang and Gao extracted the LTR retrotransposon sequences from each genome and sorted them into 790 families based on homology. For families with highly similar LTRs, they were able to estimate genome abundance from the sequencing read count. They then used homology between LTR retrotransposon families to estimate the time of their origin, placed in the phylogenetic context of the rice species group.
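One widely used way to date an individual LTR retrotransposon insertion, which is not necessarily the family-level approach Zhang and Gao took, relies on the fact that an element's two LTRs are identical when it inserts and then accumulate substitutions independently. The sketch below assumes the two LTRs are already aligned, applies a Jukes-Cantor correction, and uses a substitution rate of 1.3e-8 per site per year, a value often quoted for rice. The function names, the rate default, and the toy sequences are all illustrative assumptions.

```python
import math


def p_distance(ltr5: str, ltr3: str) -> float:
    """Proportion of mismatched sites between two pre-aligned LTRs (gap columns skipped)."""
    if len(ltr5) != len(ltr3):
        raise ValueError("LTRs must be aligned to the same length")
    pairs = [(a, b) for a, b in zip(ltr5.upper(), ltr3.upper())
             if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    mismatches = sum(1 for a, b in pairs if a != b)
    return mismatches / len(pairs)


def jukes_cantor(p: float) -> float:
    """Correct an observed proportion of differences for multiple hits."""
    return -0.75 * math.log(1 - 4 * p / 3)


def insertion_time_years(ltr5: str, ltr3: str, rate: float = 1.3e-8) -> float:
    """Estimate insertion age as T = K / (2r); `rate` is an assumed value."""
    k = jukes_cantor(p_distance(ltr5, ltr3))
    return k / (2 * rate)


# Toy example: two 100 bp LTRs that differ at 2 sites.
left = "A" * 100
right = "A" * 98 + "GG"
print(round(insertion_time_years(left, right)))
```

With 2 differences in 100 sites and the assumed rate, the estimate comes out at a little under 0.8 million years. The divisor is 2r because both LTRs have been diverging since the insertion.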

Overall, they found that the LTR retrotransposon content of the genomes varied greatly among different rice species and that genome size varied due to repeat content. However, almost all of the families with very high copy numbers were found in all of the species, indicating they were present in the most recent common ancestor. It is possible that even the newer LTR retrotransposon families were present in the ancestor, but as these sequences are very fast evolving, they may have diverged past the point of recognition. Extremely highly amplified LTR retrotransposon families also generally have much shorter periods of activity than families with fewer overall sequences.

There is clear evidence that lineage specific LTR retrotransposon activity has shaped the genomes of these groups. Certain families of transposable elements have high copy numbers in only one or a few closely related species, indicating that bursts of retrotransposon activity occurred after the split between these groups. Notably, there is also a difference in LTR retrotransposon content and activity between domesticated African rice and its wild progenitor, which split about 260,000 years ago. Such differences between lineages may indicate that LTR retrotransposon activity is associated with changes in environment or life history. Certainly, they have helped rapidly shape distinct genomes in these diverging species.

CITATION:

Rapid and Recent Evolution of LTR Retrotransposons Drives Rice Genome Evolution During the Speciation of AA-Genome Oryza Species

Qun-Jie Zhang and Li-Zhi Gao

G3: Genes, Genomes, Genetics

https://doi.org/10.1534/g3.116.037572

http://www.g3journal.org/content/7/6/1875

The fungus-fighting secrets hiding in the sugar pine’s enormous megagenome https://genestogenomes.org/the-fungus-fighting-secrets-hiding-in-the-sugar-pines-enormous-megagenome/ Wed, 04 Jan 2017 13:00:08 +0000

Towering sugar pine trees dominate the mountain forests of California and Oregon. They are the tallest pine trees in the world, with the largest specimens reaching heights of more than 80 meters. But these forest behemoths are under attack from a very tiny foe: an invasive fungus. White pine blister rust was accidentally introduced to western North America nearly a century ago. Since then, blister rust infections have been threatening the survival and reproduction of sugar pines, harming the ecosystem and industries that depend on them. Conservation efforts have shown that genetic variation contributes to the likelihood that one tree and not another succumbs to infection, but efforts to track down the genes involved have been complicated by the staggeringly huge genome of this giant tree and by the arduous infection tests needed to score resistance. The sugar pine genome is ten times the size of the human genome, a whopping 31 billion base pairs. In the December issue of GENETICS, Kristian Stevens and colleagues announced the complete sequence of the sugar pine genome, the largest genome fully sequenced to date. Their work, along with a companion paper on the sugar pine transcriptome published in G3, highlights the evolutionary implications of such a massive genome size, as well as revealing candidate genes for blister rust resistance and a promising path to efficient selection of resistant individuals.

Despite its enormous size, the sugar pine genome contains about the same number of protein coding genes as the human genome. No less than 79% of the DNA in the sugar pine genome is made up of transposable elements, which accounts for its enormous size. These genetic parasites are stretches of DNA that exist only to proliferate within a genome. Rather than contributing to the sugar pine’s phenotype, they encode machinery that lets them make copies of themselves at new sites in the genome. Transposable elements are common in all eukaryotic genomes, but in conifers, and especially the sugar pine, they have multiplied to enormous numbers. In the sugar pine genome, the transposable elements are mostly non-functional relics. These genomic leftovers can tell researchers about the evolutionary history of the sugar pine and also provide insights about how genome size evolves. They also create substantial problems for researchers trying to work with the sugar pine genome.

Transposable elements are highly repetitive, and when they are present in numbers as large as in the sugar pine, they are extremely difficult to sequence. Whole genome sequencing generally works by breaking a genome up into extremely small pieces and then putting them back together one by one. Repetitive genetic sequences make this process incredibly difficult because when the pieces are assembled, all the repeats look the same and end up incorrectly merged into one sequence. To get around this problem, the researchers assembling the sugar pine genome used several strategies. They obtained most of the sequence data from a single haploid pine nut, avoiding the typical complications of sequencing two parental genomes in a diploid individual. They sequenced the transcriptome to identify those sequences that produce proteins, and then used those sequences to assemble the corresponding genes. They also used sequencing libraries specially prepared so that paired reads are known to lie large distances away from one another, which is useful in linking larger genomic structures into the big picture. These techniques, along with others, allowed the researchers to build a useful working draft of the massive sugar pine genome.
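The way repeats confuse an assembler can be seen in a toy example. The sketch below is not the sugar pine team's pipeline; it is a minimal illustration, assuming a k-mer overlap view of assembly, in which a repeated stretch longer than k leaves the assembler with a k-mer that has two possible continuations. The function names and the made-up sequence are hypothetical.

```python
from collections import defaultdict


def successor_map(genome: str, k: int) -> dict:
    """Map each k-mer to the set of k-mers that directly follow it in the genome."""
    successors = defaultdict(set)
    for i in range(len(genome) - k):
        successors[genome[i:i + k]].add(genome[i + 1:i + k + 1])
    return successors


def ambiguous_kmers(genome: str, k: int) -> list:
    """k-mers with more than one possible continuation, i.e. places where
    an assembler working at this k cannot tell which path to follow."""
    return sorted(kmer for kmer, nxt in successor_map(genome, k).items()
                  if len(nxt) > 1)


# A made-up genome with two copies of the 7 bp repeat GATTACA.
repeat = "GATTACA"
genome = "CCT" + repeat + "GGA" + repeat + "TTC"

print(ambiguous_kmers(genome, 4))  # ['TACA']: the end of the repeat forks
print(ambiguous_kmers(genome, 8))  # []: pieces longer than the repeat resolve it
```

Pieces shorter than the repeat cannot distinguish its two copies, while pieces that span the whole repeat plus some flanking sequence can. Long-range linking information helps for a similar reason.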

A twig infected with white pine blister rust. Photo by Marek Argent via Wikimedia.

Sequencing an entire genome, especially one as large as the sugar pine, is an impressive technological achievement. More importantly, however, it is an incredibly powerful research tool in the fight against white pine blister rust. This fungus has been infecting multiple species of white pines in North America since it was first introduced from Asia around the turn of the twentieth century. White pine blister rust is a slow killer, taking years to destroy a large tree. An infection begins when fungal spores land on the surface of the tree and begin to germinate. They grow through openings into the twigs and branches, and very slowly make their way towards the main trunk of the tree. The infected branches swell up and large sacks of rusty orange-red spores burst through the branches. The fungal infection causes cankers, which prevent the tree from sending water and nutrients to its damaged limbs. Eventually, these limbs will die. If cankers form on the main trunk, the entire tree may die.

Researchers and forest managers have been looking for a way to fight the spread of white pine blister rust for a long time. Some rare sugar pines carry genetic resistance to white pine blister rust, and have been used in reforestation efforts. In the 1970s, these rare individuals were used to identify a major locus of resistance called Cr1, but the daunting size of the sugar pine genome made further analysis difficult. Using this new genome sequence, Stevens and colleagues were able to make a breakthrough in identifying this gene. They used the small amount of genetic information already known to find large Cr1-associated segments and identify previously unknown SNPs that are closely associated with resistance. These markers are a powerful tool that can be used to quickly and cheaply identify trees that carry the resistant allele without waiting for the results of slow and expensive infection assays. Resistant trees can then be harvested for seeds to be used in reforestation. Now armed with a roadmap, scientists can search the sugar pine genome for the secrets that may help save these iconic trees and the ecosystems that depend on them.

 

Stevens, K. A., Wegrzyn, J. L., Zimin, A., Puiu, D., Crepeau, M., Cardeno, C., Paul, R., Gonzalez, D., Koriabine, M., Holtz-Morris, A. E., Martínez-García, P. J., Sezen, U. U., Marçais, G., Jermstad, K., McGuire, P. E., Loopstra, C. A., Davis, J. M., Eckert, A., de Jong, P., Salzberg, S. L., Neale, D. B., & Langley, C. H. (2016). Sequence of the Sugar Pine Megagenome. Genetics, 204(4), 1613-1626. http://www.genetics.org/content/204/4/1613.abstract

 

Gonzalez-Ibeas, D., Martinez-Garcia, P. J., Famula, R. A., Delfino-Mix, A., Stevens, K. A., Loopstra, C. A., Langley, C. H., Neale, D. B., & Wegrzyn, J. L. (2016). Assessing the gene content of the megagenome: sugar pine (Pinus lambertiana). G3: Genes, Genomes, Genetics, 6(12), 3787-3802. http://www.g3journal.org/content/6/12/3787.short

GSA member Zhao Zhang receives NIH Director’s Early Independence Award https://genestogenomes.org/gsa-member-zhao-zhang-receives-nih-directors-early-independence-award/ Thu, 08 Oct 2015 11:54:07 +0000

GSA member Zhao Zhang was named as one of 16 recipients of the NIH Director’s Early Independence Award for 2015, joining Jason Sheltzer. Established in 2011, the Early Independence Awards program provides an opportunity for exceptional junior scientists who have recently received their doctoral degree or finished medical residency to skip traditional post-doctoral training and move immediately into independent research positions.

This has been a good year for Zhang, who was honored with the Larry Sandler Memorial Award from the Drosophila community and GSA, which is given annually to honor an outstanding PhD dissertation in research using the fruit fly Drosophila.

 

Zhao Zhang (courtesy NIH)

Zhao Zhang, PhD
Staff Associate
Department of Embryology
Carnegie Institution for Science
2015 recipient, Larry Sandler Memorial Award

 

NIH reported Zhang’s background as follows:

Project Title: Somatic Transposition-Mediated Genome Variegation during Development, Disease and Aging Conditions

Zhao Zhang established his own research lab at Carnegie Institution for Science, Department of Embryology in 2014, after receiving his Ph.D. at University of Massachusetts Medical School under the guidance of Phillip Zamore and William Theurkauf. Since graduate school, he has been fascinated by the most abundant element in our genome, transposons, also known as jumping genes. During his Ph.D., he studied the mechanisms that suppress transposons in animal germ cells to maintain animal fertility. Now his lab is building tools to understand how transposons are controlled in somatic tissues, particularly under aging and disease conditions, such as cancer.

Zhao Zhang receives Larry Sandler Memorial Award for outstanding PhD in Drosophila research https://genestogenomes.org/zhao-zhang-receives-larry-sandler-memorial-award-for-outstanding-phd-in-drosophila-research/ Fri, 13 Mar 2015 12:00:46 +0000

This year’s Larry Sandler Memorial Award for an outstanding PhD dissertation in Drosophila research was presented to Zhao Zhang. Zhang, pictured receiving the prestigious award from Erika Bach (New York University), delivered the award lecture on the opening night of last week’s 56th Annual Drosophila Research Conference in Chicago, IL, organized by GSA. He carried out the award-winning doctoral work at the University of Massachusetts Medical School, and is now a Junior Investigator as well as the newest staff member at the Carnegie Department of Embryology.

 

Zhao Zhang receives the 2015 Larry Sandler Memorial Award from Erika Bach, 56th Annual Drosophila Research Conference.

 

The Larry Sandler Memorial Award is given annually to honor an outstanding PhD dissertation in research using the fruit fly Drosophila. This powerful model organism is employed in many areas of research including genetics, disease, evolution, neurology, and more. The Larry Sandler Memorial Award was established in recognition of Dr. Larry Sandler’s many contributions to Drosophila genetics and his dedication to the training of Drosophila biologists.

“We congratulate Zhao on this exceptional honor,” said Allan Spradling, Director of Carnegie’s Department of Embryology and keynote speaker at last week’s conference. “He is exactly the sort of original, unconventional, and self-motivated researcher that Carnegie seeks to support. We look forward to his many accomplishments that lie ahead.”

Zhang delivered a stimulating award lecture describing his studies of transposons, DNA elements with the ability to “jump” around the genome. His doctoral research investigated how transposons are regulated in germ cells (eggs and sperm), with the goal of understanding how transposons contribute to genomic instability and to mutations that lead to inherited disease and cancer. In particular, his research has focused on the interplay between small RNA molecules known as piRNAs (Piwi-interacting RNAs), their recognition by the cell, and transposon silencing. His work has revealed novel insights into how piRNAs are processed by the cell and into the functional consequences of transposon activation and silencing. In addition to these discoveries, Zhang is recognized for his ability to integrate numerous cutting-edge and traditional technologies, while also developing novel ones.

Zhang received a B.S. in biotechnology from Shandong Agricultural University in Tai-an, China, and an M.S. in cell biology from Beijing Normal University. He carried out his doctoral work in the laboratories of Bill Theurkauf and Phil Zamore, and received his PhD in interdisciplinary studies from the University of Massachusetts Medical School in November 2013.

 

 

Old Transposable Elements, New Tricks https://genestogenomes.org/old-transposable-elements-new-tricks/ Tue, 04 Feb 2014 00:53:59 +0000

Transposable elements don’t proliferate in genomes at a steady pace; they often arrive in bursts. But models of neutral TE evolution assume transposition occurs at a constant rate. That makes it harder to test, for instance, whether low TE allele frequencies in a population are due to negative selection or just a recent transposition burst. In the February issue of GENETICS, Blumenstiel et al. describe a test for neutrality that doesn’t make this assumption. They applied their approach to retrotransposon insertion data from North American and African populations of Drosophila melanogaster and found that age alone explained more than 80% of the variance in allele frequencies. The new framework could also be useful for analyzing selection on other types of large insertion mutations, like gene duplications.
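The confound described above, that a young insertion is rare simply because it is young, can be illustrated with a small simulation. This is not the test from Blumenstiel et al.; it is a toy Wright-Fisher sketch of pure neutral drift, and the population size, replicate count, seed, and function names are arbitrary assumptions chosen only to keep the example quick.

```python
import random


def drift_frequency(n_copies: int, generations: int, rng: random.Random):
    """Follow one new neutral insertion (a single copy) through drift.

    Returns its final frequency, or None if it was lost along the way.
    """
    count = 1
    for _ in range(generations):
        p = count / n_copies
        # Binomial resampling of the next generation, one gene copy at a time.
        count = sum(1 for _ in range(n_copies) if rng.random() < p)
        if count == 0:
            return None
    return count / n_copies


def mean_surviving_frequency(age: int, n_copies: int = 50,
                             replicates: int = 2000, seed: int = 1) -> float:
    """Average frequency of insertions of a given age that are still present."""
    rng = random.Random(seed)
    runs = (drift_frequency(n_copies, age, rng) for _ in range(replicates))
    freqs = [f for f in runs if f is not None]
    return sum(freqs) / len(freqs)


young = mean_surviving_frequency(age=2)
old = mean_surviving_frequency(age=40)
print(f"mean frequency at age 2:  {young:.3f}")
print(f"mean frequency at age 40: {old:.3f}")
```

No selection acts in this simulation, yet the surviving young insertions sit at much lower frequencies than the surviving old ones. A recent burst of transposition can therefore produce many low-frequency insertions without any negative selection, which is why a test that accounts for allele age is useful.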

 

Blumenstiel J.P., M. He & C. M. Bergman (2013). An Age-of-Allele Test of Neutrality for Transposable Element Insertions, Genetics, 196 (2) 523-538. DOI: 10.1534/genetics.113.158147 http://www.genetics.org/content/196/2/523

Assembling a Colossus https://genestogenomes.org/assembling-a-colossus-2/ Thu, 16 Jan 2014 10:30:25 +0000

The loblolly pine genome is big. Bloated with retrotransposons and other repetitive sequences, it is seven times larger than the human genome and easily big enough to overwhelm standard genome assembly methods.

This forced the loblolly pine genome sequencing team, led by David Neale at the University of California, Davis, to look for ways to reduce the enormous complexity of their task.

The draft genome sequence, described in the latest issue of GENETICS and the journal Genome Biology, was pieced together from over 16 billion sequence reads. Spanning around 23 billion base pairs, it only just beats out the Norway spruce as the largest genome ever sequenced, but it is substantially more complete. For example, the N50 scaffold size of the current loblolly assembly is 66.9 Kbp, compared to 0.72 Kbp in the Norway spruce.
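For readers unfamiliar with the N50 statistic: it is the scaffold length at which, counting from the longest scaffold down, half of the total assembly has been covered. A larger N50 means more of the assembly sits in long, contiguous pieces. The sketch below shows the usual calculation; the function name and the scaffold lengths are made up for illustration and are not taken from either assembly.

```python
def n50(lengths):
    """Length L such that scaffolds of length >= L contain at least half
    of the total assembled bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("need at least one scaffold")


# Hypothetical scaffold lengths in Kbp; total is 300, so half is 150.
scaffolds = [100, 80, 50, 30, 20, 10, 10]
print(n50(scaffolds))  # 80, because 100 + 80 = 180 >= 150
```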

So how did they do it?

One strategy was to generate most of the sequence from part of a single pine nut. This tiny source material was the megagametophyte, which is the haploid tissue that provides nutrients to the developing diploid embryo. Despite the limited amount of DNA that can be extracted from this source, the reduced complexity of a haploid genome makes it easier to assemble. To link up all the sequence fragments from the haploid genome, the team also created DNA libraries from diploid needles of the parent genotype.

But this still left the assembly team, led by Steven Salzberg at Johns Hopkins University and James Yorke at the University of Maryland, with more data than their computational methods could handle.

The solution was a method of pre-processing the data into “super reads”, or larger chunks of contiguous haploid sequence that condensed many individual reads. In essence, they were dealing with the unambiguous parts of the problem first, and getting rid of a huge amount of overlapping and redundant data in the process.

The result was a 100-fold reduction in the amount of megagametophyte sequence that needed to be held in the memory of the assembly computer. That kind of reduction is not just handy for giant genomes; Salzberg says it also speeds up projects of more modest scale.

Luckily, says Salzberg, the loblolly genome project wasn’t held back by the masses of repeats that are typical of conifers. Even though around 82% of the loblolly pine genome is repetitive, it turns out that most of the repeats are evolutionarily ancient. That means they have diverged enough to no longer be a big stumbling block for assembly.

All this is good news for sequencing other conifer species, especially since the team is already tackling an even larger behemoth: the 35 gigabase genome of the sugar pine.

Check out the loblolly genome articles and other highlights of this month’s GENETICS.


Zimin A., Stevens K.A., Crepeau M.W., Holtz-Morris A., Koriabine M., Marcais G., Puiu D., Roberts M., Wegrzyn J.L., de Jong P.J., et al. (2014). Sequencing and Assembly of the 22-Gb Loblolly Pine Genome, Genetics, 196 (3) 875-890.

Wegrzyn J.L., Liechty J.D., Stevens K.A., Wu L.S., Loopstra C.A., Vasquez-Gross H.A., Dougherty W.M., Lin B.Y., Zieve J.J., Martinez-Garcia P.J., et al. (2014). Unique Features of the Loblolly Pine (Pinus taeda L.) Megagenome Revealed Through Sequence Annotation, Genetics, 196 (3) 891-909.

Neale D.B., Wegrzyn J.L., Stevens K.A., Zimin A.V., Puiu D., Crepeau M.W., Cardeno C., Koriabine M., Holtz-Morris A.E., Liechty J.D., et al. (2014). Decoding the massive genome of loblolly pine using haploid DNA and novel assembly strategies, Genome Biology, 15 (3) R59.
