Abstract
Repetitive elements (REs) and transposons (TEs) can comprise up to 80% of some plant genomes and may be essential for regulating their evolution and adaptation. The “repeatome” information is often unavailable in assembled genomes because genomic regions containing repeats are challenging to assemble and are often missing from the final assembly. However, raw genomic sequencing data contain rich information about REs and TEs. Here, raw genomic NGS reads of 10 gymnosperm species were studied for the content and abundance patterns of their “repeatome”. We combined alignment against databases of repetitive elements with de novo assembly of highly repetitive sequences from genomic sequencing reads to characterize known and putative repetitive elements and calculate their abundance in the genomes of 10 gymnosperms: Pinus taeda, Pinus sylvestris, Pinus sibirica, Picea glauca, Picea abies, Abies sibirica, Larix sibirica, Juniperus communis, Taxus baccata, and Gnetum gnemon. We found that the genome abundances of known and newly discovered putative repeats are specific to phylogenetically close groups of species and match biological taxa. The grouping of species based on abundances of known repeats closely matches both the grouping based on abundances of newly discovered putative repeats (kChains) and the known taxonomic relations.
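The abundance-based grouping described in the abstract can be illustrated with a minimal sketch of principal component analysis on a species-by-repeat abundance matrix. This is not the authors' pipeline: the function name `pca_scores`, the species labels, and the read counts are all hypothetical placeholders chosen only to show the idea, and normalizing counts to per-species fractions is an assumption rather than a step stated in the source.

```python
import numpy as np

def pca_scores(abundance, n_components=2):
    """Project species (rows) onto the top principal components."""
    x = np.asarray(abundance, dtype=float)
    # Assumption: convert raw counts to per-species fractions so that
    # differences in sequencing depth do not dominate the analysis.
    x = x / x.sum(axis=1, keepdims=True)
    # Center each repeat family (column) at zero mean.
    x = x - x.mean(axis=0)
    # SVD of the centered matrix; rows of vt are the principal axes.
    u, s, vt = np.linalg.svd(x, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]
    explained = (s ** 2) / np.sum(s ** 2)
    return scores, explained[:n_components]

# Placeholder labels and made-up read counts, NOT data from the study.
species = ["species_A1", "species_A2", "species_B1", "species_B2"]
counts = np.array([
    [900, 100, 50],
    [880, 120, 60],
    [200, 700, 300],
    [220, 680, 310],
])

scores, explained = pca_scores(counts)
for name, (pc1, pc2) in zip(species, scores):
    print(f"{name}: PC1={pc1:+.3f} PC2={pc2:+.3f}")
```

In this toy input the two "A" species share a repeat profile and the two "B" species share another, so they fall on opposite sides of the first principal component. With real abundance tables, the same projection would show whether species with similar repeat content cluster together.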
Original language | English |
---|---|
Article number | 1234 |
Journal | Life |
Volume | 11 |
Issue number | 11 |
DOIs | |
State | Published - 15 Nov 2021 |
Bibliographical note
Publisher Copyright: © 2021 by the authors. Licensee MDPI, Basel, Switzerland.
Keywords
- Gymnosperms
- Principal component analysis
- Repetitive elements
ASJC Scopus subject areas
- Ecology, Evolution, Behavior and Systematics
- General Biochemistry, Genetics and Molecular Biology
- Space and Planetary Science
- Paleontology