Eukaryotic genomes show strong evolutionary conservation of k-mer composition and correlation contributions between introns and intergenic regions

Bibliographic Details
Main Authors: Sievers, Aaron (Author), Sauer, Liane (Author), Hausmann, Michael (Author), Hildenbrand, Georg Lars (Author)
Format: Article (Journal)
Language: English
Published: 1 October 2021
In: Genes
Year: 2021, Volume: 12, Issue: 10, Pages: 1-20
ISSN: 2073-4425
DOI: 10.3390/genes12101571
Online Access: Publisher, license required, full text: https://doi.org/10.3390/genes12101571
Publisher, license required, full text: https://www.mdpi.com/2073-4425/12/10/1571
Author Notes: Aaron Sievers, Liane Sauer, Michael Hausmann and Georg Hildenbrand
Description
Summary: Several strongly conserved DNA sequence patterns in and between introns and intergenic regions (IIRs), consisting of short tandem repeats (STRs) with repeat unit lengths <3 bp, have already been described in the kingdom Animalia. In this work, we expanded the search for and analysis of conserved DNA sequence patterns to a wider range of eukaryotic genomes. Our aims were to confirm the conservation of these patterns, to support the hypothesis that they are subject to functional constraints, and/or to identify previously unknown patterns. We performed pairwise comparisons of the genomic DNA sequences of genes, exons, CDS, introns and intergenic regions of 34 Embryophyta (land plants), 30 Protista and 29 Fungi using established k-mer-based (alignment-free) comparison methods. Additionally, the results were compared with values derived for Animalia in earlier studies. We confirmed strong correlations between the sequence structures of IIRs spanning the entire domain of eukaryotes. We found that the high correlations within introns, within intergenic regions and between the two result from conserved abundances of STRs with repeat units ≤2 bp (e.g., (AT)n). For some sequence patterns and their inverse complementary sequences, we found a violation of equal distribution on complementary DNA strands in a subset of genomes. Looking at mismatches within the identified STR patterns, we found specific preferences for certain nucleotides that are stable across all four phylogenetic kingdoms. We conclude that all of these conserved patterns between IIRs indicate a shared function of these sequence structures related to STRs.
Item Description: Viewed on 1 December 2021
Physical Description: Online Resource
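
The k-mer-based, alignment-free comparison mentioned in the summary can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the record does not specify their correlation measure, their choice of k, or their preprocessing, so the Pearson correlation, k = 2, the toy sequences and all function names below (kmer_frequencies, pearson, reverse_complement, strand_counts) are assumptions made for illustration only.

```python
from collections import Counter
from itertools import product
from math import sqrt

ALPHABET = "ACGT"
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def kmer_frequencies(seq, k):
    """Relative frequencies of all 4**k k-mers, in lexicographic order."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    # Windows containing other symbols (e.g. N) are simply not counted.
    total = sum(counts[m] for m in kmers)
    if total == 0:
        raise ValueError("sequence contains no valid k-mers")
    return [counts[m] / total for m in kmers]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / sqrt(var_x * var_y)


def reverse_complement(seq):
    """Inverse complementary sequence, read 5' to 3' on the other strand."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def strand_counts(seq, kmer):
    """Overlapping counts of a k-mer and of its inverse complement in seq."""
    seq, kmer = seq.upper(), kmer.upper()
    k = len(kmer)
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    return counts[kmer], counts[reverse_complement(kmer)]


# Toy sequences (hypothetical, not taken from any genome).
intron_like = "ATATATATATAT"
intergenic_like = "TATATATATATA"
gc_rich = "GCGCGCGCGCGC"

k = 2
print(pearson(kmer_frequencies(intron_like, k),
              kmer_frequencies(intergenic_like, k)))
print(pearson(kmer_frequencies(intron_like, k),
              kmer_frequencies(gc_rich, k)))
print(strand_counts("AAACCC", "AA"))
```

In this sketch, two sequences dominated by the same (AT)n repeat have nearly identical k-mer spectra and therefore correlate strongly, whereas a sequence built from a different repeat does not. Comparing the two values returned by strand_counts gives a rough indication of whether a pattern and its inverse complement are equally distributed on one strand.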