
Study reveals insights into protein evolution
Rice University's Peter Wolynes and his research team have unveiled a breakthrough in understanding how specific genetic sequences, known as pseudogenes, evolve. Their paper was published May 13 in the Proceedings of the National Academy of Sciences.
Phys.org
Fascinating - I'll have to make time to dig into this one later.
"Many purported pseudogenes in bacterial genomes are bona fide genes"
https://bmcgenomics.biomedcentral.com/articles/10.1186/s12864-024-10137-0
#microbiology #pseudogenes

Many purported pseudogenes in bacterial genomes are bona fide genes - BMC Genomics
Background: Microbial genomes are largely comprised of protein coding sequences, yet some genomes contain many pseudogenes caused by frameshifts or internal stop codons. These pseudogenes are believed to result from gene degradation during evolution but could also be technical artifacts of genome sequencing or assembly.
Results: Using a combination of observational and experimental data, we show that many putative pseudogenes are attributable to errors that are incorporated into genomes during assembly. Within 126,564 publicly available genomes, we observed that nearly identical genomes often substantially differed in pseudogene counts. Causal inference implicated assembler, sequencing platform, and coverage as likely causative factors. Reassembly of genomes from raw reads confirmed that each variable affects the number of putative pseudogenes in an assembly. Furthermore, simulated sequencing reads corroborated our observations that the quality and quantity of raw data can significantly impact the number of pseudogenes in an assembler-dependent fashion. The number of unexpected pseudogenes due to internal stops was highly correlated (R² = 0.96) with average nucleotide identity to the ground truth genome, implying relative pseudogene counts can be used as a proxy for overall assembly correctness. Applying our method to assemblies in RefSeq resulted in rejection of 3.6% of assemblies due to significantly elevated pseudogene counts. Reassembly from real reads obtained from high coverage genomes showed considerable variability in spurious pseudogenes beyond that observed with simulated reads, reinforcing the finding that high coverage is necessary to mitigate assembly errors.
Conclusions: Collectively, these results demonstrate that many pseudogenes in microbial genome assemblies are actually genes. Our results suggest that high read coverage is required for correct assembly and indicate an inflated number of pseudogenes due to internal stops is indicative of poor overall assembly quality.
BioMed Central
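For readers unfamiliar with the two signatures the abstract mentions (frameshifts and internal stop codons), here is a minimal Python sketch of how a coding sequence might be flagged as a putative pseudogene. It is an illustration only, not the paper's method: the function names are hypothetical, the standard stop codons TAA/TAG/TGA are assumed, and a length that is not a multiple of three is used as a crude stand-in for a frameshift.

```python
# Illustrative sketch only; not the pipeline used in the BMC Genomics paper.
# Assumption: standard stop codons. Some organisms use other translation tables.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def internal_stop_positions(cds: str) -> list[int]:
    """Return 0-based offsets of in-frame stop codons that occur before the final codon."""
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3   # ignore a trailing partial codon
    last_codon_start = usable - 3      # the final codon is expected to be a stop
    return [
        i
        for i in range(0, last_codon_start, 3)
        if seq[i:i + 3] in STOP_CODONS
    ]


def looks_like_pseudogene(cds: str) -> bool:
    """Flag a sequence whose length breaks the reading frame or that has an internal stop."""
    return len(cds) % 3 != 0 or bool(internal_stop_positions(cds))


intact = "ATGGCTGAATAA"       # ATG GCT GAA TAA: only the terminal stop
internal_stop = "ATGTAGGAATAA"  # ATG TAG GAA TAA: premature stop at offset 3
frameshifted = "ATGGCGAATAA"    # 11 nt: one base lost, reading frame broken

print(internal_stop_positions(intact))
print(internal_stop_positions(internal_stop))
print(looks_like_pseudogene(frameshifted))
```

A single miscalled or dropped base during assembly is enough to turn the first sequence into either of the other two, which is why the abstract treats inflated pseudogene counts as a possible symptom of assembly error rather than of real gene decay.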
5S Ribosomal DNA in the Family Plumbaginaceae - Cytology and Genetics
Abstract: Tandemly arranged repetitive regions (repeats) that encode 5S rRNA (5S rDNA) are an indispensable component of eukaryotic genomes. Typically, 5S rDNA repeats within a genome are very similar because repeats of this type evolve in a concerted manner. Each 5S rDNA repeat consists of an evolutionarily conserved coding sequence (CDS) and a variable intergenic spacer (IGS). 5S rDNA is a popular model for studying the molecular evolution of repetitive sequences, and the high rate of IGS mutations accounts for its wide use in phylogenetic analysis of closely related taxa. Nevertheless, 5S rDNA remains unexplored for many groups of higher plants, especially for the Plumbaginaceae family. Some taxa of this family are endemic to southern Ukraine and listed in the Red Book. However, their taxonomic status is controversial, and its clarification requires the use of molecular phylogenetic methods. In this work, we examined the molecular organization of 5S rDNA for representatives of four genera of the tribe Limonieae, the largest in the family Plumbaginaceae. It was shown that the CDS of 5S rDNA of representatives of the genera Limonium, Armeria, and Ceratolimon carries single mutations that do not affect the formation of the secondary structure of 5S rRNA. In contrast, in the genomes of Goniolimon species, in addition to functionally normal 5S rDNA repeats, numerous pseudogenes were found that do not evolve in a concerted manner and contain numerous mutations in the CDS that disrupt the secondary structure of 5S rRNA. A significant phylogenetic distance between representatives of the subgenera Pteroclados and Limonium of the genus Limonium indicates that Pteroclados can be considered a separate genus. The high rate of molecular evolution makes 5S rDNA IGS a convenient tool for the reconstruction of phylogenetic relationships within the studied genera of the tribe Limonieae and the barcoding of Ukrainian endemics of the genus Limonium.
SpringerLink