I seem to do my best thinking on a train.
Thinking about the genomic hallmarks of host adaptation in bacteria. Genome size is a blunt measure between species. Pseudogenes are another, within a species.
Thinking about what else we could reliably use...
Review of "Population genomics of bacterial host adaptation" from @[email protected]
https://www.nature.com/articles/s41576-018-0032-z
How do we convert this into a braindead workflow: insert genome -> get generalist/adapted prediction (prospectively)?
High-throughput sequencing technologies have enabled comparative analysis of large numbers of diverse bacterial genomes. Such studies are providing insights into the genomic changes that accompany changes in host specificity, with possible implications for controlling transmission of pathogenic bacteria.
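The "insert genome -> get prediction" idea could look roughly like the sketch below. It is a toy only: it uses just the two blunt hallmarks mentioned earlier (genome size and pseudogene fraction), a nearest-centroid rule I picked for simplicity, and made-up training numbers. The names `GenomeSummary` and `CentroidClassifier` are hypothetical, and none of this comes from the review.

```python
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass(frozen=True)
class GenomeSummary:
    """Hypothetical per-genome summary; both fields are assumed inputs."""
    genome_size_mb: float
    pseudogene_fraction: float  # pseudogenes / total predicted genes

    def as_vector(self):
        return [self.genome_size_mb, self.pseudogene_fraction]


class CentroidClassifier:
    """Nearest-centroid classifier on z-scored features (illustrative only)."""

    def fit(self, genomes, labels):
        # Per-feature mean and standard deviation, used to put features on one scale.
        columns = list(zip(*(g.as_vector() for g in genomes)))
        self.means = [mean(c) for c in columns]
        self.sds = [pstdev(c) or 1.0 for c in columns]  # avoid dividing by zero
        # One centroid per label, computed in the scaled feature space.
        self.centroids = {}
        for label in set(labels):
            rows = [self._scale(g) for g, l in zip(genomes, labels) if l == label]
            self.centroids[label] = [mean(col) for col in zip(*rows)]
        return self

    def _scale(self, genome):
        return [(x - m) / s
                for x, m, s in zip(genome.as_vector(), self.means, self.sds)]

    def predict(self, genome):
        v = self._scale(genome)

        def squared_distance(centroid):
            return sum((a - b) ** 2 for a, b in zip(v, centroid))

        return min(self.centroids, key=lambda lab: squared_distance(self.centroids[lab]))


# Synthetic training data, invented purely to make the sketch runnable.
training = [
    (GenomeSummary(4.9, 0.010), "generalist"),
    (GenomeSummary(4.8, 0.015), "generalist"),
    (GenomeSummary(5.0, 0.012), "generalist"),
    (GenomeSummary(4.6, 0.050), "adapted"),
    (GenomeSummary(4.5, 0.060), "adapted"),
    (GenomeSummary(4.7, 0.045), "adapted"),
]
model = CentroidClassifier().fit([g for g, _ in training], [l for _, l in training])

print(model.predict(GenomeSummary(4.55, 0.055)))
```

A real version would need far richer features than these two and properly labelled genomes; the point here is only the shape of the pipeline (summarise genome, then predict).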
@happykhan you might be interested in the approach Nicole Wheeler and I took to this a few years back: https://journals.plos.org/plosgenetics/article?id=10.1371/journal.pgen.1007333
Would be interesting to try to scale up to much larger collections of genomes...
Author summary: Researchers are now collecting a wealth of genomic data from bacterial pathogens, and this will continue to grow with the introduction of routine sequencing for disease surveillance. However, our ability to use this data to predict how changes in genome sequence lead to differences in disease is limited. Here, we have used machine learning to detect an enrichment in functionally significant mutations in genes associated with a shift in pathogenic niche. This approach captures convergence in functional outcomes that does not necessarily result in a convergence in sequence, facilitating the inclusion of rare variants of large effect in an analysis, and allowing for complex interactions between genes. We apply this approach to Salmonella, showing that we can detect changes associated with disease phenotype in emerging lineages associated with the HIV epidemic. This approach should be applicable to other bacterial species with lineages independently adapting to similar niches. We provide open-source implementations of both the predictive model, and the workflow used to build it.