| Google Scholar | https://scholar.google.com/citations?user=yfARKJgAAAAJ&hl=en |
Interestingly, models #1 and #2 did just as well as, or better than, the neural networks at predicting whether a particular sequence will generate a nanobody that is a "sticky" binder to the PSR (seen in the high areas under the ROC curves below).
This is nice because models #1 and #2 generate rules that are fairly straightforward for a human to interpret - for example, that such-and-such a motif of 3 sequential amino acids at positions XXX is a large contributor to nanobody "stickiness".
Of course, the previous studies attempted to correct for this, but Mashaal's paper and the accompanying manuscript find that the correction didn't work as well as hoped.
To show this, they used a much less genetically diverse sample - the UK Biobank ("UKB" below), which contains genomes of 300,000 white British people - and found that much of the signal of adaptation seen in a similar dataset from a more genetically diverse population ("GIANT") disappeared.