1/19
If the protein or complex was purified from its native host organism:
1. I sequence the map with `model_angelo build_no_seq`.
2. I identify all proteins with `model_angelo hmm_search` against the reference proteome of the host organism (downloaded from #UniProt).
If the protein or complex was prepared recombinantly, I already know the sequences of all proteins.
So at this stage, either way, I know which proteins are in there.
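A minimal sketch of the native-host route above, written as Python helpers that only assemble the download URL and the two command lines (nothing is executed). The short flags (`-v`, `-o`, `-i`, `-f`) are my reading of the ModelAngelo README and should be checked against `model_angelo build_no_seq --help` and `model_angelo hmm_search --help`. The UniProt REST query form is likewise an assumption, and the file names and the proteome ID are hypothetical placeholders.

```python
import shlex
from urllib.parse import urlencode


def proteome_fasta_url(proteome_id: str) -> str:
    """URL assumed to stream every UniProtKB entry of one proteome as FASTA."""
    query = urlencode({"query": f"proteome:{proteome_id}", "format": "fasta"})
    return f"https://rest.uniprot.org/uniprotkb/stream?{query}"


def build_no_seq_cmd(map_path: str, out_dir: str) -> list[str]:
    """Step 1: build a model into the map without providing sequences."""
    # Flags assumed from the ModelAngelo README; verify with --help.
    return ["model_angelo", "build_no_seq", "-v", map_path, "-o", out_dir]


def hmm_search_cmd(build_dir: str, proteome_fasta: str, out_dir: str) -> list[str]:
    """Step 2: search the HMM profiles from step 1 against the host proteome."""
    # Flags assumed from the ModelAngelo README; verify with --help.
    return [
        "model_angelo", "hmm_search",
        "-i", build_dir,
        "-f", proteome_fasta,
        "-o", out_dir,
    ]


if __name__ == "__main__":
    # Placeholder values; replace with the real proteome ID and file paths.
    print(proteome_fasta_url("UPXXXXXXXXX"))
    print(shlex.join(build_no_seq_cmd("map.mrc", "build_no_seq_out")))
    print(shlex.join(hmm_search_cmd("build_no_seq_out", "proteome.fasta", "hmm_out")))
```

Printing the commands rather than running them makes it easy to inspect them first and paste them into a shell or a cluster submission script.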
3/19
Once I'm happy, or once I have given up on modeling the last remaining bits of unexplained density or on fixing the last subtle model errors in places mostly irrelevant to the biological question the model is meant to answer, I run `phenix.validation_cryoem` on the half-maps and the atomic model. This gives me most of the numbers to put in the refinement table. Then I deposit the model in the #PDB and the maps in the #EMDB. Fin.
Happy to hear any comments or descriptions of how other people do this!
18/19
19/19
Links to the resources and programs mentioned in this thread:
https://github.com/3dem/model-angelo
https://www.cgl.ucsf.edu/chimerax/
https://github.com/tristanic/isolde
https://github.com/rsanchezgarc/deepEMhancer
https://phenix-online.org/documentation/reference/real_space_refine.html
https://github.com/keitaroyam/servalcat/
https://phenix-online.org/documentation/reference/validation_cryo_em.html
An addition to this thread from spring 2025: I now also use EMReady to produce a post-processed map. It follows a different formalism than deepEMhancer, and I have observed a case where they disagreed on the location of the main chain over a loop of ~4 residues. This was helpful to identify a mixture of states and model them properly.
EMReady paper: https://doi.org/10.1038/s41467-023-39031-1
EMReady website: http://huanglab.phys.hust.edu.cn/EMReady/
Regarding water molecules, I recently added `phenix.douse` to my toolbox. It is very easy to use thanks to the integration between ChimeraX and Phenix, and in my (still limited) experience it produces good results. I run it when the model is reasonably complete, to limit false positives (water molecules placed in density that is not yet modeled but clearly belongs to something else).