Zhou, Ye / Moscovich, Amit / Bartesaghi, Alberto. Data-driven determination of number of discrete conformations in single-particle cryo-EM. 2022. Computer Methods and Programs in Biomedicine. p. 106892
Background and objective: One of the strengths of single-particle cryo-EM compared to other structural determination techniques is its ability to image heterogeneous samples containing multiple molecular species, different oligomeric states, or distinct conformations. This is achieved using routines for in-silico 3D classification that are now well established in the field and have been used successfully to characterize the structural heterogeneity of important biomolecules. These techniques, however, rely on expert-user knowledge and trial-and-error experimentation to determine the correct number of conformations, making the procedure labor-intensive, subjective, and difficult to reproduce. Methods: We propose an approach to address the problem of automatically determining the number of discrete conformations present in heterogeneous single-particle cryo-EM datasets. We do this by systematically evaluating all possible partitions of the data and selecting the result that maximizes the average variance of similarities measured between particle images and the corresponding 3D reconstructions. Results: Using this strategy, we successfully analyzed datasets of heterogeneous protein complexes, including: 1) in-silico mixtures obtained by combining closely related antibody-bound HIV-1 Env trimers and other important membrane channels, and 2) naturally occurring mixtures from diverse and dynamic protein complexes representing varying degrees of structural heterogeneity and conformational plasticity. Conclusions: The availability of unsupervised strategies for 3D classification, combined with existing approaches for fully automatic pre-processing and 3D refinement, represents an important step towards converting single-particle cryo-EM into a high-throughput technique.
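
Illustrative sketch (not the authors' implementation): the selection rule stated in the Methods can be pictured as scoring each candidate partition and keeping the one with the highest score. The abstract does not say how similarities are computed or how the variance is averaged, so the code below makes two explicit assumptions: each particle carries one similarity score against the reconstruction of the class it was assigned to, and the score of a partition is the mean over classes of the population variance of those scores. The function names `average_similarity_variance` and `select_num_classes` are hypothetical.

```python
from statistics import pvariance


def average_similarity_variance(similarities, labels):
    """Mean over classes of the variance of per-particle similarity scores.

    ASSUMPTION: similarities[i] is the similarity between particle i and the
    3D reconstruction of the class labels[i]; the abstract does not define
    the similarity measure or the exact averaging scheme.
    """
    by_class = {}
    for score, label in zip(similarities, labels):
        by_class.setdefault(label, []).append(score)
    # Population variance within each class (0.0 for a single-particle class).
    variances = [pvariance(scores) for scores in by_class.values()]
    return sum(variances) / len(variances)


def select_num_classes(candidates):
    """Pick the candidate partition with the highest average variance.

    candidates maps a number of classes K to a (similarities, labels) pair
    produced by some external 3D classification run (hypothetical input).
    Returns the selected K and the score computed for every candidate.
    """
    scores = {
        k: average_similarity_variance(sims, labels)
        for k, (sims, labels) in candidates.items()
    }
    best_k = max(scores, key=scores.get)
    return best_k, scores
```

In this sketch the 3D classification itself is assumed to have been run beforehand for each candidate number of classes; only the final comparison across candidates is shown.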