Cryo-EM is a rapidly growing field, and with advances in both hardware and software, near-atomic-resolution reconstructions are now achievable. A cryo-EM pipeline involves several data processing and reconstruction steps, a number of which require manual intervention. With the massive growth in the volume of cryo-EM data being collected, more automated processing methods will help streamline the pipeline. Although cryo-EM can resolve high-resolution detail, the majority of the available data still needs better methods before finer details can be interpreted.
SciML runs several projects and collaborations that explore how machine learning models can be applied at various steps of the data processing and interpretation pipeline. Ongoing projects target steps such as particle picking, 2D classification, denoising, segmentation, and validation. SciML team members are actively pursuing these projects in collaboration with eBIC and CCP-EM.
Specifically, the projects aim to develop machine learning-based algorithms for better and faster interpretation of data, and to collate benchmark datasets for testing and enabling method development in the cryo-EM field.
The figure shows the SARS-CoV-2 Spike protein trimer (green, pink and red ribbons) determined using cryo-EM (EMDB: 21452 and PDB: 6VXX).