2025Fu T2Relion

From 3DEM-Methods

Citation

Fu, J., Xu, J., Gan, L., Mao, T., Shen, Z., Wang, Y., Song, Z., Duan, X., Xue, W. and Yang, G. 2025. T2-RELION: Task Parallelism, Tensor Core Accelerated RELION for Cryo-EM 3D Reconstruction. Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (2025), 2186–2202.

Abstract

Cryo-electron microscopy (cryo-EM) is a key technique for structural biology, but its computational efficiency, particularly during 3D reconstruction, remains a bottleneck. We introduce T2-RELION, a highly optimized version of RELION for cryo-EM 3D reconstruction on CPU-GPU platforms. RELION is a widely used open-source package in the cryo-EM community. We identify and resolve key inefficiencies in RELION’s parallelization strategy and memory management by proposing task parallelism and a three-phase GPU memory management strategy. Furthermore, we leverage Tensor Cores to accelerate the hot-spot kernel for difference calculation, employing an advanced pipelining strategy to hide latency and enable thread-block-level data reuse. On a quad-A100 GPU machine, performance evaluations demonstrate that T2-RELION outperforms RELION 4.0. For the hot-spot kernel, our optimizations achieve 1.90 to 23.7 times speedup. For the whole application using CNG and Trpv1 datasets, we observe 3.86 times and 2.68 times speedups, respectively.

Keywords

https://dl.acm.org/doi/full/10.1145/3712285.3759824

Comments