Laboratory of Physical Chemistry, ETH Zurich, Vladimir-Prelog-Weg 2, 8093, Zurich, Switzerland.
Institute of Biophysical Chemistry, Goethe University Frankfurt, Max-von-Laue-Str. 9, 60438, Frankfurt am Main, Germany.
Nat Commun. 2022 Oct 18;13(1):6151. doi: 10.1038/s41467-022-33879-5.
Nuclear Magnetic Resonance (NMR) spectroscopy is a major technique in structural biology with over 11,800 protein structures deposited in the Protein Data Bank. NMR can elucidate structures and dynamics of small and medium-sized proteins in solution, living cells, and solids, but has been limited by the tedious data analysis process. It typically requires weeks or months of manual work by a trained expert to turn NMR measurements into a protein structure. Automation of this process is an open problem, formulated in the field over 30 years ago. We present a solution to this challenge that enables the completely automated analysis of protein NMR data within hours of completing the measurements. Using only NMR spectra and the protein sequence as input, our machine learning-based method, ARTINA, delivers signal positions, resonance assignments, and structures strictly without human intervention. Tested on a 100-protein benchmark comprising 1329 multidimensional NMR spectra, ARTINA demonstrated its ability to solve structures with 1.44 Å median RMSD to the PDB reference and to identify 91.36% correct NMR resonance assignments. ARTINA can be used by non-experts, reducing the effort for a protein assignment or structure determination by NMR essentially to the preparation of the sample and the measurement of the spectra.