

CrysFormer: Protein structure determination via Patterson maps, deep learning, and partial structure attention.

Author information

Pan Tom, Dun Chen, Jin Shikai, Miller Mitchell D, Kyrillidis Anastasios, Phillips George N

Affiliations

Department of Computer Science, Rice University, Houston, Texas 77005, USA.

Department of BioSciences, Rice University, Houston, Texas 77005, USA.

Publication information

Struct Dyn. 2024 Aug 14;11(4):044701. doi: 10.1063/4.0000252. eCollection 2024 Jul.

Abstract

Determining the atomic-level structure of a protein has been a decades-long challenge. However, recent advances in transformers and related neural network architectures have enabled researchers to significantly improve solutions to this problem. These methods use large datasets of sequence information and corresponding known protein template structures, if available. Yet, such methods only focus on sequence information. Other available prior knowledge could also be utilized, such as constructs derived from x-ray crystallography experiments and the known structures of the most common conformations of amino acid residues, which we refer to as partial structures. To the best of our knowledge, we propose the first transformer-based model that directly utilizes experimental protein crystallographic data and partial structure information to calculate electron density maps of proteins. In particular, we use Patterson maps, which can be directly obtained from x-ray crystallography experimental data, thus bypassing the well-known crystallographic phase problem. We demonstrate that our method, CrysFormer, achieves precise predictions on two synthetic datasets of peptide fragments in crystalline forms, one with two residues per unit cell and the other with fifteen. These predictions can then be used to generate accurate atomic models using established crystallographic refinement programs.
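To illustrate why a Patterson map needs no phase information, the following is a minimal sketch (not the authors' pipeline) that uses the standard definition of the Patterson function as the inverse Fourier transform of the squared structure-factor amplitudes. The 8x8x8 grid, the two point "atoms", and the function name `patterson_map` are illustrative assumptions and do not come from the paper.

```python
import numpy as np


def patterson_map(density):
    """Patterson map of a periodic density grid (hypothetical helper).

    Computed as the inverse FFT of |F|^2, which equals the circular
    autocorrelation of the density. Only amplitudes are used.
    """
    structure_factors = np.fft.fftn(density)
    intensities = np.abs(structure_factors) ** 2  # phases are discarded here
    return np.real(np.fft.ifftn(intensities))


# Toy unit cell (assumed for illustration): two point "atoms" on an 8x8x8 grid.
density = np.zeros((8, 8, 8))
density[1, 2, 3] = 1.0
density[4, 4, 6] = 2.0

patterson = patterson_map(density)

# Translating the whole structure changes only the phases of F,
# so the Patterson map is unchanged.
shifted = np.roll(density, 1, axis=0)
same_after_shift = np.allclose(patterson, patterson_map(shifted))
```

In this toy cell the origin peak equals the sum of squared densities (1 + 4 = 5), and peaks of height 1 x 2 = 2 appear at the interatomic vector (3, 2, 3) and its inverse (5, 6, 5) modulo the grid size. The map therefore encodes interatomic vectors rather than atomic positions, which is why recovering the electron density from it is a nontrivial inverse problem.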


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8d26/11326852/3cd578fd9552/SDTYAE-000011-044701_1-g001.jpg
