Chemistry-informed Machine Learning Explains Calcium-binding Proteins' Fuzzy Shape for Communicating Changes in the Atomic States of Calcium Ions.

Authors

Zhang Pengzhi, Nde Jules, Eliaz Yossi, Jennings Nathaniel, Cieplak Piotr, Cheung Margaret S

Affiliations

Center for Bioinformatics and Computational Biology, Houston Methodist Research Institute, Houston, TX, USA.

Department of Physics, University of Washington, Seattle, WA, USA.

Publication

ArXiv. 2024 Jul 24:arXiv:2407.17017v1.

Abstract

The fuzziness of proteins is a feature for communicating changes in cell signaling instigated by binding with secondary messengers, such as calcium ions, which are associated with the coordination of muscle contraction, neurotransmitter release, and gene expression. When binding to the disordered parts of a protein, calcium ions must balance their charge states with the shape of calcium-binding proteins and their versatile pool of partners, depending on the circumstances they transmit. However, it is unclear whether the limited experimental data available can be used to train models that accurately predict the charges of calcium-binding protein variants. Here, we developed a chemistry-informed machine-learning algorithm that implements a game-theoretic approach to explain the output of a machine-learning model without requiring an excessively large database for high-performance prediction of atomic charges. We used electronic structure data representing calcium ions, together with the structures of the disordered segments of calcium-binding peptides and their surrounding water molecules, to train several explainable models. Network theory was used to extract the topological features of atomic interactions in the structurally complex data dictated by the coordination chemistry of a calcium ion, a potent indicator of its charge state in a protein. With these designs, we provide a framework of explainable machine-learning models that uses domain knowledge to annotate the atomic charges of calcium ions in calcium-binding proteins in response to chemical changes in their environment, based on the limited amount of scientific data available in a genome space.
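As a rough illustration of what a game-theoretic explanation of a model's output can look like, the sketch below computes exact Shapley values for a single prediction. The abstract does not name the specific method, so the use of Shapley values here is an assumption. The model `toy_charge_model`, its coefficients, the feature names (coordination number, mean Ca-O distance, number of coordinating waters), and all numeric values are hypothetical and are not taken from the paper.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one sample `x` relative to `baseline`.

    A feature that is "absent" from a coalition takes its baseline value.
    The cost grows as 2**n, so this is only practical for a few features.
    """
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature i
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                present = set(subset)
                without_i = [x[j] if j in present else baseline[j] for j in range(n)]
                with_i = list(without_i)
                with_i[i] = x[i]
                # Marginal contribution of feature i to this coalition
                phi[i] += weight * (predict(with_i) - predict(without_i))
    return phi


def toy_charge_model(features):
    """Hypothetical stand-in for a trained charge predictor (not the paper's model)."""
    coordination, mean_distance, n_water = features
    return (2.0
            - 0.05 * coordination
            - 0.1 * mean_distance
            + 0.01 * coordination * n_water)


# Hypothetical feature vectors: [coordination number, mean Ca-O distance, waters]
x = [7.0, 2.4, 2.0]
baseline = [6.0, 2.5, 1.0]

phi = shapley_values(toy_charge_model, x, baseline)
for name, value in zip(["coordination", "mean_distance", "n_water"], phi):
    print(f"{name}: {value:+.4f}")
```

The attributions sum to the difference between the prediction for `x` and the prediction for `baseline`, which is the property that makes this kind of explanation additive. In practice, an approximate estimator would replace the exhaustive enumeration once the number of features grows.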


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8556/11302678/ab450c844920/nihpp-2407.17017v1-f0001.jpg
