Energy metric prediction for double insertion mutants via the RoseNet deep learning framework.

Authors

Coffland Sarah, Christensen Katie, Hutchinson Brian, Jagodzinski Filip

Affiliations

Computer Science Department, Western Washington University, Washington, 98225, United States.

Joint Global Change Research Institute, Pacific Northwest National Laboratory, Maryland, 20740, United States.

Publication information

Bioinform Adv. 2025 Jan 2;5(1):vbae198. doi: 10.1093/bioadv/vbae198. eCollection 2025.

Abstract

SUMMARY

Studying the structural and functional implications of protein mutations is an important task in computational biology and bioinformatics. We use our previously proposed RoseNet neural network architecture to predict energy metrics of proteins with double amino acid insertions or deletions (InDels). We train models on previously generated benchmark datasets containing the exhaustive double InDel mutations for three proteins, as well as on an additional three proteins for which random mutants, each with two InDels, have been generated. We extend our previous work by evaluating three additional proteins and by analyzing domain features that affect RoseNet's predictive performance: whether the InDels fall within secondary structures, and the solvent accessible surface area (SASA) scores of the residues. We find further evidence that RoseNet generalizes better to unseen residue combinations than to unseen insertion positions. We also observe that RoseNet produces higher-quality predictions for insertions into a β-sheet than for insertions into an α-helix. Additionally, RoseNet often performs better when the insertions fall in an area of high SASA than when they fall in an area of low SASA.
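The distinction between unseen residue combinations and unseen insertion positions can be made concrete with a small sketch. The code below is not taken from the RoseNet repository; it is a hypothetical illustration that assumes each double-insertion mutant is described by two distinct insertion positions and two inserted residues, and that a held-out split is formed by withholding whole position pairs or whole residue pairs. The names `enumerate_double_insertions` and `split_by_key`, the toy protein length of 5, and the 20% test fraction are all assumptions made for the example.

```python
import itertools
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def enumerate_double_insertions(n_positions):
    """Return every (pos1, pos2, res1, res2) with pos1 < pos2 (assumed encoding)."""
    mutants = []
    for pos1, pos2 in itertools.combinations(range(n_positions), 2):
        for res1, res2 in itertools.product(AMINO_ACIDS, repeat=2):
            mutants.append((pos1, pos2, res1, res2))
    return mutants


def split_by_key(mutants, key, test_fraction, seed=0):
    """Hold out all mutants whose key lies in a randomly chosen subset of keys."""
    keys = sorted({key(m) for m in mutants})
    rng = random.Random(seed)
    rng.shuffle(keys)
    n_test = max(1, int(len(keys) * test_fraction))
    held_out = set(keys[:n_test])
    train = [m for m in mutants if key(m) not in held_out]
    test = [m for m in mutants if key(m) in held_out]
    return train, test


# Toy protein with 5 insertion positions (hypothetical size).
mutants = enumerate_double_insertions(n_positions=5)

# Unseen insertion positions: no test position pair occurs in training.
pos_train, pos_test = split_by_key(
    mutants, key=lambda m: (m[0], m[1]), test_fraction=0.2
)

# Unseen residue combinations: no test residue pair occurs in training.
res_train, res_test = split_by_key(
    mutants, key=lambda m: (m[2], m[3]), test_fraction=0.2
)
```

Under this setup, a model evaluated on `pos_test` has seen every residue pair but never those position pairs, while a model evaluated on `res_test` has seen every position pair but never those residue pairs. Comparing errors on the two test sets is one way to probe the kind of generalization gap the summary describes.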

AVAILABILITY AND IMPLEMENTATION

The code used for training and evaluating the models in the study and the data underlying this article are available at https://github.com/hutchresearch/RoseNet.
