Boltzmann Machine Learning and Regularization Methods for Inferring Evolutionary Fields and Couplings From a Multiple Sequence Alignment.

Publication information

IEEE/ACM Trans Comput Biol Bioinform. 2022 Jan-Feb;19(1):328-342. doi: 10.1109/TCBB.2020.2993232. Epub 2022 Feb 3.

DOI:10.1109/TCBB.2020.2993232
PMID:32396099
Abstract

The inverse Potts problem, which infers a Boltzmann distribution for homologous protein sequences from their single-site and pairwise amino acid frequencies, has recently attracted a great deal of attention in studies of protein structure and evolution. We study regularization and learning methods, and how to tune regularization parameters so that interactions are correctly inferred in Boltzmann machine learning. With L2 regularization for fields, group L1 regularization for couplings is shown to be very effective for sparse couplings in comparison with L2 and L1. Two regularization parameters are tuned to yield equal values for the sample and ensemble averages of evolutionary energy. Both averages change smoothly and converge, but their learning profiles differ greatly between learning methods. The Adam method is modified to make the step size proportional to the gradient for sparse couplings and to use a soft-thresholding function for group L1. By first inferring interactions from protein sequences and then from Monte Carlo samples, it is shown that the fields and couplings can be well recovered, but that recovering the pairwise correlations in the resolution of a total energy is harder for the natural proteins than for the protein-like sequences. The selective temperature for folding/structural constraints in protein evolution is also estimated.
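The soft-thresholding step mentioned in the abstract can be illustrated with the standard proximal operator for a group penalty. The sketch below is not the paper's modified Adam update. It only assumes that each pair of sites has a block of couplings that is shrunk as a unit, so that a whole block is set to zero when its Euclidean norm falls below the threshold. The function name `group_soft_threshold` and the nested-list layout of a block are hypothetical.

```python
import math


def group_soft_threshold(block, threshold):
    """Shrink one coupling block toward zero as a single group.

    `block` is a nested list of floats (hypothetical layout: one row per
    amino acid type at site i, one column per type at site j).
    `threshold` plays the role of (step size x regularization weight).
    """
    # Euclidean (Frobenius) norm of the whole block.
    norm = math.sqrt(sum(x * x for row in block for x in row))
    # Blocks with a small norm are removed entirely, which gives sparsity
    # at the level of site pairs rather than individual entries.
    if norm <= threshold:
        return [[0.0 for _ in row] for row in block]
    # Otherwise every entry is scaled by the same factor.
    scale = 1.0 - threshold / norm
    return [[scale * x for x in row] for row in block]


if __name__ == "__main__":
    weak = [[0.1, -0.2], [0.0, 0.1]]
    strong = [[3.0, 4.0], [0.0, 0.0]]
    print(group_soft_threshold(weak, 1.0))    # whole block zeroed
    print(group_soft_threshold(strong, 1.0))  # block kept, scaled down
```

Because the scaling factor is shared across the block, the relative sizes of the entries within a retained block are unchanged. An entrywise L1 threshold would instead zero individual entries independently.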

Similar articles

1
Boltzmann Machine Learning and Regularization Methods for Inferring Evolutionary Fields and Couplings From a Multiple Sequence Alignment.
IEEE/ACM Trans Comput Biol Bioinform. 2022 Jan-Feb;19(1):328-342. doi: 10.1109/TCBB.2020.2993232. Epub 2022 Feb 3.
2
Selection originating from protein stability/foldability: Relationships between protein folding free energy, sequence ensemble, and fitness.
J Theor Biol. 2017 Nov 21;433:21-38. doi: 10.1016/j.jtbi.2017.08.018. Epub 2017 Aug 24.
3
PPalign: optimal alignment of Potts models representing proteins with direct coupling information.
BMC Bioinformatics. 2021 Jun 10;22(1):317. doi: 10.1186/s12859-021-04222-4.
4
Low-dose CT reconstruction via L1 dictionary learning regularization using iteratively reweighted least-squares.
Biomed Eng Online. 2016 Jun 18;15(1):66. doi: 10.1186/s12938-016-0193-y.
5
Benchmarking Inverse Statistical Approaches for Protein Structure and Design with Exactly Solvable Models.
PLoS Comput Biol. 2016 May 13;12(5):e1004889. doi: 10.1371/journal.pcbi.1004889. eCollection 2016 May.
6
Sparse generative modeling via parameter reduction of Boltzmann machines: Application to protein-sequence families.
Phys Rev E. 2021 Aug;104(2-1):024407. doi: 10.1103/PhysRevE.104.024407.
7
How Pairwise Coevolutionary Models Capture the Collective Residue Variability in Proteins?
Mol Biol Evol. 2018 Apr 1;35(4):1018-1027. doi: 10.1093/molbev/msy007.
8
adabmDCA: adaptive Boltzmann machine learning for biological sequences.
BMC Bioinformatics. 2021 Oct 29;22(1):528. doi: 10.1186/s12859-021-04441-9.
9
L1-norm locally linear representation regularization multi-source adaptation learning.
Neural Netw. 2015 Sep;69:80-98. doi: 10.1016/j.neunet.2015.01.009. Epub 2015 Feb 25.
10
Pairwise Constraint-Guided Sparse Learning for Feature Selection.
IEEE Trans Cybern. 2016 Jan;46(1):298-310. doi: 10.1109/TCYB.2015.2401733. Epub 2015 Jul 6.