Molecular complex detection in protein interaction networks through reinforcement learning.

Author affiliations

Department of Molecular Biosciences, Center for Systems and Synthetic Biology, University of Texas, Austin, TX, 78712, USA.

Oden Institute for Computational Engineering and Sciences, University of Texas, Austin, TX, 78712, USA.

Publication information

BMC Bioinformatics. 2023 Aug 2;24(1):306. doi: 10.1186/s12859-023-05425-7.

Abstract

BACKGROUND

Proteins often assemble into higher-order complexes to perform their biological functions. Such protein-protein interactions (PPIs) are often experimentally measured for pairs of proteins and summarized in a weighted PPI network, to which community detection algorithms can be applied to define the various higher-order protein complexes. Current methods include unsupervised and supervised approaches, and often assume that protein complexes manifest only as dense subgraphs. In supervised approaches, the focus is on learning which subgraphs correspond to complexes rather than on how to find those subgraphs in a network, a search that is currently handled with heuristics. However, learning to walk trajectories on a network to identify protein complexes leads naturally to a reinforcement learning (RL) approach, a strategy not extensively explored for community detection. Here, we develop and evaluate a reinforcement learning pipeline for community detection on weighted protein-protein interaction networks to detect new protein complexes. The algorithm is trained to calculate the value of different subgraphs encountered while walking on the network to reconstruct known complexes. A distributed prediction algorithm then scales the RL pipeline to search for novel protein complexes on large PPI networks.
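To make the idea of "learning the value of subgraphs encountered while walking" more concrete, the following is a minimal Python sketch, not the authors' implementation. Everything in it is an assumption chosen for illustration: the toy network `GRAPH`, the training set `KNOWN`, the use of rounded weighted density as the state, the greedy choice of the next node, the +1/-1 rewards, and the Monte Carlo update (the abstract does not specify any of these details).

```python
from itertools import combinations

# Hypothetical toy network: protein -> {neighbor: interaction weight}.
GRAPH = {
    "A": {"B": 0.9, "C": 0.8, "D": 0.1},
    "B": {"A": 0.9, "C": 0.7},
    "C": {"A": 0.8, "B": 0.7, "D": 0.2},
    "D": {"A": 0.1, "C": 0.2, "E": 0.9},
    "E": {"D": 0.9},
}
# Hypothetical "known complexes" used as training labels.
KNOWN = [{"A", "B", "C"}, {"D", "E"}]


def density(graph, nodes):
    """Weighted density of the induced subgraph, rounded so it can serve as a state."""
    n = len(nodes)
    if n < 2:
        return 0.0
    total = sum(graph[u].get(v, 0.0) for u, v in combinations(sorted(nodes), 2))
    return round(total / (n * (n - 1) / 2), 1)


def best_neighbor(graph, nodes):
    """Outside node with the largest summed edge weight into the current subgraph."""
    scores = {}
    for u in nodes:
        for v, w in graph[u].items():
            if v not in nodes:
                scores[v] = scores.get(v, 0.0) + w
    return max(sorted(scores), key=scores.get) if scores else None


def train(graph, complexes, gamma=0.9, alpha=0.5, max_steps=10):
    """Estimate a value for each density state from walks seeded inside known complexes."""
    values = {}
    for complex_ in complexes:
        for seed in sorted(complex_):
            nodes, episode = {seed}, []
            for _ in range(max_steps):
                cand = best_neighbor(graph, nodes)
                if cand is None:
                    break
                nodes = nodes | {cand}
                # Assumed reward: +1 for staying inside the complex, -1 for leaving it.
                reward = 1.0 if cand in complex_ else -1.0
                episode.append((density(graph, nodes), reward))
                if reward < 0:  # the walk left the known complex, so the episode ends
                    break
            ret = 0.0
            for state, reward in reversed(episode):  # discounted Monte Carlo returns
                ret = reward + gamma * ret
                old = values.get(state, 0.0)
                values[state] = old + alpha * (ret - old)
    return values


def predict(graph, values, seed, max_steps=10):
    """Grow a candidate complex from a seed while the next state has positive value."""
    nodes = {seed}
    for _ in range(max_steps):
        cand = best_neighbor(graph, nodes)
        if cand is None:
            break
        grown = nodes | {cand}
        if values.get(density(graph, grown), 0.0) <= 0.0:
            break  # unseen or negatively valued state: stop walking
        nodes = grown
    return nodes


VALUES = train(GRAPH, KNOWN)
print(sorted(predict(GRAPH, VALUES, "A")))
print(sorted(predict(GRAPH, VALUES, "E")))
```

On this toy network the walk seeded at `A` stops after collecting `A`, `B`, and `C`, and the walk seeded at `E` stops at `D` and `E`, because adding any further node lands in a density state that training associated with leaving a known complex. Because each seed is processed independently in `predict`, a prediction step of this shape could in principle be distributed across seeds, which is one plausible reading of the scalability claim.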

RESULTS

The reinforcement learning pipeline is applied to a human PPI network consisting of 8k proteins and 60k PPIs, which results in 1,157 protein complexes. The method demonstrated competitive accuracy with improved speed compared to previous algorithms. We highlight complexes involving proteins such as C4orf19, C18orf21, and KIAA1522, which are currently minimally characterized. Additionally, the results suggest that TMC04 is a putative additional subunit of the KICSTOR complex and, through 3D structural modeling, confirm the involvement of C15orf41 in a higher-order complex with HIRA, CDAN1, and ASF1A.

CONCLUSIONS

Reinforcement learning offers several distinct advantages for community detection, including scalability and knowledge of the walk trajectories defining those communities. Applied to currently available human protein interaction networks, this method achieved accuracy comparable to that of other algorithms with notable savings in computational time and, in turn, led to clear predictions of protein function and interactions for several uncharacterized human proteins.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3def/10394916/d28061611cfb/12859_2023_5425_Fig1_HTML.jpg
