Identification of all-against-all protein-protein interactions based on deep hash learning.

Author information

College of Computer and Cyber Security, Fujian Normal University, Fuzhou, 350108, People's Republic of China.

No. 2 Thoracic Surgery Department Beijing Chest Hospital, Capital Medical University, Beijing Tuberculosis and Thoracic Tumor Research Institute, Beijing, 101149, People's Republic of China.

Publication information

BMC Bioinformatics. 2022 Jul 8;23(1):266. doi: 10.1186/s12859-022-04811-x.

Abstract

BACKGROUND

Protein-protein interaction (PPI) is vital for life processes, disease treatment, and drug discovery. Computational prediction of PPI is relatively inexpensive and efficient compared with traditional wet-lab experiments. Given a new protein, one may wish to determine whether it has a PPI relationship with any existing protein. Current computational PPI prediction methods usually compare the new protein with existing proteins one by one in a pairwise manner, which is time-consuming.

RESULTS

In this work, we propose a more efficient model, called deep hash learning protein-and-protein interaction (DHL-PPI), to predict all-against-all PPI relationships in a database of proteins. First, DHL-PPI encodes a protein sequence into a binary hash code based on deep features extracted from the sequence using deep learning techniques. This encoding scheme turns the PPI discrimination problem into a much simpler search problem. The binary hash code for a protein sequence can be regarded as a number. Thus, in the pre-screening stage of DHL-PPI, the string matching problem of comparing a protein sequence against a database of M proteins can be transformed into a much simpler problem: finding a number in a sorted array of length M. This pre-screening process narrows the search down to a much smaller set of candidate proteins for further confirmation. As a final step, DHL-PPI uses the Hamming distance to verify the PPI relationship.
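The search-then-verify idea described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how the pre-screening narrows the candidate set or which Hamming threshold is used, so the prefix-based range lookup, the constants `CODE_BITS`, `PREFIX_BITS`, and `MAX_HAMMING`, and the function names are all assumptions made for illustration. The sketch also assumes the hash codes have already been produced by some trained encoder, which is not shown.

```python
from bisect import bisect_left, bisect_right

# All three constants are hypothetical; the abstract gives no values.
CODE_BITS = 16     # assumed length of each binary hash code
PREFIX_BITS = 8    # assumed number of leading bits used for pre-screening
MAX_HAMMING = 2    # assumed threshold for the final verification


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two integer codes differ."""
    return bin(a ^ b).count("1")


def build_index(codes: dict):
    """Sort (code, name) pairs once so later lookups can use binary search."""
    pairs = sorted((code, name) for name, code in codes.items())
    sorted_codes = [code for code, _ in pairs]
    names = [name for _, name in pairs]
    return sorted_codes, names


def query(code: int, sorted_codes: list, names: list) -> list:
    """Pre-screen by shared leading bits, then verify by Hamming distance."""
    shift = CODE_BITS - PREFIX_BITS
    low = (code >> shift) << shift        # smallest code with the same prefix
    high = low | ((1 << shift) - 1)       # largest code with the same prefix
    start = bisect_left(sorted_codes, low)
    stop = bisect_right(sorted_codes, high)
    # Only the pre-screened slice is checked bit by bit.
    return [
        names[i]
        for i in range(start, stop)
        if hamming(code, sorted_codes[i]) <= MAX_HAMMING
    ]


# Toy codes standing in for encoder output (hypothetical values).
toy_codes = {
    "P1": 0b1010101000001111,
    "P2": 0b1010101000001101,
    "P3": 0b1010101011110000,
    "P4": 0b0101010100001111,
}
sorted_codes, names = build_index(toy_codes)
hits = query(0b1010101000001111, sorted_codes, names)
```

Treating each code as an integer is what makes the sorted array usable: the two `bisect` calls locate the candidate slice in logarithmic time, and only that slice is passed to the Hamming check.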

CONCLUSIONS

The experimental results confirmed that DHL-PPI is feasible and effective. Using a dataset with strictly negative PPI examples of four species, DHL-PPI is shown to be superior or competitive when compared to the other state-of-the-art methods in terms of precision, recall or F1 score. Furthermore, in the prediction stage, the proposed DHL-PPI reduced the time complexity from [Formula: see text] to [Formula: see text] for performing an all-against-all PPI prediction for a database with M proteins. With the proposed approach, a protein database can be preprocessed and stored for later search using the proposed encoding scheme. This can provide a more efficient way to cope with the rapidly increasing volume of protein datasets.
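The remark that a database can be preprocessed and stored for later search can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not specify a storage format, so the JSON file layout and the helper names `save_index` and `load_index` are hypothetical, and the integer codes are again assumed to come from an already trained encoder.

```python
import json
import os
import tempfile


def save_index(path: str, codes: dict) -> None:
    """Sort (code, name) pairs once and write them to disk as JSON."""
    entries = sorted((code, name) for name, code in codes.items())
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(entries, handle)


def load_index(path: str) -> list:
    """Read the presorted entries back; JSON arrays are turned into tuples."""
    with open(path, encoding="utf-8") as handle:
        entries = json.load(handle)
    return [(code, name) for code, name in entries]


# Round trip with toy data (hypothetical protein names and codes).
with tempfile.TemporaryDirectory() as folder:
    index_path = os.path.join(folder, "ppi_index.json")
    save_index(index_path, {"A": 7, "B": 3})
    restored = load_index(index_path)
```

Because the entries are written in sorted order, a later session can load the file and run binary searches directly without encoding or sorting the database again.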

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2706/9264577/f4babc3d7377/12859_2022_4811_Fig1_HTML.jpg
