Deep Collaborative Filtering for Prediction of Disease Genes.

Publication information

IEEE/ACM Trans Comput Biol Bioinform. 2020 Sep-Oct;17(5):1639-1647. doi: 10.1109/TCBB.2019.2907536. Epub 2019 Mar 26.

DOI: 10.1109/TCBB.2019.2907536
PMID: 30932845
Abstract

Accurate prioritization of potential disease genes is a fundamental challenge in biomedical research, and various algorithms have been developed to address it. Inductive Matrix Completion (IMC) is one of the most reliable models, owing to its well-established framework and its superior performance in predicting gene-disease associations. However, the IMC method does not hierarchically extract deep features, which might limit the quality of recovery. To address this, our Deep Collaborative Filtering (DCF) model introduces a deep learning architecture, which obtains high-level representations and handles the noise and outliers present in large-scale biological datasets, into the side information of genes. Further, because negative examples are lacking, we also apply a Positive-Unlabeled (PU) learning formulation to low-rank matrix completion. Our approach achieves substantially improved performance over other state-of-the-art methods on diseases from the Online Mendelian Inheritance in Man (OMIM) database. Our approach is 10 percent more efficient than standard IMC in detecting a true association, and significantly outperforms other alternatives in terms of the precision-recall metric at the top-k predictions. Moreover, we also validate on diseases with no previously known gene associations and on newly reported OMIM associations. The experimental results show that DCF remains satisfactory for ranking novel disease phenotypes as well as mining unexplored relationships. The source code and the data are available at https://github.com/xzenglab/DCF.
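
The ideas named in the abstract (bilinear IMC scoring, a PU-weighted objective, and precision-recall at the top-k predictions) can be illustrated with a minimal NumPy sketch. This is not the DCF code from the repository above: the deep feature extractor is omitted entirely, and the function names, the squared-error loss, and the weight `alpha` on unlabeled pairs are assumptions chosen for illustration, following a common way of writing PU matrix completion.

```python
import numpy as np

def imc_scores(X, Y, W, H):
    """Bilinear IMC score matrix: S[i, j] = x_i^T W H^T y_j.

    X holds gene features (one row per gene), Y holds disease features,
    and W, H are the low-rank factors being learned.
    """
    return X @ W @ H.T @ Y.T

def pu_weighted_loss(P, S, alpha):
    """Squared error where known positives weigh 1 and unlabeled pairs weigh alpha.

    Assumption: alpha < 1 expresses that a zero in P means "unlabeled",
    not "confirmed negative".
    """
    weights = np.where(P == 1, 1.0, alpha)
    return float(np.sum(weights * (P - S) ** 2))

def pu_gradients(P, X, Y, W, H, alpha):
    """Gradients of pu_weighted_loss with respect to W and H."""
    S = imc_scores(X, Y, W, H)
    weights = np.where(P == 1, 1.0, alpha)
    R = 2.0 * weights * (S - P)  # derivative of the loss w.r.t. each score
    grad_W = X.T @ R @ Y @ H
    grad_H = Y.T @ R.T @ X @ W
    return grad_W, grad_H

def precision_recall_at_k(scores, true_genes, k):
    """Precision and recall of the k highest-scoring genes for one disease."""
    top_k = np.argsort(-np.asarray(scores))[:k]
    hits = len(set(top_k.tolist()) & set(true_genes))
    return hits / k, hits / len(true_genes)
```

Under these assumptions, a training loop would repeatedly call `pu_gradients` and take small steps such as `W -= lr * grad_W` and `H -= lr * grad_H` (with `lr` a hypothetical step size), then rank genes for each disease by the corresponding column of `imc_scores` and evaluate the ranking with `precision_recall_at_k`.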

