Identifying disease genes by integrating multiple data sources.

Authors

Chen Bolin, Wang Jianxin, Li Min, Wu Fang-Xiang

Publication information

BMC Med Genomics. 2014;7 Suppl 2(Suppl 2):S2. doi: 10.1186/1755-8794-7-S2-S2. Epub 2014 Oct 22.

DOI:10.1186/1755-8794-7-S2-S2
PMID:25350511
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4243092/
Abstract

BACKGROUND

Multiple types of data are now available for identifying disease genes, including gene-disease associations, disease phenotype similarities, protein-protein interactions, pathways, and gene expression profiles. Integrating different kinds of biological data is believed to be an effective way to identify disease genes.

RESULTS

In this paper, we propose a multiple data integration method for identifying human disease genes, based on the theory of Markov random fields (MRF) and Bayesian analysis. The proposed method is flexible, in that it readily incorporates different kinds of data, and reliable in predicting candidate disease genes.
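To make the idea of an MRF over several data sources concrete, here is a minimal sketch of one common formulation: an auto-logistic model in which each network contributes the count of disease-labelled neighbours, and unknown labels are resampled by Gibbs sampling. The abstract does not specify the model, so everything here is an assumption for illustration: the function names, the toy genes, the two toy networks, and the hand-picked parameters `alpha` and `betas`. It should not be read as the authors' algorithm.

```python
import math
import random


def conditional_prob(gene, labels, networks, alpha, betas):
    """P(gene is a disease gene | neighbour labels) under a toy auto-logistic MRF.

    Each network adds beta * (number of neighbours currently labelled 1).
    """
    logit = alpha
    for adjacency, beta in zip(networks, betas):
        logit += beta * sum(labels[n] for n in adjacency.get(gene, ()))
    return 1.0 / (1.0 + math.exp(-logit))


def gibbs_scores(genes, known, networks, alpha, betas,
                 sweeps=200, burn_in=50, seed=0):
    """Average conditional probability of each unknown gene over Gibbs sweeps.

    Known disease genes stay fixed at label 1; all other genes start at 0.
    """
    rng = random.Random(seed)
    labels = {g: 1 if g in known else 0 for g in genes}
    unknown = [g for g in genes if g not in known]
    totals = {g: 0.0 for g in unknown}
    for sweep in range(sweeps):
        for g in unknown:
            p = conditional_prob(g, labels, networks, alpha, betas)
            labels[g] = 1 if rng.random() < p else 0
            if sweep >= burn_in:
                totals[g] += p
    kept = sweeps - burn_in
    return {g: totals[g] / kept for g in unknown}


# Hypothetical toy data: four genes, one known disease gene, two networks.
genes = ["A", "B", "C", "D"]
known = {"A"}
ppi = {"A": ["B"], "B": ["A", "C"], "C": ["B"], "D": []}
pathway = {"A": ["B"], "B": ["A"]}

scores = gibbs_scores(genes, known, [ppi, pathway], alpha=-2.0, betas=[1.0, 1.0])
ranking = sorted(scores, key=scores.get, reverse=True)
```

In this toy setup, adding another data source only means appending one more adjacency map and one more weight, which is the kind of flexibility the abstract refers to. In practice the weights would be estimated from data rather than fixed by hand.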

CONCLUSIONS

Numerical experiments are carried out by integrating known gene-disease associations, protein complexes, protein-protein interactions, pathways, and gene expression profiles. Predictions are evaluated by the leave-one-out method. The proposed method achieves an AUC score of 0.743 when all of these biological data are integrated.
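The evaluation described above can be sketched as follows. This is one common variant of leave-one-out validation, not necessarily the protocol used in the paper: each known disease gene is hidden in turn and scored from the remaining known genes, control genes are scored against the full known set, and the AUC is the probability that a held-out disease gene outscores a control, with ties counting one half. The network, the gene names, and the neighbour-count scoring function are all hypothetical.

```python
def auc(pos_scores, neg_scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties contribute 0.5, matching the Mann-Whitney U formulation.
    """
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def leave_one_out_auc(known, controls, score_fn):
    """Hide each known gene in turn, score it, and compare against controls."""
    pos = [score_fn(g, known - {g}) for g in known]
    neg = [score_fn(g, known) for g in controls]
    return auc(pos, neg)


# Hypothetical toy network: gene -> set of interacting genes.
network = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B"},
    "D": {"A", "B"},
    "E": set(),
    "F": {"E"},
}


def neighbour_count(gene, training):
    """Toy score: how many of the gene's neighbours are known disease genes."""
    return len(network[gene] & training)


known_genes = {"A", "B", "C"}
control_genes = ["D", "E", "F"]
result = leave_one_out_auc(known_genes, control_genes, neighbour_count)
```

Any scoring function with the same `(gene, training_set)` signature can be dropped in, so the same harness could be used to compare a single data source against an integrated model.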

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ae6e/4243092/31759d1f1bf9/1755-8794-7-S2-S2-5.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ae6e/4243092/376f898da78a/1755-8794-7-S2-S2-1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ae6e/4243092/646672675f34/1755-8794-7-S2-S2-2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ae6e/4243092/ff7ff9b54b4d/1755-8794-7-S2-S2-3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ae6e/4243092/62d609250eb0/1755-8794-7-S2-S2-4.jpg

