DomainRBF: a Bayesian regression approach to the prioritization of candidate domains for complex diseases.

Authors

Zhang Wangshu, Chen Yong, Sun Fengzhu, Jiang Rui

Affiliation

MOE Key Laboratory of Bioinformatics and Bioinformatics Division, TNLIST/Department of Automation, Tsinghua University, Beijing, China.

Publication details

BMC Syst Biol. 2011 Apr 19;5:55. doi: 10.1186/1752-0509-5-55.

DOI: 10.1186/1752-0509-5-55
PMID: 21504591
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3108930/

Abstract

BACKGROUND

Domains are basic units of proteins, and thus exploring associations between protein domains and human inherited diseases will greatly improve our understanding of the pathogenesis of human complex diseases and further benefit the medical prevention, diagnosis and treatment of these diseases. Within a given domain-domain interaction network, we make the assumption that similarities of disease phenotypes can be explained using proximities of domains associated with such diseases. Based on this assumption, we propose a Bayesian regression approach named "domainRBF" (domain Rank with Bayes Factor) to prioritize candidate domains for human complex diseases.
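
The working assumption (phenotype similarity tracks domain proximity in the network) can be made concrete with a toy sketch. Everything in it is hypothetical: the network, the domain and disease names, the similarity values, and the proximity kernel `exp(-d)` over shortest-path distance `d`. The abstract does not give the regression model, so a plain Pearson correlation is used here as a simple stand-in for the Bayes factor that domainRBF actually computes.

```python
from collections import deque
from math import exp, sqrt

# Hypothetical toy domain-domain interaction network (undirected adjacency lists).
NETWORK = {
    "D1": ["D2"],
    "D2": ["D1", "D3"],
    "D3": ["D2", "D4"],
    "D4": ["D3", "D5"],
    "D5": ["D4"],
}

# Hypothetical known associations: disease -> associated domains.
KNOWN = {"diseaseA": ["D1"], "diseaseB": ["D3"], "diseaseC": ["D5"]}

# Hypothetical phenotype similarity between the query disease and each known disease.
QUERY_SIMILARITY = {"diseaseA": 0.9, "diseaseB": 0.5, "diseaseC": 0.1}


def shortest_path_lengths(network, source):
    """Breadth-first search distances from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in network[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def proximity_profile(network, candidate, known):
    """Proximity of a candidate domain to the domains of each known disease.

    Assumed kernel: exp(-d), with d the shortest distance to any associated domain.
    """
    dist = shortest_path_lengths(network, candidate)
    profile = {}
    for disease, domains in known.items():
        reachable = [dist[d] for d in domains if d in dist]
        profile[disease] = exp(-min(reachable)) if reachable else 0.0
    return profile


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def rank_candidates(network, candidates, known, similarity):
    """Score each candidate by how well its proximity profile explains similarity."""
    diseases = sorted(known)
    target = [similarity[d] for d in diseases]
    scores = {}
    for candidate in candidates:
        profile = proximity_profile(network, candidate, known)
        scores[candidate] = pearson([profile[d] for d in diseases], target)
    ranking = sorted(candidates, key=lambda c: scores[c], reverse=True)
    return ranking, scores


ranking, scores = rank_candidates(NETWORK, ["D2", "D4"], KNOWN, QUERY_SIMILARITY)
```

In this toy setup the candidate that sits close to the domains of phenotypically similar diseases receives the higher score, which is the intuition behind the ranking; the real method replaces the correlation with a Bayesian regression and a Bayes factor.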

RESULTS

Using a compiled dataset containing 1,614 associations between 671 domains and 1,145 disease phenotypes, we demonstrate the effectiveness of the proposed approach through three large-scale leave-one-out cross-validation experiments (random control, simulated linkage interval, and genome-wide scan), evaluated by three criteria (precision, mean rank ratio, and AUC score). We further show, through a series of permutation tests, that the proposed approach is robust to the parameters involved and to the underlying domain-domain interaction network. Having assessed the validity of this approach, we show the possibility of ab initio inference of domain-disease associations and gene-disease associations, and we illustrate the strong agreement between our inferences and the evidence from genome-wide association studies for four common diseases (type 1 diabetes, type 2 diabetes, Crohn's disease, and breast cancer). Finally, we provide a pre-calculated genome-wide landscape of associations between 5,490 protein domains and 5,080 human diseases and offer free access to this resource.
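
The three evaluation criteria can be illustrated with a short sketch. The abstract does not define them, so the definitions below are assumptions: in each leave-one-out run one known domain is held out and ranked against control candidates; precision is the fraction of runs in which the held-out domain ranks first; the rank ratio is its rank divided by the number of candidates; and AUC is computed per run as the fraction of controls ranked below the held-out domain, then averaged. The paper may pool runs differently. Function names and the example scores are hypothetical.

```python
def rank_of_held_out(scores, held_out):
    """1-based rank of held_out under descending score; ties get the worst rank."""
    target = scores[held_out]
    return sum(1 for s in scores.values() if s >= target)


def evaluate(runs):
    """Summarise leave-one-out runs.

    runs: list of (scores, held_out) pairs, where scores maps every candidate
    (the held-out domain plus at least one control) to its score.
    """
    top_ranked = 0
    rank_ratios = []
    aucs = []
    for scores, held_out in runs:
        n_candidates = len(scores)
        if n_candidates < 2:
            raise ValueError("each run needs the held-out domain and a control")
        rank = rank_of_held_out(scores, held_out)
        top_ranked += rank == 1
        rank_ratios.append(rank / n_candidates)
        # Fraction of controls ranked strictly below the held-out domain.
        aucs.append((n_candidates - rank) / (n_candidates - 1))
    n_runs = len(runs)
    return {
        "precision": top_ranked / n_runs,
        "mean_rank_ratio": sum(rank_ratios) / n_runs,
        "auc": sum(aucs) / n_runs,
    }


# Two hypothetical validation runs with four candidates each.
example_runs = [
    ({"a": 0.9, "b": 0.1, "c": 0.5, "d": 0.2}, "a"),  # held-out domain ranks 1st
    ({"a": 0.3, "b": 0.8, "c": 0.5, "d": 0.1}, "c"),  # held-out domain ranks 2nd
]
metrics = evaluate(example_runs)
```

A lower mean rank ratio and a higher precision and AUC indicate better prioritization under these definitions.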

CONCLUSIONS

The proposed approach effectively ranks susceptible domains near the top of the candidate list, and it is robust to the parameters involved. The ab initio inference of domain-disease associations shows strong agreement with the evidence provided by genome-wide association studies. The predicted landscape provides a comprehensive view of associations between domains and human diseases.
