DomainRBF: a Bayesian regression approach to the prioritization of candidate domains for complex diseases.

Author information

Zhang Wangshu, Chen Yong, Sun Fengzhu, Jiang Rui

Affiliation

MOE Key Laboratory of Bioinformatics and Bioinformatics Division, TNLIST/Department of Automation, Tsinghua University, Beijing, China.

Publication information

BMC Syst Biol. 2011 Apr 19;5:55. doi: 10.1186/1752-0509-5-55.

PMID:21504591

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3108930/

Abstract

BACKGROUND

Domains are basic units of proteins, and thus exploring associations between protein domains and human inherited diseases will greatly improve our understanding of the pathogenesis of human complex diseases and further benefit the medical prevention, diagnosis and treatment of these diseases. Within a given domain-domain interaction network, we make the assumption that similarities of disease phenotypes can be explained using proximities of domains associated with such diseases. Based on this assumption, we propose a Bayesian regression approach named "domainRBF" (domain Rank with Bayes Factor) to prioritize candidate domains for human complex diseases.

RESULTS

Using a compiled dataset containing 1,614 associations between 671 domains and 1,145 disease phenotypes, we demonstrate the effectiveness of the proposed approach through three large-scale leave-one-out cross-validation experiments (random control, simulated linkage interval, and genome-wide scan), evaluated by three criteria (precision, mean rank ratio, and AUC score). Through a series of permutation tests, we further show that the proposed approach is robust to the parameters involved and to the underlying domain-domain interaction network. Having assessed the validity of this approach, we show the possibility of ab initio inference of domain-disease associations and gene-disease associations, and we illustrate the strong agreement between our inferences and the evidence from genome-wide association studies for four common diseases (type 1 diabetes, type 2 diabetes, Crohn's disease, and breast cancer). Finally, we provide a pre-calculated genome-wide landscape of associations between 5,490 protein domains and 5,080 human diseases and offer free access to this resource.

CONCLUSIONS

The proposed approach effectively ranks susceptible domains near the top of the candidate list, and it is robust to the parameters involved. The ab initio inference of domain-disease associations shows strong agreement with the evidence provided by genome-wide association studies. The predicted landscape provides a comprehensive view of associations between domains and human diseases.
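The three evaluation criteria named in RESULTS can be illustrated with a small sketch. The abstract does not define them, so the definitions below are assumptions made for illustration: precision is taken as the fraction of leave-one-out runs in which the held-out domain is ranked first, mean rank ratio as the average of rank divided by the number of candidates, and AUC as the average fraction of control candidates ranked below the held-out domain. The function names and the toy scores are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Sequence, Tuple


def rank_of_held_out(scores: Sequence[float], held_out: int) -> int:
    """1-based rank of the held-out candidate under descending score.

    Ties are resolved pessimistically: every candidate with an equal
    score is counted as ranked ahead of the held-out one.
    """
    target = scores[held_out]
    return sum(1 for s in scores if s >= target)


def summarize(runs: List[Tuple[Sequence[float], int]]) -> Dict[str, float]:
    """Summarize leave-one-out runs.

    Each run is (scores for all candidates, index of the held-out domain).
    Every run is assumed to contain at least two candidates.
    """
    ranks = [rank_of_held_out(scores, idx) for scores, idx in runs]
    sizes = [len(scores) for scores, _ in runs]
    n_runs = len(runs)

    # Assumed definition: held-out domain ranked first.
    precision = sum(r == 1 for r in ranks) / n_runs
    # Assumed definition: rank normalized by candidate-list length.
    mean_rank_ratio = sum(r / n for r, n in zip(ranks, sizes)) / n_runs
    # With one positive per run, the per-run AUC is the fraction of
    # controls ranked below it; the runs are then averaged.
    auc = sum((n - r) / (n - 1) for r, n in zip(ranks, sizes)) / n_runs

    return {
        "precision": precision,
        "mean_rank_ratio": mean_rank_ratio,
        "auc": auc,
    }


# Toy scores, invented purely for illustration.
toy_runs = [
    ([0.9, 0.1, 0.5], 0),       # held-out domain ranked 1 of 3
    ([0.2, 0.8, 0.4, 0.6], 2),  # held-out domain ranked 3 of 4
]
result = summarize(toy_runs)
```

Under these assumed definitions, a lower mean rank ratio and a higher AUC both indicate that held-out domains tend to sit nearer the top of the candidate list.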
