pBRIT: gene prioritization by correlating functional and phenotypic annotations through integrative data fusion.

Author affiliations

Center of Medical Genetics, University of Antwerp and Antwerp University Hospital, Antwerp, Belgium.

Biomedical Informatics Research Network Antwerp (biomina), University of Antwerp, Antwerp, Belgium.

Publication information

Bioinformatics. 2018 Jul 1;34(13):2254-2262. doi: 10.1093/bioinformatics/bty079.

DOI: 10.1093/bioinformatics/bty079
PMID: 29452392
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6022555/
Abstract

MOTIVATION

Computational gene prioritization can aid in disease gene identification. Here, we propose pBRIT (prioritization using Bayesian Ridge regression and Information Theoretic model), a novel adaptive and scalable prioritization tool that integrates PubMed abstracts, Gene Ontology, sequence similarities, Mammalian and Human Phenotype Ontology, pathways, interactions, Disease Ontology, Gene Association database and Human Genome Epidemiology database into the prediction model. We explore and address the effects of sparsity and inter-feature dependencies within annotation sources, and the impact of bias towards specific annotations.

RESULTS

pBRIT models feature dependencies and sparsity with an information-theoretic (data-driven) approach and applies data fusion based on intermediate integration. Following the hypothesis that genes underlying similar diseases will share functional and phenotype characteristics, it incorporates Bayesian Ridge regression to learn a linear mapping between functional and phenotype annotations. Genes are prioritized on phenotypic concordance to the training genes. We evaluated pBRIT against nine existing methods, and on over 2000 HPO-gene associations retrieved after construction of the pBRIT data sources. We achieve maximum AUC scores ranging from 0.92 to 0.96 against benchmark datasets and 0.80 against the time-stamped HPO entries, indicating good performance with high sensitivity and specificity. Our model shows stable performance with regard to changes in the underlying annotation data, and is fast and scalable for implementation in routine pipelines.
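
The AUC scores reported above summarize how well a ranking places known disease genes ahead of other candidates. As an illustration only, the following minimal sketch computes AUC from a list of prioritization scores using the rank-based (Mann-Whitney) formulation. It is not the evaluation code used by pBRIT; the function name `ranking_auc`, the scores, and the labels are hypothetical and chosen for the example.

```python
def ranking_auc(scores, is_disease_gene):
    """AUC as the probability that a random positive outranks a random negative.

    Ties count as half a win (Mann-Whitney U formulation).
    """
    positives = [s for s, y in zip(scores, is_disease_gene) if y]
    negatives = [s for s, y in zip(scores, is_disease_gene) if not y]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative gene")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0      # positive gene ranked above negative gene
            elif p == n:
                wins += 0.5      # tie: count as half
    return wins / (len(positives) * len(negatives))


# Hypothetical prioritization scores for six candidate genes (higher = better)
# and hypothetical labels marking which candidates are true disease genes.
scores = [0.91, 0.80, 0.75, 0.40, 0.35, 0.10]
labels = [True, False, True, False, True, False]

auc = ranking_auc(scores, labels)
print(f"AUC = {auc:.3f}")
```

In this toy ranking, six of the nine positive-negative pairs are ordered correctly, so the AUC is 6/9. An AUC of 0.5 corresponds to a random ordering and 1.0 to a ranking that places every disease gene above every non-disease gene.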

AVAILABILITY AND IMPLEMENTATION

http://biomina.be/apps/pbrit/; https://bitbucket.org/medgenua/pbrit.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

Figures (images hosted by PMC)

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f7c8/6022555/289d8bf67abf/bty079f1.jpg
Figure 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f7c8/6022555/b49d00d94240/bty079f2.jpg
Figure 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f7c8/6022555/4acd60db5e4c/bty079f3.jpg
Figure 4: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f7c8/6022555/e99804e68c78/bty079f4.jpg
Figure 5: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f7c8/6022555/1e7198a4aab3/bty079f5.jpg
