Combining evidence of preferential gene-tissue relationships from multiple sources.

Affiliation

Department of Medical Biochemistry and Biophysics, Karolinska Institute, Stockholm, Sweden.

Publication information

PLoS One. 2013 Aug 12;8(8):e70568. doi: 10.1371/journal.pone.0070568. eCollection 2013.

Abstract

An important challenge in drug discovery and disease prognosis is to predict genes that are preferentially expressed in one or a few tissues, i.e. genes showing considerably higher expression in one or a few tissues than in the others. Although several data sources and methods have been published explicitly for this purpose, they often disagree, and it is not evident how to retrieve these genes or how to distinguish true biological findings from those that are due to the choice of method and/or experimental settings. In this work we have developed a computational approach that combines results from multiple methods and datasets, with the aim of eliminating method- and study-specific biases and improving the prediction of preferentially expressed human genes. A rule-based score is used to merge the results and assign support to them. Five sets of genes with known tissue specificity were used for parameter pruning and cross-validation. In total we identify 3434 tissue-specific genes. We compare the highest-scoring genes with three public databases: PaGenBase (microarray), TiGER (EST) and HPA (protein expression data). The results overlap 85% with PaGenBase, 71% with TiGER and only 28% with HPA; 99% of our predictions have support from at least one of these databases. Our approach also performs better than any of the databases at identifying drug targets and biomarkers with known tissue specificity.
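The general idea of merging calls from several sources and then checking overlap against a reference database can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the actual scoring rules, so the sketch substitutes a plain support count (how many sources report the same gene-tissue pair). The function names, the `min_support` threshold, and the toy gene and source labels are all hypothetical.

```python
from collections import defaultdict


def combine_evidence(calls_by_source, min_support=2):
    """Count how many sources report each (gene, tissue) pair.

    Assumption: a simple support count stands in for the paper's
    rule-based score, whose rules are not given in the abstract.
    """
    support = defaultdict(set)
    for source, calls in calls_by_source.items():
        for gene, tissue in calls:
            support[(gene, tissue)].add(source)
    # Keep only pairs reported by at least `min_support` distinct sources.
    return {
        pair: len(sources)
        for pair, sources in support.items()
        if len(sources) >= min_support
    }


def overlap_fraction(predicted, reference):
    """Fraction of predicted pairs that also appear in a reference set."""
    predicted = set(predicted)
    if not predicted:
        return 0.0
    return len(predicted & set(reference)) / len(predicted)


# Toy input with placeholder names (not real data).
calls = {
    "source_1": [("GENE_A", "liver"), ("GENE_B", "heart")],
    "source_2": [("GENE_A", "liver"), ("GENE_C", "brain")],
    "source_3": [("GENE_A", "liver"), ("GENE_B", "heart"), ("GENE_C", "kidney")],
}
combined = combine_evidence(calls, min_support=2)
reference_db = [("GENE_A", "liver")]
fraction = overlap_fraction(combined, reference_db)
```

With this toy input, `GENE_C` is dropped because its two sources disagree on the tissue, which is the kind of source-specific call that combining evidence is meant to filter out.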

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9f96/3741196/854f27dc2f23/pone.0070568.g001.jpg
