GIFtS: annotation landscape analysis with GeneCards.

Affiliation

Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, Israel.

Publication information

BMC Bioinformatics. 2009 Oct 23;10:348. doi: 10.1186/1471-2105-10-348.

Abstract

BACKGROUND

Gene annotation is a pivotal component in computational genomics, encompassing prediction of gene function, expression analysis, and sequence scrutiny. Hence, quantitative measures of the annotation landscape constitute a pertinent bioinformatics tool. GeneCards is a gene-centric compendium of rich annotative information for over 50,000 human gene entries, building upon 68 data sources, including Gene Ontology (GO), pathways, interactions, phenotypes, publications and many more.

RESULTS

We present the GeneCards Inferred Functionality Score (GIFtS), which allows a quantitative assessment of a gene's annotation status by exploiting the unique wealth and diversity of GeneCards information. The GIFtS tool, linked from the GeneCards home page, facilitates browsing the human genome by searching for the annotation level of a specified gene, retrieving a list of genes within a specified range of GIFtS values, obtaining random genes with a specific GIFtS value, and experimenting with the GIFtS weighting algorithm for a variety of annotation categories. The bimodal shape of the GIFtS distribution suggests a division of the human gene repertoire into two main groups: the high-GIFtS peak consists almost entirely of protein-coding genes; the low-GIFtS peak consists of genes from all of the categories. Cluster analysis of GIFtS annotation vectors classifies gene groups by their detailed positioning in the annotation arena. GIFtS also provides measures which enable the evaluation of the databases that serve as GeneCards sources. An inverse correlation is found (for GIFtS>25) between the number of genes annotated by each source and the average GIFtS value of genes associated with that source. Three typical source prototypes are revealed by their GIFtS distribution: genome-wide sources, sources comprising mainly highly annotated genes, and sources comprising mainly poorly annotated genes. The degree of accumulated knowledge for a given gene, as measured by GIFtS, was correlated (for GIFtS>30) with the number of publications for the gene, and with the seniority of its entry in the HGNC database.
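As a rough illustration of the idea of a weighted annotation score, the sketch below computes a GIFtS-like value as the weighted percentage of sources that hold information about a gene, and then retrieves genes within a score range. This is only an assumption made for illustration: the abstract does not give the actual GIFtS formula, and the function names, source names, gene names, and weights here are all hypothetical.

```python
from typing import Dict, List, Mapping, Set


def gifts_like_score(gene_sources: Set[str], weights: Mapping[str, float]) -> float:
    """Weighted percentage (0 to 100) of sources that annotate a gene.

    Assumption: this is NOT the published GIFtS formula, only a toy stand-in.
    """
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    covered = sum(w for source, w in weights.items() if source in gene_sources)
    return 100.0 * covered / total


def genes_in_range(scores: Mapping[str, float], low: float, high: float) -> List[str]:
    """Return gene names whose score lies in [low, high], sorted by name."""
    return sorted(g for g, s in scores.items() if low <= s <= high)


# Hypothetical sources and genes; equal weights by default.
weights: Dict[str, float] = {
    "GO": 1.0,
    "pathways": 1.0,
    "interactions": 1.0,
    "publications": 1.0,
}
annotations: Dict[str, Set[str]] = {
    "GENE_A": {"GO", "pathways", "interactions", "publications"},
    "GENE_B": {"publications"},
}

scores = {gene: gifts_like_score(srcs, weights) for gene, srcs in annotations.items()}
poorly_annotated = genes_in_range(scores, 0.0, 30.0)
```

Changing the entries of `weights` mimics experimenting with the weighting of annotation categories: a source with a larger weight contributes more to the score of every gene it covers.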

CONCLUSION

GIFtS can be a valuable tool for computational procedures that analyze large sets of genes resulting from wet-lab or computational research. GIFtS may also assist the scientific community in identifying groups of uncharacterized genes for diverse applications, such as delineation of novel functions and charting unexplored areas of the human genome.
