Application of gene ontology to gene identification.

Author information

Bastos Hugo P, Tavares Bruno, Pesquita Catia, Faria Daniel, Couto Francisco M

Affiliation

Department of Informatics, Faculty of Sciences, University of Lisbon, Lisbon, Portugal.

Publication information

Methods Mol Biol. 2011;760:141-57. doi: 10.1007/978-1-61779-176-5_9.

Abstract

Candidate gene identification deals with associating genes with underlying biological phenomena, such as diseases and specific disorders. It has been shown that classes of diseases with similar phenotypes are caused by functionally related genes. Currently, a fair amount of knowledge about functional characterization can be found across several public databases; however, functional descriptors can be ambiguous, domain specific, and context dependent. To cope with these issues, the Gene Ontology (GO) project developed a bio-ontology of broad scope and wide applicability. The structured and controlled vocabulary of terms provided by the GO project, which describes the biological roles of gene products, can therefore be very helpful in candidate gene identification approaches. The method presented here uses GO annotation data to identify the most meaningful functional aspects occurring in a given set of related gene products. The method measures this meaningfulness by calculating an e-value based on the frequency of annotation of each GO term in the set of gene products versus the total frequency of annotation. Then, after the user selects a GO term related to the underlying biological phenomenon being studied, the method uses semantic similarity to rank the given gene products that are annotated to the term. This enables the user to further narrow down the list of gene products and identify those that are more likely to be of interest.
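
The two steps described in the abstract (scoring GO terms by an e-value, then ranking the gene products annotated to a chosen term) can be illustrated with a minimal Python sketch. The abstract does not specify the statistical test or the semantic similarity measure, so everything concrete here is an assumption: the e-value is taken to be a hypergeometric tail probability multiplied by the number of terms tested, and a plain Jaccard index over annotation sets stands in for a true GO semantic similarity measure. The function names, the term labels (T1, T2, T3), and the gene product labels (g1 to g10) are hypothetical placeholders, not real GO identifiers or the authors' implementation.

```python
from math import comb


def hypergeom_tail(k, n, K, N):
    """P(X >= k) when n items are drawn from N, of which K carry the term."""
    upper = min(n, K)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)


def term_e_values(study, background):
    """Score each term seen in the study set (assumed scoring scheme).

    Both arguments map a gene product to its set of term labels, and the
    study set is assumed to be a subset of the background.
    """
    N, n = len(background), len(study)
    terms = set().union(*study.values())
    e_values = {}
    for term in terms:
        k = sum(term in anns for anns in study.values())
        K = sum(term in anns for anns in background.values())
        # Assumption: e-value = tail probability * number of terms tested.
        e_values[term] = hypergeom_tail(k, n, K, N) * len(terms)
    return e_values


def rank_by_similarity(study, term):
    """Rank products annotated to `term` by mean Jaccard similarity to the
    other annotated products (a stand-in for GO semantic similarity)."""
    members = [g for g, anns in study.items() if term in anns]

    def jaccard(a, b):
        return len(a & b) / len(a | b)

    scores = {}
    for g in members:
        others = [h for h in members if h != g]
        if others:
            total = sum(jaccard(study[g], study[h]) for h in others)
            scores[g] = total / len(others)
        else:
            scores[g] = 0.0
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy data: ten background gene products, three of them in the study set.
background = {f"g{i}": {"T2"} for i in range(4, 11)}
study = {
    "g1": {"T1", "T2", "T3"},
    "g2": {"T1", "T2", "T3"},
    "g3": {"T1", "T2"},
}
background.update(study)

e_values = term_e_values(study, background)
ranked = rank_by_similarity(study, "T1")

for term, e in sorted(e_values.items(), key=lambda item: item[1]):
    print(term, e)
print(ranked)
```

In this toy data, T2 is annotated to every background product, so it receives a high e-value and would be considered uninformative, while T1 occurs only in the study set and scores lowest. A real analysis would read annotations from GO annotation files and replace the Jaccard index with an ontology-aware similarity measure.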
