University ‘Magna Graecia’ of Catanzaro, Italy.
Brief Bioinform. 2012 Sep;13(5):569-85. doi: 10.1093/bib/bbr066. Epub 2011 Dec 2.
The integration of proteomics data with biological knowledge is a recent trend in bioinformatics. A large amount of biological information is available, spread across different sources and encoded in different ontologies (e.g. Gene Ontology). Annotating existing protein data with biological information may enable the use (and the development) of algorithms that use biological ontologies as a framework to mine annotated data. Recently, many methodologies and algorithms that use ontologies to extract knowledge from data, as well as to analyse ontologies themselves, have been proposed and applied in other fields. In contrast, the use of such annotations for the analysis of protein data is a relatively novel research area that is becoming increasingly central. Existing approaches range from the definition of similarity among genes and proteins on the basis of their annotating terms to the definition of novel algorithms that use such similarities to mine protein data on a proteome-wide scale. This work, after defining the main concepts of such analysis, presents a systematic discussion and comparison of the main approaches. Finally, remaining challenges and possible future directions of research are presented.