Vashisht Rohit, Bhardwaj Anshu, Brahmachari Samir K
CSIR-Open Source Drug Discovery Unit, New Delhi, India.
Mol Biosyst. 2013 Jul;9(7):1584-93. doi: 10.1039/c3mb25546h. Epub 2013 Apr 29.
Contextualizing relevant information to construct a network that represents a given biological process is a fundamental challenge in the network science of biology. The quality of the network for an organism of interest depends critically on the extent of functional annotation of its genome. Most automated annotation pipelines do not account for the unstructured information present in volumes of literature, and hence a large fraction of the genome remains poorly annotated. If used, this information could substantially enhance the functional annotation of a genome, aiding the development of a more comprehensive network. Mining unstructured information buried in the literature often requires extensive manual intervention and thus becomes a bottleneck for most automated pipelines. In this review, we discuss the potential of scientific social networking as a solution for the systematic manual mining of data. Using Mycobacterium tuberculosis as a case study, we discuss our open innovative approach to the functional annotation of its genome. Furthermore, we highlight the strength of such collated, structured data for drug target prediction based on systems-level analysis of the pathogen.