Huang Zhong-Xi, Tian Hui-Yong, Hu Zhen-Fu, Zhou Yi-Bo, Zhao Jin, Yao Kai-Tai
Cancer Institute, Southern Medical University, Guangzhou, 510515, PR China.
BMC Bioinformatics. 2008 Jul 13;9:308. doi: 10.1186/1471-2105-9-308.
Biomedical researchers often want to explore the pathogenesis and pathways regulated by abnormally expressed genes, such as those identified by microarray analyses. Literature mining is an important aid in this task, and many literature mining tools are now available. However, few of them allow users to make manual adjustments to zero in on what they specifically want to know.
We present our software program, GenCLiP (Gene Cluster with Literature Profiles), which is based on the method presented by Chaussabel and Sher (Genome Biol 2002, 3(10):RESEARCH0055) for searching gene lists to identify functional clusters of genes based on up-to-date literature profiling. Four features were added to this previously described method: the ability to 1) manually curate keywords extracted from the literature, 2) search for genes and gene co-occurrence networks related to custom keywords, 3) compare the results for the analyzed genes with negative and positive controls generated by GenCLiP, and 4) calculate the probability that the resulting genes and gene networks are randomly related. In this paper, we use a set of genes differentially expressed between keloids and normal controls to show how the functions implemented in GenCLiP successfully identified keywords related to the pathogenesis of keloids, as well as previously unknown gene pathways involved in that pathogenesis.
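As an illustration of the fourth feature, the probability that a set of keyword-related genes arose by chance can be estimated by comparing the analyzed gene list against randomly drawn gene sets of the same size. The sketch below is not GenCLiP's actual implementation, which the abstract does not describe in detail; it assumes a simple Monte Carlo estimate, and the function name `empirical_p_value`, the toy gene identifiers, and the keyword-to-gene mapping are all hypothetical.

```python
import random


def empirical_p_value(gene_list, background, keyword_genes, n_trials=1000, seed=0):
    """Estimate how often a random gene set matches a keyword at least as well.

    gene_list     -- genes under analysis (e.g. differentially expressed genes)
    background    -- all genes that could have been selected
    keyword_genes -- set of genes whose literature mentions the keyword
    Returns (observed match count, empirical p-value).
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    observed = sum(1 for gene in gene_list if gene in keyword_genes)

    at_least_as_extreme = 0
    for _ in range(n_trials):
        # Random control: a gene set of the same size drawn from the background
        sample = rng.sample(background, len(gene_list))
        count = sum(1 for gene in sample if gene in keyword_genes)
        if count >= observed:
            at_least_as_extreme += 1

    # Add-one correction so the estimate is never exactly zero
    p_value = (at_least_as_extreme + 1) / (n_trials + 1)
    return observed, p_value


# Toy data (entirely hypothetical): 1000 background genes, 50 linked to a keyword
background = [f"G{i}" for i in range(1000)]
keyword_genes = set(background[:50])

# A gene list in which 8 of 10 genes are linked to the keyword
enriched_list = background[:8] + background[500:502]
observed, p_value = empirical_p_value(enriched_list, background, keyword_genes)
print(observed, p_value)
```

A small p-value here suggests that the overlap between the gene list and the keyword is unlikely to be random; a gene list with no keyword-related genes would yield a p-value of 1 under this estimate.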
With regard to the identification of disease-susceptibility genes, GenCLiP allows one to quickly acquire a primary pathogenesis profile and identify pathways involving abnormally expressed genes not previously associated with the disease.