Christie Karen R, Blake Judith A
The Jackson Laboratory, 600 Main Street, Bar Harbor, ME 04609 USA.
Cilia. 2018 Apr 19;7:3. doi: 10.1186/s13630-018-0057-0. eCollection 2018.
Cilia are specialized, hair-like structures that project from the cell bodies of eukaryotic cells. As understanding of the distribution and functions of the various types of cilia grows, interest in these organelles is accelerating. To make effective use of this expanding knowledge, the information must be digitally accessible and available for large-scale analytical and computational investigation. The objective of this work is to capture and integrate knowledge about cilia into existing knowledge bases, thereby improving comparative genomic data analysis.
We focused on the capture of information about cilia as studied in the laboratory mouse, a primary model of human biology. The workflow developed establishes a standard for capture of comparative functional data relevant to human biology. We established the 310 closest mouse orthologs of the 302 human genes defined in the SYSCILIA Gold Standard set of ciliary genes. For the mouse genes, we identified biomedical literature for curation and used Gene Ontology (GO) curation paradigms to provide functional annotations from these publications.
Using a methodology for comprehensive capture of experimental data about cilia genes in structured, digital form, we established a workflow for curating experimental literature that details the molecular functions and roles of cilia proteins, starting with the mouse orthologs of the human SYSCILIA gene set. We worked closely with the GO Consortium ontology development editors and the SYSCILIA Consortium to improve the representation of ciliary biology within the GO. During the ontology improvement project, we fully curated 134 of these 310 mouse genes, increasing the number of ciliary and other experimental annotations.
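The curation progress reported above can be sketched as a small bookkeeping structure. This is a minimal illustration, not the authors' pipeline: the `GeneRecord` class, the helper functions, and the placeholder gene symbols are all hypothetical, and the evidence-code set lists only a subset of the codes that GO treats as experimental. The only figures taken from the text are the 310 mouse orthologs and the 134 fully curated genes.

```python
from dataclasses import dataclass, field

# Subset of GO evidence codes classed as experimental (not exhaustive).
EXPERIMENTAL_CODES = {"EXP", "IDA", "IPI", "IMP", "IGI", "IEP"}


@dataclass
class GeneRecord:
    """Hypothetical record for one mouse ortholog in the curation set."""
    symbol: str
    fully_curated: bool = False
    # Each annotation is a (GO term ID, evidence code) pair.
    annotations: list = field(default_factory=list)


def curation_coverage(records):
    """Return (curated, total, fraction curated) for a list of records."""
    total = len(records)
    curated = sum(1 for r in records if r.fully_curated)
    return curated, total, (curated / total if total else 0.0)


def count_experimental(records):
    """Count annotations whose evidence code is in the experimental subset."""
    return sum(
        1
        for r in records
        for _go_id, code in r.annotations
        if code in EXPERIMENTAL_CODES
    )


# Placeholder gene set: 310 orthologs, the first 134 marked fully curated.
# Symbols are invented; only the two counts come from the abstract.
records = [
    GeneRecord(symbol=f"placeholder_gene_{i}", fully_curated=(i < 134))
    for i in range(310)
]
# Illustrative annotation: GO:0005929 is the GO cellular component "cilium".
records[0].annotations.append(("GO:0005929", "IDA"))
# "IEA" (inferred from electronic annotation) is not experimental.
records[0].annotations.append(("GO:0005929", "IEA"))

curated, total, fraction = curation_coverage(records)
print(f"{curated}/{total} genes fully curated ({fraction:.1%})")
print(f"experimental annotations: {count_experimental(records)}")
```

With the counts from the text, the coverage works out to about 43.2% of the ortholog set.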
We have improved the GO annotations available for mouse genes orthologous to the human genes in the SYSCILIA Consortium's Gold Standard set. In addition, ciliary terminology in the GO itself was improved in collaboration with GO ontology developers and the SYSCILIA Consortium. These improvements to the GO terms for the functions and roles of ciliary proteins, along with the increase in annotations of the corresponding genes, enhance the representation of ciliary processes and localizations and improve access to these data during large-scale bioinformatic analyses.