European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, CB10 1SD, UK.
BMC Bioinformatics. 2017 Dec 21;18(Suppl 17):558. doi: 10.1186/s12859-017-1980-6.
Data-driven cell classification is becoming common and is now being implemented on a massive scale by projects such as the Human Cell Atlas. The scale of these efforts poses a challenge. How can the results be made searchable and accessible to biologists in general? How can they be related back to the rich classical knowledge of cell types, anatomy and development? How will data from the various types of single-cell analysis be made cross-searchable? Structured annotation with ontology terms provides a potential solution to these problems. In turn, there is great potential for using the outputs of data-driven cell classification to structure ontologies and integrate them with data-driven cell query systems.
Focusing on the mouse retina and the Drosophila olfactory system, I present worked examples illustrating how formalization of cell ontologies can enhance querying of data-driven cell classifications and how ontologies can be extended by integrating the outputs of data-driven cell classifications.
Annotation with ontology terms can play an important role in making data-driven classifications searchable and queryable, but fulfilling this potential requires standardized formal patterns for structuring ontologies and annotations and for linking ontologies to the outputs of data-driven classification.
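As a rough illustration of how ontology annotation can make data-driven classifications queryable, the sketch below annotates cell clusters with ontology terms and retrieves every cluster whose term falls under a more general class. It is a minimal sketch under stated assumptions, not the method of the paper: the `EX:` term identifiers, the cluster names, and the helpers `ancestors` and `clusters_for` are all hypothetical, and a real system would use terms from a curated cell ontology and a reasoner rather than a hand-written dictionary.

```python
from typing import Dict, List, Set

# Hypothetical toy ontology: each term maps to its direct is_a parents.
# The "EX:" identifiers are placeholders, not real ontology IDs.
IS_A: Dict[str, List[str]] = {
    "EX:neuron": [],
    "EX:retinal_bipolar_neuron": ["EX:neuron"],
    "EX:rod_bipolar_cell": ["EX:retinal_bipolar_neuron"],
    "EX:cone_bipolar_cell": ["EX:retinal_bipolar_neuron"],
    "EX:olfactory_receptor_neuron": ["EX:neuron"],
}

# Hypothetical annotations: data-driven cluster -> ontology term.
ANNOTATIONS: Dict[str, str] = {
    "retina_cluster_01": "EX:rod_bipolar_cell",
    "retina_cluster_02": "EX:cone_bipolar_cell",
    "antenna_cluster_07": "EX:olfactory_receptor_neuron",
}


def ancestors(term: str, is_a: Dict[str, List[str]]) -> Set[str]:
    """Return all transitive is_a ancestors of a term (excluding the term itself)."""
    seen: Set[str] = set()
    stack = list(is_a.get(term, []))
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(is_a.get(parent, []))
    return seen


def clusters_for(term: str,
                 annotations: Dict[str, str],
                 is_a: Dict[str, List[str]]) -> List[str]:
    """Return clusters annotated with the term or with any of its subclasses."""
    return sorted(
        cluster
        for cluster, annotated in annotations.items()
        if annotated == term or term in ancestors(annotated, is_a)
    )


if __name__ == "__main__":
    # Querying a general class returns clusters annotated with its subclasses.
    print(clusters_for("EX:retinal_bipolar_neuron", ANNOTATIONS, IS_A))
```

The point of the design is that each cluster is annotated once, with its most specific term, and broader queries are answered by traversing the class hierarchy. Clusters from different single-cell analyses become cross-searchable as long as they are annotated against the same ontology.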