Salis Constantinos, Papakonstantinou Eleni, Pierouli Katerina, Mitsis Athanasios, Basdeki Lia, Megalooikonomou Vasileios, Vlachakis Dimitrios, Hagidimitriou Marianna
Laboratory of Genetics, Department of Biotechnology, School of Food, Biotechnology and Development, Agricultural University of Athens, Athens, Greece.
Computer Engineering and Informatics Department, School of Engineering, University of Patras, Patras, Greece.
EMBnet J. 2019;24. doi: 10.14806/ej.24.0.922. Epub 2019 May 22.
In the big data era, conventional bioinformatics appears unable to manage the full extent of the available genomic information. This study focuses on the olive tree and on the collection and analysis of its genetic and genomic data, which are scattered across various repositories. Extra virgin olive oil is classified as a medical food, owing to its nutraceutical benefits and its protective properties against cancer, cardiovascular diseases, age-related diseases, neurodegenerative disorders, and many other diseases. Extensive studies have reported the benefits of olive oil for human health. However, the available data at the nucleotide sequence level are highly unstructured. To address this, we describe an approach that combines methods from data mining and machine learning pipelines with ontology classification and semantic annotation. Fusing and analysing all available olive tree data is a step of utmost importance in classifying and characterising the various cultivars, towards a comprehensive approach in the context of food safety and public health.