Rebholz-Schuhmann Dietrich, Grabmüller Christoph, Kavaliauskas Silvestras, Croset Samuel, Woollard Peter, Backofen Rolf, Filsell Wendy, Clark Dominic
European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK; Computerlinguistik, Universität Zürich, Binzmühlestrasse 14, 8050 Zürich, Switzerland.
European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK.
Drug Discov Today. 2014 Jul;19(7):882-9. doi: 10.1016/j.drudis.2013.10.024. Epub 2013 Nov 4.
In the Semantic Enrichment of the Scientific Literature (SESL) project, researchers from academia and from life science and publishing companies collaborated pre-competitively to integrate and share information on type 2 diabetes mellitus (T2DM) in adults. This case study demonstrates the benefits of semantic interoperability gained by integrating the scientific literature with biomedical data resources such as the UniProt Knowledgebase (UniProtKB) and the Gene Expression Atlas (GXA). We annotated scientific documents in a standardized way by applying public terminological resources for diseases and proteins, together with other text-mining approaches. Finally, we compared the genetic causes of T2DM across the data resources to demonstrate the benefits of the SESL triple store. Our solution enables publishers to distribute their content, with little overhead, into remote data infrastructures such as any Virtual Knowledge Broker.
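The cross-resource comparison described in the abstract can be illustrated with a minimal sketch. It is not the SESL implementation: the in-memory list of statements stands in for a triple store, and the gene identifiers (`GENE_A`, `GENE_B`, `GENE_C`), the `associated_with` predicate, and the function names are hypothetical placeholders rather than project data or vocabulary. Under those assumptions, the sketch groups disease-linked genes by the resource that asserts the link, then reports which genes all resources agree on and which appear in only one.

```python
from collections import defaultdict

# Hypothetical (subject, predicate, object, source) statements. The gene
# identifiers and the predicate name are placeholders, not SESL data.
TRIPLES = [
    ("GENE_A", "associated_with", "T2DM", "literature"),
    ("GENE_B", "associated_with", "T2DM", "literature"),
    ("GENE_A", "associated_with", "T2DM", "UniProtKB"),
    ("GENE_C", "associated_with", "T2DM", "UniProtKB"),
    ("GENE_A", "associated_with", "T2DM", "GXA"),
    ("GENE_B", "associated_with", "T2DM", "GXA"),
    ("GENE_B", "associated_with", "OTHER_DISEASE", "GXA"),
]


def genes_by_source(triples, disease, predicate="associated_with"):
    """Group the subjects linked to `disease` by the resource asserting the link."""
    grouped = defaultdict(set)
    for subject, pred, obj, source in triples:
        if pred == predicate and obj == disease:
            grouped[source].add(subject)
    return dict(grouped)


def compare_sources(grouped):
    """Return genes found in every resource, and genes unique to one resource."""
    all_sets = list(grouped.values())
    shared = set.intersection(*all_sets) if all_sets else set()
    unique = {}
    for source, genes in grouped.items():
        # Union of the gene sets asserted by every other resource.
        others = set().union(*(g for s, g in grouped.items() if s != source))
        unique[source] = genes - others
    return shared, unique


grouped = genes_by_source(TRIPLES, "T2DM")
shared, unique = compare_sources(grouped)
print("shared:", sorted(shared))
for source in sorted(unique):
    print(source, "only:", sorted(unique[source]))
```

In a real deployment the same question would more likely be posed as a query against the triple store itself; the plain-Python version is used here only to keep the example self-contained.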