Institute for Molecular Bioscience and ARC Centre of Excellence in Bioinformatics, The University of Queensland, Brisbane, QLD 4072, Australia.
J Chem Inf Model. 2010 May 24;50(5):732-41. doi: 10.1021/ci900461j.
A wide range of data on sequences, structures, pathways, and networks of genes and gene products is available for hypothesis testing and discovery in biological and biomedical research. However, data describing the physical, chemical, and biological properties of small molecules have not been well integrated with these resources. Semantically rich representations of chemical data, combined with Semantic Web technologies, have the potential to enable the integration of small molecule and biomolecular data resources, expanding the scope and power of biomedical and pharmacological research. We employed the Semantic Web technologies Resource Description Framework (RDF) and Web Ontology Language (OWL) to generate a Small Molecule Ontology (SMO) that represents concepts and provides unique identifiers for biologically relevant properties of small molecules and their interactions with biomolecules, such as proteins. We instantiated SMO with data from three public data sources (DrugBank, PubChem, and UniProt), which we converted to RDF triples. Evaluation of SMO using predetermined competency questions implemented as SPARQL queries demonstrated that data from chemical and biomolecular data sources were effectively represented and that useful knowledge could be extracted. These results illustrate the potential of Semantic Web technologies in chemical, biological, and pharmacological research and in drug discovery.
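As a rough illustration of how a competency question can be answered over RDF-style triples, the following minimal sketch stores a few subject-predicate-object statements in plain Python and matches patterns against them. It is not the authors' implementation and does not use the actual SMO vocabulary: every identifier (`ex:drugA`, `ex:hasTarget`, `ex:SmallMolecule`, and so on) and both helper functions are hypothetical names chosen for the example, and a real system would load the triples into an RDF store and query it with SPARQL.

```python
from typing import Iterable, Iterator, List, Optional, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Hypothetical triples. The "ex:" names are illustrative only and are
# not taken from SMO, DrugBank, PubChem, or UniProt.
TRIPLES: List[Triple] = [
    ("ex:drugA", "rdf:type", "ex:SmallMolecule"),
    ("ex:drugA", "ex:hasTarget", "ex:proteinX"),
    ("ex:drugB", "rdf:type", "ex:SmallMolecule"),
    ("ex:drugB", "ex:hasTarget", "ex:proteinY"),
    ("ex:proteinX", "rdf:type", "ex:Protein"),
    ("ex:proteinY", "rdf:type", "ex:Protein"),
]


def match(
    triples: Iterable[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> Iterator[Triple]:
    """Yield triples that agree with every term that is not None."""
    for triple in triples:
        if (
            (s is None or triple[0] == s)
            and (p is None or triple[1] == p)
            and (o is None or triple[2] == o)
        ):
            yield triple


def molecules_targeting(triples: List[Triple], protein: str) -> List[str]:
    """Competency question: which small molecules target this protein?

    Roughly equivalent SPARQL over the same hypothetical vocabulary:
        SELECT ?m WHERE {
            ?m rdf:type ex:SmallMolecule ;
               ex:hasTarget ex:proteinX .
        }
    """
    small_molecules = {
        subj for subj, _, _ in match(triples, p="rdf:type", o="ex:SmallMolecule")
    }
    return sorted(
        subj
        for subj, _, _ in match(triples, p="ex:hasTarget", o=protein)
        if subj in small_molecules
    )


if __name__ == "__main__":
    print(molecules_targeting(TRIPLES, "ex:proteinX"))
```

The query joins two triple patterns on a shared subject, which is the same shape a SPARQL basic graph pattern takes when it links a chemical entity to a biomolecular one.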