School of Library and Information Science, Indiana University, Bloomington, Indiana, United States of America.
PLoS One. 2011;6(12):e27506. doi: 10.1371/journal.pone.0027506. Epub 2011 Dec 6.
Much life science and biology research requires an understanding of complex relationships between biological entities (genes, compounds, pathways, diseases, and so on). A wealth of data on such relationships exists in publicly available datasets and publications, but these sources overlap and are distributed, so finding pertinent relational data is increasingly difficult. While most public datasets have associated search tools, there is a lack of search methods that can cross data sources and that search not only on the biological entities themselves but also on the relationships between them. In this paper, we demonstrate how graph-theoretic algorithms for mining relational paths can be used together with Chem2Bio2RDF, an integrative data resource we developed previously, to extract new biological insights about the relationships between such entities. Specifically, we use these methods to investigate the genetic basis of side-effects of thiazolidinedione drugs: we propose a hypothesis for the recently discovered cardiac side-effects of Rosiglitazone (Avandia) and make a prediction for Pioglitazone that is backed up by recent clinical studies.
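To make the idea of mining relational paths concrete, the following is a minimal sketch and not the method used in the paper, whose algorithms are not described in this abstract. It assumes a small in-memory graph of typed edges and uses breadth-first enumeration of simple paths up to a fixed length between two entities. All node names, edge labels, and function names (`build_adjacency`, `find_paths`) are hypothetical placeholders and are not drawn from Chem2Bio2RDF.

```python
from collections import deque

# Hypothetical toy graph: placeholder entities and relations,
# NOT real Chem2Bio2RDF data.
EDGES = [
    ("compound:C1", "binds", "gene:G1"),
    ("compound:C1", "binds", "gene:G2"),
    ("gene:G1", "participates_in", "pathway:P1"),
    ("gene:G2", "associated_with", "disease:D1"),
    ("pathway:P1", "implicated_in", "disease:D1"),
]


def build_adjacency(edges):
    """Build an undirected adjacency map: node -> list of (label, neighbor)."""
    adj = {}
    for source, label, target in edges:
        adj.setdefault(source, []).append((label, target))
        adj.setdefault(target, []).append((label, source))
    return adj


def find_paths(adj, source, target, max_edges):
    """Enumerate simple paths from source to target with at most max_edges edges.

    Each path is a list alternating nodes and edge labels:
    [node, label, node, label, node, ...].
    Breadth-first order means shorter paths are returned first.
    """
    paths = []
    queue = deque([[source]])
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == target:
            paths.append(path)
            continue
        # Number of edges already used in this partial path.
        if (len(path) - 1) // 2 >= max_edges:
            continue
        visited = set(path[::2])  # nodes sit at even positions
        for label, neighbor in adj.get(node, []):
            if neighbor not in visited:
                queue.append(path + [label, neighbor])
    return paths


if __name__ == "__main__":
    adjacency = build_adjacency(EDGES)
    for p in find_paths(adjacency, "compound:C1", "disease:D1", max_edges=3):
        print(" -> ".join(p))
```

Each returned path is a candidate chain of relationships linking a compound to a disease through intermediate genes or pathways. The length bound keeps the enumeration tractable, since the number of simple paths grows quickly with graph size.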