Center for Clinical and Translational Science, University of Vermont, Burlington, Vermont 05405, USA.
J Am Med Inform Assoc. 2012 Mar-Apr;19(2):249-54. doi: 10.1136/amiajnl-2011-000480. Epub 2012 Jan 6.
The relationship between diseases and their causative genes can be complex, especially in the case of polygenic diseases. Their study is further complicated by the fact that many genes may be causally related to multiple diseases. This study explored the relationship between diseases by adapting an approach pioneered in the context of information retrieval: vector space models.
A vector space model approach was developed that bridges gene-disease knowledge inferred across three knowledge bases: Online Mendelian Inheritance in Man, GenBank, and Medline. The approach was then used to identify potentially related diseases for two target diseases: Alzheimer disease and Prader-Willi syndrome.
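As an illustration of the general idea only, the following minimal sketch represents each disease as a sparse vector of gene weights and ranks the other diseases by cosine similarity to a target disease. Cosine similarity is the standard comparison measure in information-retrieval vector space models; the abstract does not state which weighting scheme or similarity measure the study used, so both are assumptions here. The disease names, gene names, weights, and the helpers `cosine_similarity` and `rank_related` are hypothetical placeholders, not data or code from the study.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two sparse vectors stored as {gene: weight} dicts."""
    shared_genes = set(u) & set(v)
    dot = sum(u[g] * v[g] for g in shared_genes)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # an empty vector is treated as unrelated to everything
    return dot / (norm_u * norm_v)


def rank_related(target, vectors):
    """Return (disease, score) pairs for all other diseases, most similar first."""
    scores = [
        (name, cosine_similarity(vectors[target], vec))
        for name, vec in vectors.items()
        if name != target
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy data: the weights stand in for the strength of a
# gene-disease association (for example, a count of supporting records).
disease_vectors = {
    "disease_A": {"GENE1": 2.0, "GENE2": 1.0},
    "disease_B": {"GENE1": 1.0, "GENE2": 0.5},
    "disease_C": {"GENE3": 4.0},
    "disease_D": {"GENE2": 1.0, "GENE3": 1.0},
}

ranking = rank_related("disease_A", disease_vectors)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In this toy example, the disease whose gene profile is proportional to the target's ranks first, and a disease that shares no genes with the target scores zero. In practice, the vectors would be built from the knowledge bases named above rather than entered by hand.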
For both Alzheimer disease and Prader-Willi syndrome, a set of plausibly related diseases was identified that may warrant further exploration.
This study furthers seminal work by Swanson et al. that demonstrated the potential for mining literature for putative correlations. Using a vector space modeling approach, information from both biomedical literature and genomic resources (such as GenBank) can be combined to identify putative correlations of interest. To this end, the relevance of the diseases predicted by the vector space modeling approach in this study was validated against supporting literature.
The results of this study suggest that a vector space model approach may be a useful means to identify potential relationships between complex diseases, and thereby enable the coordination of gene-based findings across multiple complex diseases.