Shin Dmitriy, Arthur Gerald, Popescu Mihail, Korkin Dmitry, Shyu Chi-Ren
University of Missouri, School of Medicine, Department of Pathology and Anatomical Sciences, Columbia, MO 65212, United States; University of Missouri, Graduate School, MU Informatics Institute, Columbia, MO 65211, United States.
J Biomed Inform. 2014 Dec;52:394-405. doi: 10.1016/j.jbi.2014.08.003. Epub 2014 Aug 19.
We developed Resource Description Framework (RDF)-induced InfluGrams (RIIG), an informatics formalism for uncovering complex relationships among biomarker proteins and biological pathways using biomedical knowledge bases. We demonstrate an application of RIIG in morphoproteomics, a theranostic technique that comprehensively analyzes protein circuitries in order to design effective therapeutic strategies in a personalized medicine setting.
RIIG uses an RDF "mashup" knowledge base that integrates publicly available pathway and protein data with ontologies. To mine for RDF-induced Influence Links, RIIG introduces the notions of RDF relevancy and RDF collider, which mimic conditional independence and the "explaining away" mechanism in probabilistic systems. Using these notions together with constraint-based structure learning algorithms, the formalism generates morphoproteomic diagrams, which we call InfluGrams, for further analysis by experts.
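The probabilistic notions that RDF relevancy and RDF collider are said to mimic can be illustrated with a small sketch. It is not the RIIG implementation and involves no RDF data: it only enumerates a three-variable collider A -> C <- B to show conditional (in)dependence and "explaining away". The priors, the noisy-OR parameters, and all function names are assumptions chosen for illustration.

```python
from itertools import product

# Hypothetical priors for two independent causes A and B (illustration only).
P_A = 0.1
P_B = 0.1

def p_c_given(a, b):
    """P(C=1 | A=a, B=b) under an assumed noisy-OR with a small leak term."""
    leak, strength = 0.01, 0.9
    return 1.0 - (1.0 - leak) * (1.0 - strength) ** (a + b)

def joint(a, b, c):
    """Joint probability P(A=a, B=b, C=c) for the collider A -> C <- B."""
    pa = P_A if a else 1.0 - P_A
    pb = P_B if b else 1.0 - P_B
    pc1 = p_c_given(a, b)
    return pa * pb * (pc1 if c else 1.0 - pc1)

def prob_a(**evidence):
    """P(A=1 | evidence), by brute-force enumeration of the joint."""
    num = den = 0.0
    for a, b, c in product((0, 1), repeat=3):
        state = {"a": a, "b": b, "c": c}
        if any(state[k] != v for k, v in evidence.items()):
            continue  # skip states inconsistent with the evidence
        p = joint(a, b, c)
        den += p
        if a == 1:
            num += p
    return num / den

p_a = prob_a()                    # prior belief in A
p_a_given_b = prob_a(b=1)         # unchanged: A and B are marginally independent
p_a_given_c = prob_a(c=1)         # raised: observing the effect supports A
p_a_given_cb = prob_a(c=1, b=1)   # lowered again: B "explains away" C

print(p_a, p_a_given_b, p_a_given_c, p_a_given_cb)
```

In this toy model, observing B alone leaves the belief in A unchanged, while observing B after the common effect C has been observed lowers it. Constraint-based structure learning uses this asymmetry to orient links into a collider.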
RIIG recovered up to 90% of predefined influence links in a simulated environment using synthetic data and outperformed a naïve Monte Carlo sampling of random links. In clinical cases of Acute Lymphoblastic Leukemia (ALL) and Mesenchymal Chondrosarcoma, a significant level of concordance was observed between the RIIG-generated and expert-built morphoproteomic diagrams. In a clinical case of Squamous Cell Carcinoma, RIIG enabled the selection of alternative therapeutic targets, the validity of which was supported by a systematic literature review. We also illustrate the ability of RIIG to discover novel influence links in the general case of ALL.
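A comparison of this kind, link recovery measured against a random-link baseline, can be sketched as follows. This is a minimal illustration under assumptions, not the evaluation protocol of the study: the node names, the "true" and "predicted" link sets, and the function names are hypothetical placeholders, and links are treated as undirected.

```python
import random
from itertools import combinations

def recall(predicted, truth):
    """Fraction of the predefined (true) links found among the predicted links."""
    return len(predicted & truth) / len(truth)

def random_baseline(nodes, truth, n_links, trials=2000, seed=0):
    """Mean recall when n_links undirected links are drawn uniformly at random."""
    rng = random.Random(seed)
    candidates = [frozenset(pair) for pair in combinations(nodes, 2)]
    total = 0.0
    for _ in range(trials):
        sample = set(rng.sample(candidates, n_links))
        total += recall(sample, truth)
    return total / trials

# Hypothetical toy data: placeholder names, not proteins or links from the study.
nodes = ["P1", "P2", "P3", "P4", "P5", "P6"]
truth = {frozenset(p) for p in
         [("P1", "P2"), ("P2", "P3"), ("P3", "P4"), ("P4", "P5"), ("P5", "P6")]}
predicted = {frozenset(p) for p in
             [("P1", "P2"), ("P2", "P3"), ("P3", "P4"), ("P4", "P5"), ("P1", "P6")]}

method_recall = recall(predicted, truth)
baseline_recall = random_baseline(nodes, truth, n_links=len(predicted))

print(f"method recall:   {method_recall:.2f}")
print(f"baseline recall: {baseline_recall:.2f}")
```

Drawing the same number of links as the method predicts keeps the comparison fair: the baseline then reflects only what chance would recover from the same budget of links.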
Applications of the RIIG formalism demonstrated its potential to uncover complex, patient-specific relationships among biological entities and thereby identify effective drug targets in a personalized medicine setting. We conclude that RIIG provides an effective means not only to streamline morphoproteomic studies, but also, more generally, to bridge curated biomedical knowledge and causal reasoning with clinical data.