Vanderbilt-Ingram Cancer Center, Nashville, TN.
Rush University Medical Center, Chicago, IL.
JCO Clin Cancer Inform. 2021 Sep;5:995-1004. doi: 10.1200/CCI.21.00084.
The My Cancer Genome (MCG) knowledgebase and resulting website were launched in 2011 with the purpose of guiding clinicians in the application of genomic testing results for treatment of patients with cancer. Both knowledgebase and website were originally developed using a wiki-style approach that relied on manual evidence curation and synthesis of that evidence into cancer-related biomarker, disease, and pathway pages on the website that summarized the literature for a clinical audience. This approach required significant time investment for each page, which limited website scalability as the field advanced. To address this challenge, we designed and used an assertion-based data model that allows the knowledgebase and website to expand with the field of precision oncology.
Assertions, or computationally accessible cause-and-effect statements, are both manually curated from primary sources and imported from external databases, and are stored in a knowledge management system. To generate pages for the MCG website, reusable templates transform assertions into reconfigurable text and visualizations that form the building blocks for automatically updating disease, biomarker, drug, and clinical trial pages.
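To make the assertion-plus-template idea concrete, the following is a minimal Python sketch and not the MCG implementation. The `Assertion` fields, the sentence template, and the function names are all assumptions chosen for illustration, and the example values are placeholders rather than curated data.

```python
from dataclasses import dataclass
from string import Template


@dataclass(frozen=True)
class Assertion:
    """Hypothetical shape of one cause-and-effect statement."""
    biomarker: str
    disease: str
    drug: str
    effect: str   # e.g. "sensitivity" or "resistance"
    source: str   # e.g. "manual curation" or "external database"


# A reusable text template (assumed wording); the same template serves
# every assertion, so no page text is written by hand.
SENTENCE = Template(
    "In $disease, $biomarker is associated with $effect to $drug "
    "(source: $source)."
)


def render_assertion(a: Assertion) -> str:
    """Turn one assertion into one sentence of page text."""
    return SENTENCE.substitute(
        disease=a.disease,
        biomarker=a.biomarker,
        effect=a.effect,
        drug=a.drug,
        source=a.source,
    )


def render_drug_page(drug: str, assertions: list[Assertion]) -> str:
    """Assemble a plain-text drug page from all assertions naming that drug."""
    lines = [render_assertion(a) for a in assertions if a.drug == drug]
    return "\n".join([drug, *lines])


# Placeholder values only; these are not real curated assertions.
example = Assertion(
    biomarker="GENE1 V100X",
    disease="Disease A",
    drug="Drug X",
    effect="sensitivity",
    source="manual curation",
)
print(render_drug_page("Drug X", [example]))
```

Because the template is separate from the data, the same assertion could in principle be reused on a biomarker page or a disease page by filtering on a different field.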
Combining text and graph templates with assertions in our knowledgebase allows generation of web pages that update automatically whenever the knowledgebase changes. Automated page generation enables rapid scaling of the website as assertions with new biomarkers and drugs are added to the knowledgebase. This process has generated more than 9,100 clinical trial pages, 18,100 gene and alteration pages, 900 disease pages, and 2,700 drug pages to date.
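The scaling behavior described here can be illustrated with a short Python sketch, again under stated assumptions: the `Assertion` tuple, the `build_drug_pages` function, and the placeholder values are hypothetical and do not reflect the actual MCG pipeline. The point is only that pages are derived from the assertion store, so adding an assertion that names a new drug yields a new page on the next build without any manual page authoring.

```python
from collections import defaultdict
from typing import NamedTuple


class Assertion(NamedTuple):
    """Hypothetical, reduced assertion: which biomarker affects which drug."""
    biomarker: str
    drug: str
    effect: str


def build_drug_pages(knowledgebase: list[Assertion]) -> dict[str, str]:
    """Regenerate every drug page from the current set of assertions."""
    grouped: dict[str, list[str]] = defaultdict(list)
    for a in knowledgebase:
        grouped[a.drug].append(f"{a.biomarker}: {a.effect}")
    # One page per drug that appears in at least one assertion.
    return {drug: "\n".join([drug, *lines]) for drug, lines in grouped.items()}


# Placeholder values only; these are not real curated assertions.
kb = [Assertion("GENE1 V100X", "Drug X", "sensitivity")]
pages_before = build_drug_pages(kb)

# Adding an assertion with a new drug is the only step needed
# for a new page to appear on the next build.
kb.append(Assertion("GENE2 amplification", "Drug Y", "resistance"))
pages_after = build_drug_pages(kb)

print(sorted(pages_before))
print(sorted(pages_after))
```

A full rebuild is the simplest design to reason about; an incremental build that regenerates only the pages touched by changed assertions would be a natural refinement at larger scale.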
Combining computational and manual curation processes with reusable templates enables automation and scalability for both the MCG knowledgebase and the MCG website.