Building Cancer Diagnosis Text to OncoTree Mapping Pipelines for Clinical Sequencing Data Integration and Curation.

Author information

Narayanan Adhithya, Topaloglu Umit, Laurini Javier A, Diaz-Garelli Franck

Affiliations

University of North Carolina at Chapel Hill, Chapel Hill, NC.

Wake Forest Baptist Medical Center, Winston Salem, NC.

Publication information

AMIA Jt Summits Transl Sci Proc. 2020 May 30;2020:440-448. eCollection 2020.

Abstract

Precision oncology research seeks to derive knowledge from existing data. Current work seeks to integrate clinical and genomic data across cancer centers to enable impactful secondary use. However, the reliability of integrated data depends on the data curation method used and how systematically it is applied. In practice, data integration and mapping are often done manually, even though crucial data such as oncological diagnoses (DX) vary in accuracy and specificity. We hypothesized that mapping text-form cancer DX to a standardized terminology (OncoTree) could be automated using existing methods, e.g., natural language processing (NLP) modules and application programming interfaces (APIs). Our best-performing pipeline prototype was effective but constrained by limitations in API development: it accurately mapped 96.2% of the textual DX dataset to the NCI Thesaurus (NCIt), and 44.2% through NCIt to OncoTree. These results suggest that the pipeline model could be a viable way to automate data curation. Such techniques may become more reliable with further development.
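The two-stage mapping described in the abstract (diagnosis text to NCIt, then NCIt to OncoTree) can be illustrated with a minimal Python sketch. This is not the authors' pipeline. The lookup tables, the placeholder codes (`NCIT_0001`, `ONCO_A`, and so on), and the function names `normalize`, `map_dx`, and `coverage` are all hypothetical. A real pipeline would replace the dictionaries with NLP modules or terminology API calls. The rates computed here are simple mapping coverage. The accuracy figures reported in the abstract would additionally require comparison against a curated gold standard.

```python
from typing import List, Optional, Tuple

# Hypothetical stage-1 table: normalized diagnosis text -> placeholder NCIt code.
# A real pipeline would call an NLP module or terminology API here instead.
TEXT_TO_NCIT = {
    "lung adenocarcinoma": "NCIT_0001",
    "invasive breast carcinoma": "NCIT_0002",
    "melanoma": "NCIT_0003",
}

# Hypothetical stage-2 table: placeholder NCIt code -> placeholder OncoTree code.
# It is deliberately incomplete to mimic a lower second-stage mapping rate.
NCIT_TO_ONCOTREE = {
    "NCIT_0001": "ONCO_A",
    "NCIT_0003": "ONCO_C",
}


def normalize(dx_text: str) -> str:
    """Lowercase the text and collapse runs of whitespace."""
    return " ".join(dx_text.lower().split())


def map_dx(dx_text: str) -> Tuple[Optional[str], Optional[str]]:
    """Map one diagnosis string to (NCIt code, OncoTree code); None means unmapped."""
    ncit = TEXT_TO_NCIT.get(normalize(dx_text))
    onco = NCIT_TO_ONCOTREE.get(ncit) if ncit is not None else None
    return ncit, onco


def coverage(dx_texts: List[str]) -> Tuple[float, float]:
    """Return the fraction of inputs mapped to NCIt and, through NCIt, to OncoTree."""
    results = [map_dx(text) for text in dx_texts]
    total = len(results)
    if total == 0:
        return 0.0, 0.0
    ncit_rate = sum(ncit is not None for ncit, _ in results) / total
    onco_rate = sum(onco is not None for _, onco in results) / total
    return ncit_rate, onco_rate


if __name__ == "__main__":
    sample = [
        "Lung  Adenocarcinoma",
        "invasive breast carcinoma",
        "Melanoma",
        "unknown primary",
    ]
    print(coverage(sample))
```

Keeping the two stages separate makes it possible to see where mappings are lost. In this toy data, the second-stage table rather than the text matching is the bottleneck.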
