Kahn Charles E
Department of Radiology and Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, Pennsylvania, USA.
Stud Health Technol Inform. 2017;245:896-900.
This study sought to use ontology-based knowledge to identify patients with rare diseases and to estimate the frequency of those diseases in a large database of radiology reports. Natural language processing methods were applied to 12,377,743 narrative-text radiology reports of 7,803,811 patients at an academic health system. Using knowledge from the Orphanet Rare Disease Ontology and Radiology Gamuts Ontology, 1,154 of 6,794 rare diseases (17.0%) were observed in a total of 237,840 patients (3.05%). Ninety of 2,129 diseases (4%) with known prevalence less than 1 per 1,000,000 were observed in the database, whereas 100 of 173 diseases (58%) with prevalence greater than 1 per 10,000 were observed; the difference was statistically significant (p < .00001). Automated ontology-based search of radiology reports can estimate the frequency of rare diseases, and those diseases with higher known prevalence were significantly more likely to appear in radiology reports.
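The proportions and the significance claim above can be rechecked from the reported counts alone. The sketch below is illustrative only: the abstract does not say which statistical test was used, so a Pearson chi-square test on a 2x2 table (1 degree of freedom, no continuity correction) is assumed here, and the function name `two_proportion_chi_square` is hypothetical.

```python
import math


def two_proportion_chi_square(hits_a, total_a, hits_b, total_b):
    """Pearson chi-square test (1 d.f., no continuity correction) on a 2x2 table.

    Assumption: the abstract does not name its test; this is one common choice.
    Returns (chi-square statistic, two-sided p-value).
    """
    observed = [
        [hits_a, total_a - hits_a],  # group A: observed / not observed
        [hits_b, total_b - hits_b],  # group B: observed / not observed
    ]
    row_totals = [total_a, total_b]
    col_totals = [hits_a + hits_b, (total_a - hits_a) + (total_b - hits_b)]
    grand_total = total_a + total_b

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed[i][j] - expected) ** 2 / expected

    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Counts taken directly from the abstract.
pct_diseases_observed = 100 * 1154 / 6794       # share of rare diseases observed
pct_patients = 100 * 237840 / 7803811           # share of patients with a rare disease
pct_very_rare = 100 * 90 / 2129                 # prevalence < 1 per 1,000,000
pct_less_rare = 100 * 100 / 173                 # prevalence > 1 per 10,000

chi2, p_value = two_proportion_chi_square(90, 2129, 100, 173)

print(f"rare diseases observed: {pct_diseases_observed:.1f}%")
print(f"patients affected:      {pct_patients:.2f}%")
print(f"very rare group:        {pct_very_rare:.0f}%")
print(f"less rare group:        {pct_less_rare:.0f}%")
print(f"chi-square = {chi2:.1f}, p < .00001: {p_value < 1e-5}")
```

Under this assumed test, the gap between the two prevalence groups is far beyond the reported threshold, which is consistent with the stated p < .00001; a different test (for example, Fisher's exact test) would be expected to lead to the same conclusion at these counts.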