Ujiie Shogo, Yada Shuntaro, Wakamiya Shoko, Aramaki Eiji
Nara Institute of Science and Technology, Nara, Japan.
JMIR Med Inform. 2020 Nov 27;8(11):e22661. doi: 10.2196/22661.
Medical articles covering adverse drug events (ADEs) are systematically reported by pharmaceutical companies for drug safety information purposes. Although policies governing reporting to regulatory bodies vary among countries and regions, all medical article reporting may be categorized as either precision-based or recall-based. Recall-based reporting, which is implemented in Japan, requires the reporting of any possible ADE. Therefore, recall-based reporting can introduce numerous false positives, that is, substantial amounts of noise, a problem that is difficult to address using limited manual labor.
Our aim was to develop an automated system that could identify ADE-related medical articles, support recall-based reporting, and alleviate manual labor in Japanese pharmaceutical companies.
Using medical articles as input, our system based on natural language processing applies document-level classification to extract articles containing ADEs (replacing manual labor in the first screening) and sentence-level classification to extract sentences within those articles that imply ADEs (thus supporting experts in the second screening). We used 509 Japanese medical articles annotated by a medical engineer to evaluate the performance of the proposed system.
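The two-stage flow described above can be sketched as follows. This is a minimal illustration only, not the authors' implementation: the cue list `ADE_CUES`, the scorer `cue_score`, the naive `split_sentences` helper, and the 0.5 threshold are all hypothetical stand-ins for the trained document-level and sentence-level classifiers, whose details are not given here.

```python
import re
from typing import Callable, List, Tuple

# Hypothetical cue words, for illustration only. The system described in
# the text learns its classifiers from annotated articles.
ADE_CUES = ("副作用", "有害事象", "adverse", "side effect")


def cue_score(text: str) -> float:
    """Toy stand-in classifier: 1.0 if any cue word occurs, else 0.0."""
    lowered = text.lower()
    return 1.0 if any(cue in lowered for cue in ADE_CUES) else 0.0


def split_sentences(text: str) -> List[str]:
    """Naive splitter on Japanese and Latin sentence-final punctuation."""
    parts = re.findall(r"[^。.!?]+[。.!?]?", text)
    return [p.strip() for p in parts if p.strip()]


def screen(
    articles: List[str],
    doc_clf: Callable[[str], float],
    sent_clf: Callable[[str], float],
    threshold: float = 0.5,
) -> List[Tuple[str, List[str]]]:
    """First screening keeps articles scored as ADE-related; second
    screening flags the sentences inside each kept article."""
    results = []
    for article in articles:
        if doc_clf(article) < threshold:
            continue  # dropped at the document level
        flagged = [s for s in split_sentences(article)
                   if sent_clf(s) >= threshold]
        results.append((article, flagged))
    return results
```

Passing the classifiers in as callables keeps the screening logic independent of the model behind each stage, so a learned scorer could replace `cue_score` without changing `screen`.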
Document-level classification yielded an F1 of 0.903. Sentence-level classification yielded an F1 of 0.413. Both values are averages over fivefold cross-validation.
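For reference, F1 is the harmonic mean of precision and recall, and a cross-validated score is the mean of the per-fold F1 values. The sketch below shows that arithmetic under simple assumptions: binary gold and predicted labels per fold, and hypothetical helper names (`f1_score`, `fold_counts`, `cross_validated_f1`). It does not reproduce the reported figures.

```python
from statistics import mean
from typing import List, Sequence, Tuple


def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 = 2PR / (P + R) from raw counts; 0.0 when there are no true positives."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def fold_counts(gold: Sequence[int], pred: Sequence[int]) -> Tuple[int, int, int]:
    """Count true positives, false positives, and false negatives for one fold."""
    tp = sum(1 for g, p in zip(gold, pred) if g and p)
    fp = sum(1 for g, p in zip(gold, pred) if not g and p)
    fn = sum(1 for g, p in zip(gold, pred) if g and not p)
    return tp, fp, fn


def cross_validated_f1(folds: List[Tuple[Sequence[int], Sequence[int]]]) -> float:
    """Average the per-fold F1 over all (gold, pred) folds."""
    return mean(f1_score(*fold_counts(gold, pred)) for gold, pred in folds)
```

With five folds, `cross_validated_f1` would receive five `(gold, pred)` pairs, one per held-out split.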
A simple automated system may alleviate the manual labor involved in screening drug safety-related medical articles in pharmaceutical companies. After improving the accuracy of the sentence-level classification by considering a wider context, we intend to apply this system toward real-world postmarketing surveillance.