Guo Wenjing, Dong Fan, Liu Jie, Aslam Aasma, Patterson Tucker A, Hong Huixiao
National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, United States.
Exp Biol Med (Maywood). 2025 May 2;250:10374. doi: 10.3389/ebm.2025.10374. eCollection 2025.
Adverse drug events are harms associated with drug use, whether the drug is used correctly or incorrectly. Identifying them is vital in pharmacovigilance to safeguard public health. Drug safety surveillance can be performed on unstructured data, and a comprehensive, accurate list of drug names is essential for identifying adverse drug events effectively. Among the many sources of drug names, RxNorm is widely recognized as a leading resource, but its effectiveness for unstructured data analysis in drug safety surveillance has not been thoroughly assessed. To address this, we evaluated the drug names in RxNorm for their suitability in unstructured data analysis and developed a refined set of drug names. First, we removed duplicates, names exceeding 199 characters, and names that only describe administrative details. Drug names with four or fewer characters were then checked against 18,000 drug-related PubMed abstracts to remove names that rarely appear in unstructured data. The remaining names, ranging from five to 199 characters, were further refined to exclude those that could lead to inaccurate drug counts in unstructured data analysis. We compared the efficiency and accuracy of the refined set with those of the original RxNorm set by testing both on the same 18,000 drug-related PubMed abstracts. The refined set reduced both the computational cost and the number of false drug names identified. Further analysis of the removed names revealed that most originated from only one of the 14 sources. Our findings suggest that the refined set can enhance drug identification in unstructured data analysis, thereby improving pharmacovigilance.
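The first filtering steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function name `refine_names`, the `min_hits` threshold, case-insensitive de-duplication, and the simple alphanumeric tokenizer are all assumptions made for illustration. Only the 199-character cutoff and the four-character boundary for short names come from the abstract. The removal of names that only describe administrative details and the later refinement of names of five to 199 characters are not modeled here.

```python
import re
from collections import Counter

MAX_LEN = 199    # names longer than this are dropped (cutoff stated in the abstract)
SHORT_LEN = 4    # names this short are checked against a corpus (stated in the abstract)


def refine_names(names, abstracts, min_hits=1):
    """Return a refined, lowercased list of drug names.

    min_hits is a hypothetical threshold: the abstract says rarely
    occurring short names were removed but does not give a cutoff.
    """
    # Step 1: drop duplicates (case-insensitive here, an assumption)
    # and names exceeding MAX_LEN characters.
    seen = set()
    candidates = []
    for name in names:
        key = name.strip().lower()
        if not key or key in seen or len(key) > MAX_LEN:
            continue
        seen.add(key)
        candidates.append(key)

    # Step 2: count in how many abstracts each short name occurs as a
    # whole alphanumeric token. Short names containing punctuation never
    # match under this simplified tokenizer.
    short = {n for n in candidates if len(n) <= SHORT_LEN}
    doc_freq = Counter()
    for text in abstracts:
        tokens = set(re.findall(r"[a-z0-9]+", text.lower()))
        doc_freq.update(short & tokens)

    # Step 3: keep longer names unchanged; keep a short name only if it
    # appears in at least min_hits abstracts.
    return [n for n in candidates
            if len(n) > SHORT_LEN or doc_freq[n] >= min_hits]
```

For example, `refine_names(["Aspirin", "aspirin", "ASA", "XQZ"], ["Patients received ASA daily."])` returns `['aspirin', 'asa']`: the duplicate is collapsed, and the short name that never appears in the sample text is dropped.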