A stratification method based on clustering for the minimization of data masking effect in signal detection.

Author information

School of Internet of Things, Nanjing University of Posts and Telecommunications, Nanjing, 210003, China.

Jiangsu Center for ADR Monitoring, Nanjing, 210002, China.

Publication information

BMC Med Inform Decis Mak. 2020 Feb 3;20(1):18. doi: 10.1186/s12911-020-1037-z.

Abstract

BACKGROUND

Data masking is an inherent defect of disproportionality measures used in adverse drug reaction (ADR) signal detection. Previous approaches to it fall roughly into three categories: data removal, regression, and stratification. However, none of these methods considers differences in the reporting frequency of adverse drug events (ADEs), which may be an important cause of masking. This study aims to explore a novel stratification method that minimizes the impact of frequency differences on the masking of real signals.

METHODS

Reports in the Chinese Spontaneous Reporting Database (CSRD) from 2010 to 2011 were selected. The overall dataset was stratified into clusters by the frequency of drugs, ADRs, and drug-event combinations (DECs), in that order. K-means clustering was used to perform the stratification according to the distribution characteristics of the data. The Information Component (IC) was then used for signal detection within each cluster separately. A reference database, built by extracting ADRs from drug product labeling, was used to evaluate performance in terms of Recall, Precision, and F-measure. In addition, some DECs from the Adverse Drug Reactions Information Bulletin (ADRIB) issued by the CFDA were collected for further reliability evaluation.
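The stratify-then-detect idea can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' implementation: the drug names and all counts are hypothetical, the clustering is a plain one-dimensional k-means on log10 report counts with evenly spaced starting centers, and the IC is the simple observed-to-expected form log2(O/E), since the abstract does not say which IC estimator or signal threshold was used.

```python
import math

def kmeans_1d(values, k, iterations=100):
    """Lloyd's k-means on scalars, started from evenly spaced centers."""
    lo, hi = min(values), max(values)
    centers = [lo + (i + 0.5) * (hi - lo) / k for i in range(k)]
    labels = []
    for _ in range(iterations):
        # Assign each value to its nearest center.
        labels = [min(range(k), key=lambda c: abs(v - centers[c])) for v in values]
        # Move each center to the mean of its members (empty clusters stay put).
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:
                centers[c] = sum(members) / len(members)
    return labels

def information_component(n_dec, n_drug, n_adr, n_total):
    """Plain IC = log2(observed / expected), with expected = n_drug * n_adr / n_total."""
    expected = n_drug * n_adr / n_total
    return math.log2(n_dec / expected)

# Hypothetical report counts per drug, spanning several orders of magnitude.
drug_counts = {"drug_a": 12000, "drug_b": 9500, "drug_c": 40,
               "drug_d": 55, "drug_e": 700, "drug_f": 650}
log_counts = [math.log10(n) for n in drug_counts.values()]
strata = dict(zip(drug_counts, kmeans_1d(log_counts, k=3)))

# Hypothetical counts for one DEC, first in the pooled data, then in its own stratum.
ic_pooled = information_component(n_dec=10, n_drug=40, n_adr=20000, n_total=100000)
ic_stratum = information_component(n_dec=10, n_drug=40, n_adr=100, n_total=1000)

print(strata)
print(round(ic_pooled, 2), round(ic_stratum, 2))
```

In this toy data the pooled background for the ADR is dominated by high-volume drugs, so the expected count is inflated and the IC is pulled toward zero. Computing the IC against a background of similar-frequency reports removes that inflation, which is the masking effect the stratification targets.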

RESULTS

With stratification, the study dataset was divided into 21 clusters, within each of which the frequencies of drugs, ADRs, or DECs were of a similar order of magnitude. Recall rose from 29.93% to 40.39% (a relative increase of 34.95%), Precision fell from 54.56% to 48.82% (a relative decrease of 10.52%), and F-measure rose from 38.65% to 44.21% (a relative increase of 14.39%). According to ADRIB issues published after 2011, 5 DECs related to Potassium Magnesium Aspartate, 61 DECs related to Levofloxacin Hydrochloride, and 26 DECs related to Cefazolin were highlighted.
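The reported percentages are relative changes, and the F-measure values are consistent with the harmonic mean of the reported Precision and Recall. The short check below assumes the balanced F-measure (F1), which the abstract does not state explicitly; the function names are illustrative.

```python
def f_measure(precision, recall):
    """Balanced F-measure (F1): harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

def relative_change(before, after):
    """Change from before to after, as a percentage of before."""
    return (after - before) / before * 100

# Values reported in the abstract, in percent.
recall_before, recall_after = 29.93, 40.39
precision_before, precision_after = 54.56, 48.82

f_before = f_measure(precision_before, recall_before)
f_after = f_measure(precision_after, recall_after)

print(round(relative_change(recall_before, recall_after), 2))        # recall change
print(round(relative_change(precision_before, precision_after), 2))  # precision change
print(round(f_before, 2), round(f_after, 2))                         # recomputed F-measure
print(round(relative_change(38.65, 44.21), 2))                       # F-measure change
```

Recomputing from the rounded inputs reproduces the reported figures to within rounding, so the three percentages should be read as relative changes, not percentage-point differences.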

CONCLUSIONS

The proposed method is effective and reliable for minimizing the data masking effect in signal detection. Given the decrease in Precision, it is suggested as a supplement to, not a replacement for, the non-stratified method.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c69f/6998200/b5e40391c45d/12911_2020_1037_Fig1_HTML.jpg
