Multiple sampling schemes and deep learning improve active learning performance in drug-drug interaction information retrieval analysis from the literature.

Affiliation

Department of Biomedical Informatics, Ohio State University, Columbus, OH, 43210, USA.

Publication information

J Biomed Semantics. 2023 May 30;14(1):5. doi: 10.1186/s13326-023-00287-7.

DOI:10.1186/s13326-023-00287-7

PMID:37248476

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10228061/

Abstract

BACKGROUND

Drug-drug interaction (DDI) information retrieval (IR) from the PubMed literature is an important natural language processing (NLP) task. This study is the first to examine active learning (AL) in DDI IR analysis. DDI IR analysis of PubMed abstracts is challenging because positive DDI samples are relatively few among an overwhelmingly large number of negative samples. Random negative sampling and positive sampling are designed specifically to improve the efficiency of AL analysis in this setting. The paper also shows the consistency of random negative sampling and positive sampling.
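The sampling ideas named above can be illustrated with a minimal sketch. It rests on assumptions not stated in the abstract: uncertainty sampling is taken to mean picking the unlabeled abstracts whose predicted probability is closest to 0.5, and random negative sampling is taken to mean drawing abstracts at random from the large pool and provisionally treating them as negatives because positives are rare. The function names, the `probs` mapping, and the `seed` parameter are hypothetical and are not taken from the paper.

```python
import random


def uncertainty_sample(probs, batch_size):
    """Return the ids whose predicted probability is closest to 0.5.

    `probs` maps a document id to the model's predicted probability
    that the abstract is DDI-relevant (hypothetical input format).
    """
    ranked = sorted(probs, key=lambda doc_id: abs(probs[doc_id] - 0.5))
    return ranked[:batch_size]


def random_negative_sample(pool_ids, n, seed=0):
    """Draw n ids at random and provisionally label them negative (0).

    Assumption: positives are so rare in the pool that a random draw
    is almost always a true negative, so no manual labeling is needed.
    """
    rng = random.Random(seed)
    chosen = rng.sample(sorted(pool_ids), n)
    return {doc_id: 0 for doc_id in chosen}
```

In an AL round under these assumptions, the ids returned by `uncertainty_sample` would go to a human annotator, while the output of `random_negative_sample` would be added to the training set directly.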

RESULTS

PubMed abstracts are divided into two pools. The screened pool contains all abstracts that match the DDI keyword query in PubMed, while the unscreened pool contains all other abstracts. The precision of the DDI IR analysis is evaluated and compared at a prespecified recall of 0.95. In the screened pool IR analysis using a support vector machine (SVM), similarity sampling combined with uncertainty sampling improves precision over uncertainty sampling alone, from 0.89 to 0.92. In the unscreened pool IR analysis, integrating random negative sampling, positive sampling, and similarity sampling improves precision over uncertainty sampling alone, from 0.72 to 0.81. When the SVM is replaced with a deep learning method, all sampling schemes consistently improve DDI AL analysis in both the screened and unscreened pools. Deep learning significantly improves precision over the SVM: 0.96 vs. 0.92 in the screened pool and 0.90 vs. 0.81 in the unscreened pool.
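The evaluation metric used above, precision at a fixed recall, can be sketched as follows. This is an illustrative implementation under simple assumptions (binary 0/1 labels, documents ranked by a classifier score, ties broken by sort order); it is not the authors' evaluation code, and the function name and arguments are hypothetical.

```python
def precision_at_recall(scores, labels, target_recall=0.95):
    """Precision at the shortest ranked list whose recall reaches target_recall.

    `scores` are classifier scores (higher means more likely positive);
    `labels` are the matching ground-truth labels, 1 for DDI-relevant, else 0.
    """
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("need at least one positive label")
    # Rank documents from most to least likely to be DDI-relevant.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    true_pos = 0
    for k, (_, label) in enumerate(ranked, start=1):
        true_pos += label
        if true_pos / total_pos >= target_recall:
            # The top-k documents recover enough positives; report precision.
            return true_pos / k
    return true_pos / len(ranked)


scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
labels = [1, 1, 0, 1, 0, 0]
print(precision_at_recall(scores, labels, 0.95))  # 0.75
```

In this toy ranking, reaching a recall of 0.95 requires retrieving all three positives, which happens at the fourth document, so three of the four retrieved documents are relevant.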

CONCLUSIONS

Integrating various sampling schemes and deep learning algorithms into AL significantly improves DDI IR analysis from the literature. Random negative sampling and positive sampling are highly effective for improving AL analysis when positive and negative samples are extremely imbalanced.
