Few-Shot and Zero-Shot Multi-Label Learning for Structured Label Spaces.

Authors

Rios Anthony, Kavuluru Ramakanth

Affiliations

Department of Computer Science, University of Kentucky, Lexington, KY.

Division of Biomedical Informatics, University of Kentucky, Lexington, KY.

Publication information

Proc Conf Empir Methods Nat Lang Process. 2018 Oct-Nov;2018:3132-3142.

PMID:30775726
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6375489/
Abstract

Large multi-label datasets contain labels that occur thousands of times (frequent group), those that occur only a few times (few-shot group), and labels that never appear in the training dataset (zero-shot group). Multi-label few- and zero-shot label prediction is mostly unexplored on datasets with large label spaces, especially for text classification. In this paper, we perform a fine-grained evaluation to understand how state-of-the-art methods perform on infrequent labels. Furthermore, we develop few- and zero-shot methods for multi-label text classification when there is a known structure over the label space, and evaluate them on two publicly available medical text datasets: MIMIC II and MIMIC III. For few-shot labels we achieve improvements of 6.2% and 4.8% in R@10 for MIMIC II and MIMIC III, respectively, over prior efforts; the corresponding R@10 improvements for zero-shot labels are 17.3% and 19%.
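
The label grouping and the R@10 metric mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code. The function names, the `few_shot_max` cutoff, and the choice to rank only the labels inside the group being evaluated are all assumptions made for illustration; the abstract itself only says that few-shot labels occur "a few times" and zero-shot labels never occur in training.

```python
from collections import Counter


def group_labels(train_label_sets, all_labels, few_shot_max=5):
    """Split labels into frequent / few-shot / zero-shot groups by training count.

    few_shot_max is a hypothetical cutoff, not a value taken from the abstract.
    """
    counts = Counter(label for labels in train_label_sets for label in labels)
    groups = {"frequent": set(), "few_shot": set(), "zero_shot": set()}
    for label in all_labels:
        n = counts[label]
        if n == 0:
            groups["zero_shot"].add(label)   # never seen in training
        elif n <= few_shot_max:
            groups["few_shot"].add(label)    # seen only a few times
        else:
            groups["frequent"].add(label)
    return groups


def recall_at_k(scores, true_labels, candidate_labels, k=10):
    """Mean recall@k over instances, ranking only labels in candidate_labels.

    scores: list of dicts mapping label -> model score, one dict per instance.
    true_labels: list of sets of gold labels, one set per instance.
    Instances with no gold label inside candidate_labels are skipped
    (an assumption of this sketch).
    """
    recalls = []
    for inst_scores, gold in zip(scores, true_labels):
        gold_in_group = gold & candidate_labels
        if not gold_in_group:
            continue
        # Sort alphabetically first so ties are broken deterministically.
        ranked = sorted(
            sorted(candidate_labels),
            key=lambda label: inst_scores.get(label, float("-inf")),
            reverse=True,
        )
        top_k = set(ranked[:k])
        recalls.append(len(gold_in_group & top_k) / len(gold_in_group))
    return sum(recalls) / len(recalls) if recalls else 0.0
```

Calling `group_labels` on the training label sets and then passing `groups["few_shot"]` or `groups["zero_shot"]` as `candidate_labels` gives a per-group R@k, which is the kind of fine-grained breakdown the abstract describes.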

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cd3b/6375489/2608496e5f88/nihms-1008178-f0003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cd3b/6375489/c8032a43d1e1/nihms-1008178-f0001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cd3b/6375489/5d72a1251953/nihms-1008178-f0002.jpg