Large language model driven transferable key information extraction mechanism for nonstandardized tables.

Authors

Hu Rong, Yang Ye, Liu Sen, Li Zuchen, Liu Jingyi, Ding Xingchen, Sun Hanchi, Ren Lingli

Affiliations

Customs and Public Management College, Shanghai Customs University, Shanghai, 201204, China.

School of Electronic Information, Shanghai DianJi University, Shanghai, 201306, China.

Publication information

Sci Rep. 2025 Aug 14;15(1):29802. doi: 10.1038/s41598-025-15627-z.

DOI: 10.1038/s41598-025-15627-z
PMID: 40813619
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12354842/

Abstract

Extracting key information from unstructured tables poses significant challenges due to layout variability, dependence on large annotated datasets, and inability of existing methods to directly output structured formats like JSON. These limitations hinder scalability and generalization to unseen document formats. We propose the Large Language Model Driven Transferable Key Information Extraction Mechanism (LLM-TKIE), which employs text detection to identify relevant regions in document images, followed by text recognition to extract content. An LLM then performs semantic reasoning, including completeness verification and key information extraction, before organizing data into structured formats. Without fine-tuning, LLM-TKIE achieves an F1-score of 80.9 and tree edit distance-based accuracy of 88.85 on CORD, and an F1-score of 83.9 with 93.3 accuracy on SROIE, demonstrating robust generalization and structural precision. Notably, our method significantly outperforms state-of-the-art multimodal large models on unlabeled customs domain datasets by 5-8% in accuracy. Additionally, our evaluation of multiple large language models of various sizes across 15 quantization strategies provides valuable insights for selecting and optimizing LLMs for key information extraction tasks, offering practical guidance for system development.
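
The last two stages described in the abstract (LLM-based key information extraction with a completeness check, then structured JSON output) can be illustrated with a minimal sketch. This is not the authors' implementation: the field schema `REQUIRED_KEYS`, the prompt wording, the retry policy, and the `fake_llm` stand-in are all assumptions made for illustration, and the text detection and recognition stages are represented only by a hard-coded list of OCR lines.

```python
import json
from typing import Callable, Dict, List

# Hypothetical receipt-style schema; the paper's actual field set is not given here.
REQUIRED_KEYS = ["company", "date", "total"]


def build_prompt(ocr_lines: List[str], keys: List[str]) -> str:
    """Ask the model for a JSON object containing the requested fields."""
    joined = "\n".join(ocr_lines)
    return (
        "Extract the following fields from the OCR text and reply with JSON only.\n"
        f"Fields: {', '.join(keys)}\n"
        f"OCR text:\n{joined}"
    )


def missing_keys(record: Dict[str, str], keys: List[str]) -> List[str]:
    """Completeness check: list requested fields that are absent or blank."""
    return [k for k in keys if not str(record.get(k, "")).strip()]


def extract(
    ocr_lines: List[str],
    llm: Callable[[str], str],
    keys: List[str] = REQUIRED_KEYS,
    max_retries: int = 1,
) -> Dict[str, str]:
    """Query the model, verify completeness, and retry once with a hint if needed."""
    prompt = build_prompt(ocr_lines, keys)
    record: Dict[str, str] = {}
    for _ in range(max_retries + 1):
        try:
            parsed = json.loads(llm(prompt))
        except json.JSONDecodeError:
            parsed = {}  # treat unparsable output as an empty answer
        if isinstance(parsed, dict):
            record = parsed
        gaps = missing_keys(record, keys)
        if not gaps:
            break
        # Tell the model which fields the previous answer left out.
        prompt = (
            build_prompt(ocr_lines, keys)
            + f"\nThe previous answer missed: {', '.join(gaps)}"
        )
    # Emit every requested key so the output structure is always the same.
    return {k: record.get(k, "") for k in keys}


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model call; returns a fixed answer for the demo."""
    return json.dumps(
        {"company": "ACME MART", "date": "2019-03-14", "total": "12.50"}
    )


# Placeholder for the output of the text detection and recognition stages.
ocr_lines = ["ACME MART", "14/03/2019", "TOTAL 12.50"]
result = extract(ocr_lines, fake_llm)
print(json.dumps(result, indent=2))
```

Passing the model as a plain callable keeps the sketch independent of any particular LLM client, so a real backend could be swapped in for `fake_llm` without touching the verification logic.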

