Analyzing the Information Content of Text-Based Files in Supplementary Materials of Biomedical Literature.

Affiliations

University of Applied Sciences and Arts of Western Switzerland (HES-SO).

Swiss Institute of Bioinformatics, Switzerland.

Publication information

Stud Health Technol Inform. 2022 May 25;294:876-877. doi: 10.3233/SHTI220614.

DOI: 10.3233/SHTI220614
PMID: 35612233
Abstract

We present an analysis of the supplementary materials of PubMed Central (PMC) articles and show their importance for indexing and searching biomedical literature, in particular for the emerging field of genomic medicine. On a subset of articles from PubMed Central, we use text mining methods to extract MeSH terms from abstracts, full texts, and text-based supplementary materials. We find that the recall of MeSH annotations increases by about 5.9 percentage points (a relative increase of about 20%) when supplementary materials are considered, compared to using abstracts only. We further compare the supplementary material annotations with full-text annotations and find that the recall of MeSH terms increases by 1.5 percentage points (a relative increase of about 3%). Additionally, we analyze genetic variant mentions in abstracts and full texts and compare them with mentions found in text-based supplementary files. We find that the large majority (about 99%) of variants are found in text-based supplementary files. In conclusion, we suggest that supplementary data should receive more attention from the information retrieval community, in particular in the life and health sciences.
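
The abstract reports recall gains both in percentage points and as relative increases, which are easy to confuse. The sketch below is a minimal illustration of the difference and is not the authors' pipeline: the helper names `recall` and `recall_gain` are hypothetical, and the MeSH term sets are made-up toy data. If each pair of reported figures refers to the same baseline, 5.9 points at +20% would imply a baseline recall of roughly 30%, and 1.5 points at +3% a baseline of roughly 50%. The abstract does not state these baselines, so treat that as an inference.

```python
def recall(predicted, gold):
    """Fraction of gold-standard terms that appear among the predicted terms."""
    gold = set(gold)
    if not gold:
        return 0.0
    return len(gold & set(predicted)) / len(gold)


def recall_gain(baseline, extended):
    """Return (gain in percentage points, relative gain in percent)."""
    points = (extended - baseline) * 100
    relative = (extended - baseline) / baseline * 100 if baseline else float("inf")
    return points, relative


# Toy data (assumed for illustration only, not taken from the paper).
gold = {"Humans", "Mutation", "Genomics", "Neoplasms",
        "Polymorphism, Single Nucleotide"}
from_abstract = {"Humans", "Neoplasms"}
from_supplementary = {"Humans", "Mutation"}

# Recall using abstracts only, then abstracts plus supplementary files.
r_abstract = recall(from_abstract, gold)
r_combined = recall(from_abstract | from_supplementary, gold)
points, relative = recall_gain(r_abstract, r_combined)

print(f"abstract only: {r_abstract:.2f}, with supplementary: {r_combined:.2f}")
print(f"gain: {points:.1f} percentage points, {relative:.1f}% relative")
```

In this toy case, one additional gold term out of five is recovered from the supplementary set, so the absolute gain and the relative gain differ by a factor equal to the baseline recall.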

Similar articles

1
Analyzing the Information Content of Text-Based Files in Supplementary Materials of Biomedical Literature.
Stud Health Technol Inform. 2022 May 25;294:876-877. doi: 10.3233/SHTI220614.
2
FullMeSH: improving large-scale MeSH indexing with full text.
Bioinformatics. 2020 Mar 1;36(5):1533-1541. doi: 10.1093/bioinformatics/btz756.
3
BERTMeSH: deep contextual representation learning for large-scale high-performance MeSH indexing with full text.
Bioinformatics. 2021 May 5;37(5):684-692. doi: 10.1093/bioinformatics/btaa837.
4
Meshable: searching PubMed abstracts by utilizing MeSH and MeSH-derived topical terms.
Bioinformatics. 2016 Oct 1;32(19):3044-6. doi: 10.1093/bioinformatics/btw331. Epub 2016 Jun 10.
5
Literature mining of genetic variants for curation: quantifying the importance of supplementary material.
Database (Oxford). 2014 Feb 10;2014:bau003. doi: 10.1093/database/bau003. Print 2014.
6
Mining locus tags in PubMed Central to improve microbial gene annotation.
BMC Bioinformatics. 2014 Feb 5;15:43. doi: 10.1186/1471-2105-15-43.
7
PubTator central: automated concept annotation for biomedical full text articles.
Nucleic Acids Res. 2019 Jul 2;47(W1):W587-W593. doi: 10.1093/nar/gkz389.
8
NCBI disease corpus: a resource for disease name recognition and concept normalization.
J Biomed Inform. 2014 Feb;47:1-10. doi: 10.1016/j.jbi.2013.12.006. Epub 2014 Jan 3.
9
Ontology-based Brucella vaccine literature indexing and systematic analysis of gene-vaccine association network.
BMC Immunol. 2011 Aug 26;12:49. doi: 10.1186/1471-2172-12-49.
10
Text mining facilitates database curation - extraction of mutation-disease associations from Bio-medical literature.
BMC Bioinformatics. 2015 Jun 6;16:185. doi: 10.1186/s12859-015-0609-x.

Cited by

1
Unlocking the potential of PubMed Central supplementary data files.
Bioinform Adv. 2025 Jun 27;5(1):vbaf155. doi: 10.1093/bioadv/vbaf155. eCollection 2025.
2
Tracking genetic variants in the biomedical literature using LitVar 2.0.
Nat Genet. 2023 Jun;55(6):901-903. doi: 10.1038/s41588-023-01414-x.
3
Assessing the use of supplementary materials to improve genomic variant discovery.
Database (Oxford). 2023 Mar 31;2023. doi: 10.1093/database/baad017.
4
COVoc and COVTriage: novel resources to support literature triage.
Bioinformatics. 2023 Jan 1;39(1). doi: 10.1093/bioinformatics/btac800.