Text mining approaches for dealing with the rapidly expanding literature on COVID-19.

Affiliation

The Allen Institute for Artificial Intelligence, Seattle, WA 98112, USA.

Publication information

Brief Bioinform. 2021 Mar 22;22(2):781-799. doi: 10.1093/bib/bbaa296.

DOI:10.1093/bib/bbaa296
PMID:33279995
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7799291/
Abstract

More than 50 000 papers have been published about COVID-19 since the beginning of 2020 and several hundred new papers continue to be published every day. This incredible rate of scientific productivity leads to information overload, making it difficult for researchers, clinicians and public health officials to keep up with the latest findings. Automated text mining techniques for searching, reading and summarizing papers are helpful for addressing information overload. In this review, we describe the many resources that have been introduced to support text mining applications over the COVID-19 literature; specifically, we discuss the corpora, modeling resources, systems and shared tasks that have been introduced for COVID-19. We compile a list of 39 systems that provide functionality such as search, discovery, visualization and summarization over the COVID-19 literature. For each system, we provide a qualitative description and assessment of the system's performance, unique data or user interface features and modeling decisions. Many systems focus on search and discovery, though several systems provide novel features, such as the ability to summarize findings over multiple documents or linking between scientific articles and clinical trials. We also describe the public corpora, models and shared tasks that have been introduced to help reduce repeated effort among community members; some of these resources (especially shared tasks) can provide a basis for comparing the performance of different systems. Finally, we summarize promising results and open challenges for text mining the COVID-19 literature.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b359/7986598/1fe7550107b4/bbaa296f2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b359/7986598/baad4c3d29ed/bbaa296f1.jpg
