Extracting experimental parameter entities from scientific articles.

Authors

Farnsworth Steele, Gurdin Gabrielle, Vargas Jorge, Mulyar Andriy, Lewinski Nastassja, McInnes Bridget T

Affiliation

Virginia Commonwealth University, 401 S. Main St., Richmond, VA 23284, USA.

Publication information

J Biomed Inform. 2022 Feb;126:103970. doi: 10.1016/j.jbi.2021.103970. Epub 2021 Dec 14.

DOI:10.1016/j.jbi.2021.103970

PMID:34920128

Abstract

Systematic reviews are labor-intensive processes to combine all knowledge about a given topic into a coherent summary. Despite the high labor investment, they are necessary to create an exhaustive overview of current evidence relevant to a research question. In this work, we evaluate three state-of-the-art supervised multi-label sequence classification systems to automatically identify 24 different experimental design factors for the categories of Animal, Dose, Exposure, and Endpoint from journal articles describing the experiments related to toxicity and health effects of environmental agents. We then present an in-depth analysis of the results, evaluating the lexical diversity of the design parameters with respect to model performance, evaluating the impact of tokenization and non-contiguous mentions, and finally evaluating the dependencies between entities within the category entities. We demonstrate that, in general, algorithms that use embedded representations of the sequences outperform statistical algorithms, but that even these algorithms struggle with lexically diverse entities.

Similar articles

Extracting experimental parameter entities from scientific articles.

J Biomed Inform. 2022 Feb;126:103970. doi: 10.1016/j.jbi.2021.103970. Epub 2021 Dec 14.

Automatic endpoint detection to support the systematic review process.

J Biomed Inform. 2015 Aug;56:42-56. doi: 10.1016/j.jbi.2015.05.004. Epub 2015 May 21.

Measuring the effect of different types of unsupervised word representations on Medical Named Entity Recognition.

Int J Med Inform. 2019 Sep;129:100-106. doi: 10.1016/j.ijmedinf.2019.05.022. Epub 2019 Jun 5.

Supporting the working life exposome: Annotating occupational exposure for enhanced literature search.

PLoS One. 2024 Aug 15;19(8):e0307844. doi: 10.1371/journal.pone.0307844. eCollection 2024.

Macromolecular crowding: chemistry and physics meet biology (Ascona, Switzerland, 10-14 June 2012).

Phys Biol. 2013 Aug;10(4):040301. doi: 10.1088/1478-3975/10/4/040301. Epub 2013 Aug 2.

Assessment of disease named entity recognition on a corpus of annotated sentences.

BMC Bioinformatics. 2008 Apr 11;9 Suppl 3(Suppl 3):S3. doi: 10.1186/1471-2105-9-S3-S3.

Unsupervised biomedical named entity recognition: experiments with clinical and biological texts.

J Biomed Inform. 2013 Dec;46(6):1088-98. doi: 10.1016/j.jbi.2013.08.004. Epub 2013 Aug 15.

Feature selection techniques for maximum entropy based biomedical named entity recognition.

J Biomed Inform. 2009 Oct;42(5):905-11. doi: 10.1016/j.jbi.2008.12.012. Epub 2009 Jan 23.

Character-level neural network for biomedical named entity recognition.

J Biomed Inform. 2017 Jun;70:85-91. doi: 10.1016/j.jbi.2017.05.002. Epub 2017 May 11.

Application of Supervised Machine Learning to Extract Brain Connectivity Information from Neuroscience Research Articles.

Interdiscip Sci. 2021 Dec;13(4):731-750. doi: 10.1007/s12539-021-00443-6. Epub 2021 Jun 2.

Cited by

Comparing generative and extractive approaches to information extraction from abstracts describing randomized clinical trials.

J Biomed Semantics. 2024 Apr 23;15(1):3. doi: 10.1186/s13326-024-00305-2.

Towards Environment-Aware Fall Risk Assessment: Classifying Walking Surface Conditions Using IMU-Based Gait Data and Deep Learning.

Brain Sci. 2023 Oct 8;13(10):1428. doi: 10.3390/brainsci13101428.
