Extracting experimental parameter entities from scientific articles.

Author information

Farnsworth Steele, Gurdin Gabrielle, Vargas Jorge, Mulyar Andriy, Lewinski Nastassja, McInnes Bridget T

Affiliations

Virginia Commonwealth University, 401 S. Main St., Richmond, VA 23284, USA.

Publication information

J Biomed Inform. 2022 Feb;126:103970. doi: 10.1016/j.jbi.2021.103970. Epub 2021 Dec 14.

Abstract

Systematic reviews are labor-intensive processes that combine all knowledge about a given topic into a coherent summary. Despite the high labor investment, they are necessary to create an exhaustive overview of current evidence relevant to a research question. In this work, we evaluate three state-of-the-art supervised multi-label sequence classification systems to automatically identify 24 different experimental design factors for the categories of Animal, Dose, Exposure, and Endpoint from journal articles describing experiments related to the toxicity and health effects of environmental agents. We then present an in-depth analysis of the results, evaluating the lexical diversity of the design parameters with respect to model performance, the impact of tokenization and non-contiguous mentions, and finally the dependencies between entities within each category. We demonstrate that, in general, algorithms that use embedded representations of the sequences outperform statistical algorithms, but that even these algorithms struggle with lexically diverse entities.
