What the papers say: text mining for genomics and systems biology.

Author affiliation

Division of Molecular Biosciences, Centre for Bioinformatics, Imperial College London, 303, Wolfson Building, South Kensington Campus, London, SW7 2AZ, UK.

Publication information

Hum Genomics. 2010 Oct;5(1):17-29. doi: 10.1186/1479-7364-5-1-17.

Abstract

Keeping up with the rapidly growing literature has become virtually impossible for most scientists. This can have dire consequences. First, we may waste research time and resources on reinventing the wheel simply because we can no longer maintain a reliable grasp on the published literature. Second, and perhaps more detrimental, judicious (or serendipitous) combination of knowledge from different scientific disciplines, which would require following disparate and distinct research literatures, is rapidly becoming impossible for even the most ardent readers of research publications. Text mining, the automated extraction of information from (electronically) published sources, could potentially fulfil an important role, but only if we know how to harness its strengths and overcome its weaknesses. As we do not expect that the rate at which scientific results are published will decrease, text mining tools are now becoming essential in order to cope with, and derive maximum benefit from, this information explosion. In genomics, this is particularly pressing as more and more rare disease-causing variants are found and need to be understood. Not being conversant with this technology may put scientists and biomedical regulators at a severe disadvantage. In this review, we introduce the basic concepts underlying modern text mining and its applications in genomics and systems biology. We hope that this review will serve three purposes: (i) to provide a timely and useful overview of the current status of this field, including a survey of present challenges; (ii) to enable researchers to decide how and when to apply text mining tools in their own research; and (iii) to highlight how the research communities in genomics and systems biology can help to make text mining from biomedical abstracts and texts more straightforward.
