
Chapter 16: text mining for translational bioinformatics.

Affiliation

Computational Bioscience Program, University of Colorado School of Medicine, Aurora, Colorado, USA.

Publication information

PLoS Comput Biol. 2013 Apr;9(4):e1003044. doi: 10.1371/journal.pcbi.1003044. Epub 2013 Apr 25.

Abstract

Text mining for translational bioinformatics is a new field with tremendous research potential. It is a subfield of biomedical natural language processing that concerns itself directly with the problem of relating basic biomedical research to clinical practice, and vice versa. Applications of text mining fall into both T1 translational research (translating basic science results into new interventions) and T2 translational research, or translational research for public health. Potential use cases include better phenotyping of research subjects and pharmacogenomic research. Several methods exist for evaluating text mining applications, including corpora, structured test suites, and post hoc judging. Two basic principles of linguistic structure are relevant for building text mining applications. One is that linguistic structure consists of multiple levels. The other is that every level of linguistic structure is characterized by ambiguity. There are two basic approaches to text mining: rule-based, also known as knowledge-based; and machine-learning-based, also known as statistical. Many systems are hybrids of the two approaches. Shared tasks have had a strong effect on the direction of the field. Like all translational bioinformatics software, text mining software for translational bioinformatics can be considered health-critical and should be subject to the strictest standards of quality assurance and software testing.

Similar articles

4
A survey on annotation tools for the biomedical literature.
Brief Bioinform. 2014 Mar;15(2):327-40. doi: 10.1093/bib/bbs084. Epub 2012 Dec 18.
5
A Guide to Dictionary-Based Text Mining.
Methods Mol Biol. 2019;1939:73-89. doi: 10.1007/978-1-4939-9089-4_5.
6
Survey of Natural Language Processing Techniques in Bioinformatics.
Comput Math Methods Med. 2015;2015:674296. doi: 10.1155/2015/674296. Epub 2015 Oct 7.
9
Introducing Machine Learning Concepts with WEKA.
Methods Mol Biol. 2016;1418:353-78. doi: 10.1007/978-1-4939-3578-9_17.
10
Biomarker identification using text mining.
Comput Math Methods Med. 2012;2012:135780. doi: 10.1155/2012/135780. Epub 2012 Nov 11.
