Renganathan Vinaitheerthan
Head of Institutional Research, Skyline University College, Sharjah, UAE.
Healthc Inform Res. 2017 Jul;23(3):141-146. doi: 10.4258/hir.2017.23.3.141. Epub 2017 Jul 31.
With the number of articles published each year in the biomedical domain increasing exponentially, automated systems are needed to extract previously unknown information from them. Text mining techniques enable the extraction of such knowledge from unstructured documents.
This paper reviews text mining processes in detail and the software tools available to carry out text mining. It also reviews the roles and applications of text mining in the biomedical domain.
Text mining processes, such as search and retrieval of documents, pre-processing of documents, natural language processing, methods for text clustering, and methods for text classification, are described in detail.
Text mining techniques can facilitate the mining of vast amounts of knowledge on a given topic from published biomedical research articles and make it possible to draw meaningful conclusions that could not be reached otherwise.
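As a rough illustration of the pre-processing and document-comparison steps named above, the following is a minimal sketch using only the Python standard library. The stop-word list, the function names (`preprocess`, `tfidf_vectors`, `cosine`), and the three sample sentences are all assumptions made for this example and do not come from the paper. TF-IDF weighting and cosine similarity are used here as one common choice for comparing documents before clustering or classification, not as the method the review prescribes.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (an assumption for this sketch);
# real pipelines use much longer lists.
STOP_WORDS = {"a", "an", "and", "in", "is", "of", "the", "to", "with"}


def preprocess(text):
    """Lowercase the text, split it into alphabetic tokens, drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def tfidf_vectors(docs):
    """Return one {term: weight} dictionary per document."""
    tokenized = [preprocess(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({
            # Relative term frequency times a smoothed inverse document frequency.
            term: (count / len(tokens)) * (math.log(n_docs / df[term]) + 1.0)
            for term, count in tf.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dictionaries."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Made-up sentences standing in for retrieved abstracts.
docs = [
    "Aspirin reduces the risk of heart attack",
    "Low-dose aspirin and heart attack prevention",
    "Gene expression in breast cancer tumors",
]
vectors = tfidf_vectors(docs)
sim_01 = cosine(vectors[0], vectors[1])  # the two aspirin sentences
sim_02 = cosine(vectors[0], vectors[2])  # aspirin vs. gene expression
print(f"doc0 vs doc1: {sim_01:.3f}")
print(f"doc0 vs doc2: {sim_02:.3f}")
```

The first two sentences share several terms while the third shares none with the first, so a clustering step built on this similarity would tend to group the first two together. A production pipeline would typically add stemming or lemmatization and domain-specific vocabularies, which this sketch omits.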