Fluck Juliane, Hofmann-Apitius Martin
Fraunhofer Institute for Algorithms and Scientific Computing, Schloss Birlinghoven, 53754 Sankt Augustin, Germany.
Fraunhofer Institute for Algorithms and Scientific Computing, Schloss Birlinghoven, 53754 Sankt Augustin, Germany; Bonn-Aachen International Center for Information Technology (B-IT), Dahlmannstraße 2, 53113 Bonn, Germany.
Drug Discov Today. 2014 Feb;19(2):140-4. doi: 10.1016/j.drudis.2013.09.012. Epub 2013 Sep 23.
Scientific communication in biomedicine is, by and large, still text based. Text mining technologies for the automated extraction of useful biomedical information from unstructured text that can be directly used for systems biology modelling have been substantially improved over the past few years. In this review, we underline the importance of named entity recognition and relationship extraction as fundamental approaches that are relevant to systems biology. Furthermore, we emphasize the role of publicly organized scientific benchmarking challenges that reflect the current status of text-mining technology and are important in moving the entire field forward. Given further interdisciplinary development of systems biology-orientated ontologies and training corpora, we expect a steadily increasing impact of text-mining technology on systems biology in the future.