National Centre for Text Mining, School of Computer Science, University of Manchester, Manchester M1 7DN, UK.
Bioinformatics. 2018 Apr 15;34(8):1389-1397. doi: 10.1093/bioinformatics/btx774.
Pathway models are valuable resources that help us understand the various mechanisms underpinning complex biological processes. Their curation is typically carried out through manual inspection of published scientific literature to find information relevant to a model, which is a laborious and knowledge-intensive task. Furthermore, models curated manually cannot be easily updated and maintained with new evidence extracted from the literature without automated support.
We have developed LitPathExplorer, a visual text analytics tool that integrates advanced text mining, semi-supervised learning and interactive visualization to facilitate the exploration and analysis of pathway models, using statements (i.e. events) extracted automatically from the literature and organized according to levels of confidence. LitPathExplorer supports pathway modellers and curators alike by: (i) extracting events from the literature that corroborate existing models with evidence; (ii) discovering new events that can update models; and (iii) providing a confidence value for each event, computed automatically from linguistic features and article metadata. Our evaluation of event extraction showed a precision of 89% and a recall of 71%. Evaluation of our confidence measure, when used for ranking sampled events, showed an average precision ranging from 61% to 73%, which can be improved to 95% when the user is involved in the semi-supervised learning process. A qualitative evaluation using pair analytics, based on the feedback of three domain experts, confirmed the utility of our tool within the context of pathway model exploration.
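For readers less familiar with the metrics reported above, the following is a minimal sketch of how precision, recall and average precision over a ranked list are conventionally computed. It is not the evaluation code used for LitPathExplorer; the function names `precision_recall` and `average_precision`, and all counts and labels in the usage lines, are hypothetical and chosen only for illustration.

```python
from typing import Sequence, Tuple


def precision_recall(tp: int, fp: int, fn: int) -> Tuple[float, float]:
    """Return (precision, recall) from true-positive, false-positive
    and false-negative counts of extracted events."""
    predicted = tp + fp  # events the extractor produced
    actual = tp + fn     # events present in the gold standard
    precision = tp / predicted if predicted else 0.0
    recall = tp / actual if actual else 0.0
    return precision, recall


def average_precision(ranked_correct: Sequence[bool]) -> float:
    """Average precision of a ranked list of events.

    ranked_correct[i] is True if the event at rank i + 1 (ordered by
    descending confidence) was judged correct. Precision is taken at
    each rank holding a correct event, then averaged over those ranks.
    """
    hits = 0
    precision_sum = 0.0
    for rank, is_correct in enumerate(ranked_correct, start=1):
        if is_correct:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


# Illustrative (made-up) counts: 8 correct extractions, 2 spurious, 4 missed.
p, r = precision_recall(tp=8, fp=2, fn=4)

# Illustrative (made-up) ranking of five events by confidence.
ap = average_precision([True, False, True, True, False])
```

A confidence measure that ranks correct events above incorrect ones pushes the average precision towards 1.0, which is why the metric suits an evaluation of event ranking.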
LitPathExplorer is available at http://nactem.ac.uk/LitPathExplorer_BI/.
Contact: sophia.ananiadou@manchester.ac.uk.
Supplementary data are available at Bioinformatics online.