Ahmed Yasmine, Telmer Cheryl A, Miskov-Zivanov Natasa
Electrical and Computer Engineering Department, University of Pittsburgh, Pittsburgh, PA 15213, USA.
Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, PA 15213, USA.
Bioinform Adv. 2021 Jun 3;1(1):vbab006. doi: 10.1093/bioadv/vbab006. eCollection 2021.
Creating or extending computational models of complex systems, such as intra- and intercellular biological networks, is a time- and labor-intensive task, often limited by the knowledge and experience of modelers. Automating this process would enable rapid, consistent, comprehensive and robust analysis and understanding of complex systems.
In this work, we present CLARINET (CLARIfying NETworks), a novel methodology and tool for automatically expanding models using information extracted from the literature by machine reading. CLARINET creates collaboration graphs from the extracted events and uses several novel metrics to evaluate these events individually, in pairs and in groups. These metrics are based on the frequency of occurrence and co-occurrence of events in the literature and on their connectivity to the baseline model. We tested how well CLARINET can reproduce manually built and curated models when provided with varying amounts of information in the baseline model and in the machine reading output. Our results show that CLARINET can recover all relevant interactions that are present in the reading output, and that it automatically reconstructs manually built models with an average recall of 80% and an average precision of 70%. CLARINET is highly scalable: its average runtime is on the order of ten seconds when processing several thousand interactions, outperforming other similar methods.
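To make the ideas of occurrence and co-occurrence frequency and of precision and recall concrete, the following is a minimal illustrative sketch. It is not CLARINET's implementation and does not reproduce its metrics. The function names, the representation of an event as a (regulator, target) pair, and the toy data are all assumptions made for illustration only.

```python
from collections import Counter
from itertools import combinations


def build_collaboration_counts(papers):
    """Count event occurrence and pairwise co-occurrence across papers.

    papers: dict mapping a paper id to an iterable of events, where each
    event is a hypothetical (regulator, target) tuple.
    Returns (freq, cooc): freq[event] is the number of papers mentioning
    the event; cooc[(e1, e2)] is the number of papers mentioning both.
    """
    freq = Counter()
    cooc = Counter()
    for events in papers.values():
        unique = sorted(set(events))  # count each event once per paper
        freq.update(unique)
        for pair in combinations(unique, 2):  # pairs in sorted order
            cooc[pair] += 1
    return freq, cooc


def precision_recall(selected, gold):
    """Precision and recall of selected events against a reference model."""
    selected, gold = set(selected), set(gold)
    true_positives = len(selected & gold)
    precision = true_positives / len(selected) if selected else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Toy data (invented for illustration, not taken from the paper).
papers = {
    "p1": [("A", "B"), ("B", "C")],
    "p2": [("A", "B"), ("B", "C"), ("C", "D")],
    "p3": [("A", "B")],
}
freq, cooc = build_collaboration_counts(papers)

selected = [("A", "B"), ("B", "C"), ("X", "Y")]
gold = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E")]
precision, recall = precision_recall(selected, gold)
```

In this sketch, the co-occurrence counts play the role of edge weights in a graph whose nodes are events. Any selection rule applied on top of them, such as thresholding or clustering, would be a further design choice that the abstract does not specify.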
The data underlying this article are available in Bitbucket at https://bitbucket.org/biodesignlab/clarinet/src/master/.
Supplementary data are available at Bioinformatics Advances online.