Dorr Ricardo A, Silberstein Claudia, Ibarra Cristina, Toriano Roxana
Universidad de Buenos Aires, CONICET, Instituto de Fisiología y Biofísica Bernardo Houssay (IFIBIO Houssay), Laboratorio de Biomembranas, Buenos Aires, Argentina.
Universidad de Buenos Aires, CONICET, IFIBIO Houssay, Laboratorio de Investigaciones en Fisiología Renal, Facultad de Medicina, Buenos Aires, Argentina.
Medicina (B Aires). 2022;82(4):513-524.
Hemolytic uremic syndrome (HUS) is characterized by thrombotic microangiopathy, hemolytic anemia, thrombocytopenia and acute renal failure. Its outcomes range from permanent sequelae to death, mainly in children. In this work, using text mining (TM), we analyzed the explicit and implicit text of 16 192 original scientific articles on HUS indexed in the Europe PMC database. The objectives were to examine behaviors, track trends, make predictions, and cross-check data against other sources of information. For the analysis we used, among other computational tools, specially developed workflows (WF) in the KNIME platform. Applying TM to the words of the publication abstracts made it possible to detect previously undescribed associations between HUS-related events, extract underlying information, perform thematic clustering using unsupervised algorithms, and forecast the course of research on the topic. Both the approach and the WFs developed to perform Data Science on HUS can be applied to other biomedical topics and other scientific databases, making it possible to analyze relevant aspects of human health in order to improve the research, prevention and treatment of multiple diseases.