Dorr Ricardo A, Casal Juan José, Toriano Roxana
Laboratorio de Biomembranas, Instituto de Fisiología y Biofísica Bernardo Houssay (IFIBIO Houssay), Facultad de Medicina, Universidad de Buenos Aires-CONICET, Buenos Aires, Argentina.
Medicina (B Aires). 2021;81(2):214-223.
In this work we use text mining to process a large scientific database, with the aim of obtaining new information about all life sciences publications signed by Argentine authors and indexed up to 2019. More than 75 000 articles were analysed, published in around 5000 journals and signed by about 186 000 authors who either work in Argentina or collaborate with Argentine laboratories. Using automated tools developed ad hoc, we analysed the text of around 70 800 abstracts, applying unsupervised automated detection to identify the main topics addressed by the authors and their relationship with health problems in Argentina and their treatment. Results are also presented on the number of publications per year, the journals that published them, and their authors and collaborations. These results, together with the predictions obtained, could become a useful tool for optimizing the management of resources dedicated to basic and clinical research.