Godavarthi Deepthi, A Mary Sowjanya
Dept. of CSSE, Andhra University College of Engineering (A), Visakhapatnam, AP, India.
Mater Today Proc. 2021 Feb 28. doi: 10.1016/j.matpr.2021.01.480.
The COVID-19 pandemic has placed the entire world in a precarious condition. It was first a serious problem in China and is now being experienced by people all over the world. Scientists are working hard to find treatments and vaccines for the coronavirus. As the literature grows, finding answers to questions related to COVID-19 has become a major challenge for the medical community. We propose a machine learning-based system that applies NLP text classification to extract information from the scientific literature. Classifying large volumes of text makes searching easier and is therefore useful to scientists. The main aim of our system is to classify COVID-19-related abstracts by their respective journals, so that a researcher can consult articles of interest from the required journals instead of searching through all the articles. In this paper, we describe the methodology needed to build such a system. We run experiments on the COVID-19 Open Research Dataset and evaluate performance using classifiers such as KNN and MLP. An explainer was also built using XGBoost to show the model predictions.
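The abstract does not state how the abstracts are turned into features, so the following is only a minimal sketch of the general idea of assigning an abstract to a journal with a KNN classifier. It assumes TF-IDF features and cosine similarity, which the source does not confirm. The toy abstracts, the labels "Journal A" and "Journal B", and all function names are hypothetical and written for illustration; they are not taken from the COVID-19 Open Research Dataset or from the authors' system.

```python
import math
import re
from collections import Counter

# Hypothetical toy data: (abstract text, journal label). Not real dataset records.
TRAIN = [
    ("Spike protein binding to the ACE2 receptor and viral entry into host cells", "Journal A"),
    ("Viral genome sequencing reveals mutations in the spike protein", "Journal A"),
    ("Replication of the viral genome in infected host cells", "Journal A"),
    ("Lockdown policy and contact tracing reduce community transmission", "Journal B"),
    ("Mask mandates, contact tracing and community transmission rates", "Journal B"),
    ("Public health policy response to community outbreaks", "Journal B"),
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def fit_idf(docs):
    """Compute a smoothed inverse document frequency for every training term."""
    n_docs = len(docs)
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(tokenize(doc)))
    return {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}


def tfidf_vector(text, idf):
    """Return an L2-normalised sparse TF-IDF vector (dict of term -> weight)."""
    counts = Counter(t for t in tokenize(text) if t in idf)
    vec = {t: c * idf[t] for t, c in counts.items()}
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else {}


def cosine(a, b):
    """Cosine similarity of two already-normalised sparse vectors."""
    return sum(w * b.get(t, 0.0) for t, w in a.items())


def knn_predict(text, train_vecs, train_labels, idf, k=3):
    """Predict the journal by majority vote among the k most similar abstracts."""
    query = tfidf_vector(text, idf)
    ranked = sorted(
        zip(train_vecs, train_labels),
        key=lambda pair: cosine(query, pair[0]),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


texts = [text for text, _ in TRAIN]
labels = [label for _, label in TRAIN]
idf = fit_idf(texts)
train_vecs = [tfidf_vector(text, idf) for text in texts]

pred_a = knn_predict("Mutations in the spike protein alter binding to the receptor",
                     train_vecs, labels, idf)
pred_b = knn_predict("Contact tracing and lockdown policy during community outbreaks",
                     train_vecs, labels, idf)
print(pred_a, pred_b)
```

In practice a library implementation of the vectoriser and classifier would be the natural choice on a dataset of this size; the hand-written version here only keeps each step of the nearest-neighbour vote visible.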