Key Laboratory of Advanced Process Control for Light Industry, Ministry of Education, Jiangnan University, Wuxi 214122, China.
Soil and Water Conservation Department, Yangtze River Scientific Research Institute, Wuhan 430010, China.
Sensors (Basel). 2021 Mar 1;21(5):1668. doi: 10.3390/s21051668.
Traditional co-word networks do not discriminate keywords of researcher interest from general keywords. Co-word networks are therefore often too general to provide knowledge of interest to domain experts. Inspired by recent work that uses an automatic method to identify questions of interest to researchers, such as "problems" and "solutions", we try to answer a similar question, "what sensors can be used for what kind of applications", which is of great interest in sensor-related fields. By generalizing the specific questions as "questions of interest", we built a knowledge network that considers researcher interest, called the bipartite network of interest (BNOI). Unlike co-word approaches, which use accurate keywords from a list, BNOI uses classification models to find possible entities of interest. A total of nine feature extraction methods, including N-grams, Word2Vec, BERT, etc., were used to extract features to train the classification models, including naïve Bayes (NB), support vector machines (SVM) and logistic regression (LR). In addition, a multi-feature fusion strategy and a voting principle (VP) method are applied to combine the capabilities of the features and the classification models. Using the abstract text data of 350 remote sensing articles, features are extracted and the models are trained. The experimental results show that, after removing the biased words and using ten-fold cross-validation, the F-measures of "sensors" and "applications" are 93.2% and 85.5%, respectively. It is thus demonstrated that researcher questions of interest can be better answered by the BNOI constructed from the classification results than by the traditional co-word network approach.
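The abstract does not spell out how the voting principle combines the classifiers or how edges of the BNOI are weighted, so the following is only a minimal sketch under stated assumptions: each candidate term receives one label per classifier, a simple majority vote decides the final label, and a "sensor" term is linked to an "application" term whenever both occur in the same abstract. The function names, the label strings, and the toy data are all hypothetical and are not taken from the paper.

```python
from collections import Counter
from itertools import product


def majority_vote(predictions):
    """Return the label chosen by most classifiers.

    Ties are broken in favour of the label that appears first in
    `predictions` (an assumption; the paper does not state a tie rule).
    """
    counts = Counter(predictions)
    top = max(counts.values())
    for label in predictions:
        if counts[label] == top:
            return label


def build_bnoi(documents):
    """Build a weighted bipartite edge list from per-document term labels.

    `documents` is a list of dicts mapping a candidate term to the list of
    labels predicted for it by the individual classifiers (e.g. NB, SVM, LR).
    Returns a Counter mapping (sensor, application) pairs to the number of
    documents in which both terms occur.
    """
    edges = Counter()
    for doc in documents:
        voted = {term: majority_vote(labels) for term, labels in doc.items()}
        sensors = sorted(t for t, lab in voted.items() if lab == "sensor")
        applications = sorted(t for t, lab in voted.items() if lab == "application")
        # Connect every sensor to every application found in the same abstract.
        edges.update(product(sensors, applications))
    return edges


# Hypothetical toy input: two abstracts, three classifier votes per term.
toy_docs = [
    {
        "lidar": ["sensor", "sensor", "other"],
        "forest mapping": ["application", "application", "application"],
        "method": ["other", "other", "sensor"],
    },
    {
        "lidar": ["sensor", "sensor", "sensor"],
        "flood monitoring": ["application", "other", "application"],
    },
]

edges = build_bnoi(toy_docs)
for (sensor, application), weight in sorted(edges.items()):
    print(f"{sensor} -> {application} (weight {weight})")
```

In this toy run, "method" is voted "other" and is therefore excluded from the network, which mirrors the stated goal of separating entities of interest from general keywords.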