Wu Jingqiao, Feng Xiaoyue, Guan Renchu, Liang Yanchun
Zhuhai Sub Laboratory of Key Laboratory of Symbolic Computation and Knowledge Engineering of the Ministry of Education, Zhuhai College of Jilin University, Zhuhai 519041, China.
Key Laboratory of Symbolic Computation and Knowledge Engineering of the Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun 130012, China.
Entropy (Basel). 2021 Mar 12;23(3):338. doi: 10.3390/e23030338.
Machine learning models can automatically discover biomedical research trends and promote the dissemination of information and knowledge. Text feature representation is a critical and challenging task in natural language processing, and most methods for it are built on word representations. A good representation captures both semantic and structural information. This paper proposes two fusion algorithms, Tr-W2v and Ti-W2v, which build on the classical text feature representation model and additionally take the importance of words into account. The results show that both fusion models are more effective than the classical text representation model, with the Tr-W2v algorithm giving the best results. Building on Tr-W2v, trend analyses of cancer research are then conducted, including correlation analysis, keyword trend analysis, and improved keyword trend analysis. The discovered research trends and the evolution of cancer research hotspots can help doctors and biological researchers collect information and can guide further research.
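The abstract does not spell out how the fusion algorithms combine word importance with the classical representation. As a rough illustration only, the sketch below assumes that a Ti-W2v-style representation is an importance-weighted average of word vectors, with TF-IDF as the importance score. That reading is inferred from the algorithm name and is not confirmed by the text above. The function names, the smoothed idf formula, and the two-dimensional `TOY_EMBEDDINGS` are all hypothetical stand-ins; a real pipeline would use trained Word2vec vectors.

```python
import math
from collections import Counter


def tfidf_weights(docs):
    """Return one {word: tf-idf score} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each word.
    df = Counter(word for doc in docs for word in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append({
            # Assumed smoothed idf; it stays positive even for words
            # that occur in every document.
            w: (count / len(doc)) * (math.log((1 + n_docs) / (1 + df[w])) + 1.0)
            for w, count in tf.items()
        })
    return weights


def weighted_doc_vector(doc, word_weights, embeddings):
    """Importance-weighted average of the word vectors of one document."""
    dim = len(next(iter(embeddings.values())))
    total = [0.0] * dim
    weight_sum = 0.0
    for word in set(doc):
        if word not in embeddings:
            continue  # skip out-of-vocabulary words
        w = word_weights.get(word, 0.0)
        total = [t + w * x for t, x in zip(total, embeddings[word])]
        weight_sum += w
    if weight_sum == 0.0:
        return total  # no known words: return the zero vector
    return [t / weight_sum for t in total]


# Hypothetical toy vectors standing in for trained Word2vec embeddings.
TOY_EMBEDDINGS = {
    "cancer": [1.0, 0.0],
    "gene": [0.0, 1.0],
    "therapy": [1.0, 1.0],
}

docs = [["cancer", "gene", "cancer"], ["cancer", "therapy"]]
weights = tfidf_weights(docs)
doc_vectors = [
    weighted_doc_vector(doc, w, TOY_EMBEDDINGS) for doc, w in zip(docs, weights)
]
```

Under the same assumption, a Tr-W2v-style variant would only swap the scoring step: replace `tfidf_weights` with a function that returns graph-based keyword scores (for example TextRank) and keep `weighted_doc_vector` unchanged.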