

A syndrome differentiation model of TCM based on multi-label deep forest using biomedical text mining.

Author information

Gong Lejun, Jiang Jindou, Chen Shiqi, Qi Mingming

Affiliations

Jiangsu Key Lab of Big Data Security and Intelligent Processing, School of Computer Science, Nanjing University of Posts and Telecommunications, Nanjing, China.

Smart Health Big Data Analysis and Location Services Engineering Lab of Jiangsu Province, Nanjing, China.

Publication information

Front Genet. 2023 Oct 3;14:1272016. doi: 10.3389/fgene.2023.1272016. eCollection 2023.

Abstract

Syndrome differentiation and treatment is the basic principle by which traditional Chinese medicine (TCM) recognizes and treats diseases. Accurate syndrome differentiation provides a reliable basis for treatment, so establishing a scientific, intelligent syndrome differentiation method is of great significance to the modernization of TCM. With the development of biomedical text mining technology, TCM has entered a data-driven era of intelligence, and model training increasingly relies on large-scale labeled data. However, it is difficult to form a large standard data set in the field of TCM because of the low degree of standardization of TCM data collection and the privacy protection of patients' medical records. To address this problem, a multi-label deep forest model based on an improved multi-label ReliefF feature selection algorithm, ML-PRDF, is proposed. It aims to enhance the representativeness of features within the model, express the original information with fewer features, and achieve optimal classification accuracy, while alleviating the high data processing cost of deep forest models and enabling effective TCM discriminative analysis with small samples. The results show that the proposed model outperforms other multi-label classification models on multi-label evaluation criteria and achieves higher accuracy on the TCM syndrome differentiation problem than the traditional multi-label deep forest. A comparative study shows that using the PCC-MLRF algorithm for feature selection selects more representative features.
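The abstract does not spell out how PCC-MLRF works, so the following is only a loose illustration of the general idea of filter-style feature selection for multi-label data, not the authors' algorithm. It assumes that features are ranked by their mean absolute Pearson correlation with the labels and that near-duplicate features are skipped. The function name `pearson_filter_select`, the `redundancy_threshold` parameter, and the toy data are all hypothetical.

```python
import numpy as np


def pearson_filter_select(X, Y, k, redundancy_threshold=0.9):
    """Pick up to k features for multi-label data (illustrative sketch only).

    Assumption, not the paper's PCC-MLRF: relevance is the mean absolute
    Pearson correlation between a feature and each label, and a feature is
    skipped when it is almost collinear with one that is already selected.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    n_samples = X.shape[0]

    def standardise(A):
        # Zero-mean, unit-variance columns; constant columns become all zeros.
        centred = A - A.mean(axis=0)
        std = centred.std(axis=0)
        std[std == 0] = 1.0
        return centred / std

    Xs, Ys = standardise(X), standardise(Y)
    # Pearson r between every feature and every label, averaged over labels.
    relevance = np.abs(Xs.T @ Ys / n_samples).mean(axis=1)
    # Absolute Pearson r between every pair of features.
    feature_corr = np.abs(Xs.T @ Xs / n_samples)

    selected = []
    for j in np.argsort(-relevance):  # most relevant first
        if len(selected) == k:
            break
        if all(feature_corr[j, s] < redundancy_threshold for s in selected):
            selected.append(int(j))
    return selected, relevance


# Toy data: 6 samples, 2 labels, 4 features (all values are made up).
Y = np.array([[1, 0], [1, 0], [0, 1], [0, 1], [1, 1], [0, 0]])
X = np.column_stack([
    Y[:, 0],         # feature 0 tracks label 0
    2.0 * Y[:, 0],   # feature 1 is a rescaled copy of feature 0
    Y[:, 1],         # feature 2 tracks label 1
    np.full(6, 5.0), # feature 3 is constant and carries no information
])
selected, relevance = pearson_filter_select(X, Y, k=2)
print(selected, relevance)
```

On this toy input the constant feature receives zero relevance and only one of the two collinear features can be selected, which is the "fewer but more representative features" behaviour the abstract describes in general terms.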


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/13de/10579813/84418217bfff/fgene-14-1272016-g001.jpg
