School of Pharmaceutical Sciences, Zhejiang Chinese Medical University, Hangzhou 310053, P. R. China.
Am J Chin Med. 2022;50(1):91-131. doi: 10.1142/S0192415X22500045. Epub 2021 Dec 2.
Machine learning (ML), a branch of artificial intelligence, uses diverse algorithms to extract latent, meaningful rules from large volumes of data. Because research on traditional Chinese medicine (TCM) relies on digitized clinical records and experimental results, large and complex datasets have become an integral part of related studies. It is thus not surprising that ML approaches, as novel and efficient tools for mining useful knowledge from data, have made inroads into a variety of areas of TCM over the past decade. However, in surveying the literature, we find that not every ML approach performs well in the same field. On further consideration, we infer that each ML approach may be specific to the fields in which it is applied. This systematic review focuses on four categories of ML approaches and their eight application areas in TCM. By function, ML approaches are classified into four categories (classification, regression, clustering, and dimensionality reduction) and, in more detail, into 14 models: support vector machine, least squares support vector machine, logistic regression, partial least squares regression, k-means clustering, hierarchical cluster analysis, artificial neural network, backpropagation neural network, convolutional neural network, decision tree, random forest, principal component analysis, partial least squares-discriminant analysis, and orthogonal partial least squares-discriminant analysis. The eight common application areas fall into two groups: one for TCM itself, namely the diagnosis of diseases, the determination of syndromes, and the analysis of prescriptions; and the other for research on Chinese herbal medicine, namely quality control, the identification of geographic origins, the pharmacodynamic material basis, medicinal properties, and pharmacokinetics and pharmacodynamics.
Additionally, by comparing their principles, this paper discusses how ML approaches differ in function and characteristics when applied to the corresponding fields. The specificity of each approach to its application areas is also affirmed, thereby laying a foundation for subsequent studies that apply ML approaches to TCM.