Department of Mathematics and Computer Science, University of Ferrara, Italy.
Artif Intell Med. 2023 Mar;137:102486. doi: 10.1016/j.artmed.2022.102486. Epub 2023 Feb 4.
Symbolic learning is the logic-based approach to machine learning; its aim is to provide algorithms and methodologies that extract logical information from data and express it in an interpretable way. Interval temporal logic has recently been proposed as a suitable tool for symbolic learning, specifically through the design of an algorithm that extracts interval temporal logic decision trees. To improve their performance, interval temporal decision trees can be embedded into interval temporal random forests, mirroring the corresponding schema at the propositional level. In this article we consider a dataset of cough and breath recordings of volunteer subjects, labeled with their COVID-19 status, originally collected by the University of Cambridge. By interpreting these recordings as multivariate time series, we study the problem of classifying them automatically using interval temporal decision trees and forests. This problem has been approached before, both with the same dataset and with other datasets, but in all cases non-symbolic learning methods (usually based on deep learning) were applied. In this article we apply a symbolic approach and show that it not only outperforms the state of the art obtained on the same dataset, but also yields results superior to those of most non-symbolic techniques applied to other datasets. In addition, thanks to the symbolic nature of our approach, we are able to extract explicit knowledge that can help physicians characterize typical COVID-positive cough and breath.
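To give a rough idea of what an interval-based test on a multivariate time series can look like, the following Python sketch checks whether some interval of a series satisfies a condition on one variable, and uses that check as a single decision node. It is an illustrative simplification under stated assumptions, not the algorithm used in the article: the function names, the aggregate, the threshold, and the toy data are all hypothetical, and the relational operators of interval temporal logic are reduced here to one existential "some interval" test.

```python
from typing import Callable, Sequence

# series[v][t] is the value of variable v at time point t (hypothetical layout).
Series = Sequence[Sequence[float]]


def exists_interval(
    series: Series,
    variable: int,
    aggregate: Callable[[Sequence[float]], float],
    threshold: float,
    min_len: int = 1,
) -> bool:
    """Return True if some interval of at least `min_len` consecutive points
    of the chosen variable has aggregate(values) >= threshold."""
    values = series[variable]
    n = len(values)
    for start in range(n):
        for end in range(start + min_len, n + 1):
            if aggregate(values[start:end]) >= threshold:
                return True
    return False


def toy_stump(series: Series) -> str:
    """A one-node 'tree': the rule and its parameters are made up for illustration."""
    # Rule: variable 0 stays at or above 0.5 over some interval of 3+ points.
    if exists_interval(series, variable=0, aggregate=min, threshold=0.5, min_len=3):
        return "positive"
    return "negative"


# Toy series with a single variable each; the values are invented, not real recordings.
sustained = [[0.1, 0.6, 0.7, 0.8, 0.2]]
spiky = [[0.1, 0.6, 0.2, 0.8, 0.2]]

print(toy_stump(sustained))
print(toy_stump(spiky))
```

The rule at the node reads as a sentence ("there is an interval of at least three points over which the first variable never drops below 0.5"), which is the kind of interpretability a symbolic model is meant to offer. An actual learner would choose the variable, aggregate, and threshold from the data rather than fix them by hand.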