Dang Lihong, Li Jian, Bai Xue, Liu Mingfeng, Li Na, Ren Kang, Cao Jie, Du Qiuxiang, Sun Junhong
School of Forensic Medicine, Shanxi Medical University, 98 University Street, Yuci District, Jinzhong 030604, China.
Diagnostics (Basel). 2023 Jan 20;13(3):395. doi: 10.3390/diagnostics13030395.
(1) Background: Accurate estimation of wound age is crucial for investigating violent cases in forensic practice. However, effective biomarkers and prediction methods are lacking. (2) Methods: Samples were collected from rats divided randomly into control and contusion groups at 0, 4, 8, 12, 16, 20, and 24 h post-injury. The features of interest were the expression levels of nine mRNAs. Internal validation data were used to train five machine learning algorithms, namely random forest (RF), support vector machine (SVM), multilayer perceptron (MLP), gradient boosting (GB), and stochastic gradient descent (SGD), to predict wound age. These models served as base learners and were then used to build 26 stacking ensemble models, each combining two, three, four, or five base learners. The best-performing stacking model and the best-performing base learner were evaluated on external validation data. (3) Results: The best results were obtained with a stacking model of RF + SVM + MLP (accuracy = 92.85%, area under the receiver operating characteristic curve (AUROC) = 0.93, root-mean-square error (RMSE) = 1.06 h). The wound age prediction performance of the stacking models was also confirmed on another independent dataset. (4) Conclusions: We show that machine learning techniques, especially ensemble algorithms, have high potential for predicting wound age. The results indicate that the strategy can be applied to other types of forensic prediction.
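The count of 26 stacking models follows from choosing every subset of two or more of the five base learners: C(5,2) + C(5,3) + C(5,4) + C(5,5) = 10 + 10 + 5 + 1 = 26. The short Python sketch below enumerates those subsets to check the arithmetic. It is only an illustration of the counting, not the authors' code; the function name `stacking_candidates` and the variable names are assumptions introduced here, and no model is trained.

```python
from itertools import combinations

# Abbreviations of the five base learners named in the abstract.
BASE_LEARNERS = ["RF", "SVM", "MLP", "GB", "SGD"]


def stacking_candidates(learners, min_size=2):
    """Return every subset of `learners` with at least `min_size` members.

    Hypothetical helper: it only lists candidate combinations of
    base learners; it does not build or fit any stacking model.
    """
    return [
        combo
        for size in range(min_size, len(learners) + 1)
        for combo in combinations(learners, size)
    ]


candidates = stacking_candidates(BASE_LEARNERS)

# Tally how many candidate ensembles use 2, 3, 4, or 5 base learners.
counts_by_size = {}
for combo in candidates:
    counts_by_size[len(combo)] = counts_by_size.get(len(combo), 0) + 1

print(len(candidates))   # 26
print(counts_by_size)    # {2: 10, 3: 10, 4: 5, 5: 1}
```

The best-performing combination reported in the abstract, RF + SVM + MLP, appears in this list as the tuple `("RF", "SVM", "MLP")`.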