Environmental Health and Ecological Science Department, Ifakara Health Institute, P.O. Box 53, Ifakara, Tanzania.
School of Life Science and Bioengineering, The Nelson Mandela African Institute of Science and Technology, P.O. Box 447, Arusha, Tanzania.
Parasit Vectors. 2022 Aug 6;15(1):281. doi: 10.1186/s13071-022-05396-3.
Monitoring the biological attributes of mosquitoes is critical for understanding pathogen transmission and estimating the impacts of vector control interventions on the survival of vector species. Infrared spectroscopy and machine learning techniques are increasingly being tested for this purpose and have been shown to accurately predict the age, species, blood-meal sources, and pathogen infections of Anopheles and Aedes mosquitoes. However, because these techniques are still at an early stage of implementation, there are no standardized procedures for handling samples before infrared scanning. This study investigated the effects of different preservation methods and storage durations on the performance of mid-infrared spectroscopy for age-grading females of the malaria vector, Anopheles arabiensis.
Laboratory-reared An. arabiensis (N = 3681) were collected at 5 and 17 days post-emergence, killed with ethanol, and then preserved using silica desiccant at 5 °C, freezing at −20 °C, or absolute ethanol at room temperature. For each preservation method, the mosquitoes were divided into three groups, stored for 1, 4, or 8 weeks, and then scanned using a mid-infrared spectrometer. Supervised machine learning classifiers were trained with the infrared spectra, and the support vector machine (SVM) emerged as the best model for predicting the mosquito ages.
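The factorial design described above (two ages, three preservation methods, three storage durations) can be enumerated in a few lines. This is only an illustrative sketch: the function names `design_cells` and `train_test_pairs` are hypothetical, and nothing here reflects how the authors organized their data or how many mosquitoes fell into each group.

```python
from itertools import product

# Factor levels as stated in the abstract.
AGES_DAYS = (5, 17)
PRESERVATION = (
    "silica desiccant, 5 °C",
    "freezing, -20 °C",
    "absolute ethanol, room temperature",
)
STORAGE_WEEKS = (1, 4, 8)


def design_cells():
    """Every (age, preservation, storage) combination in the design."""
    return list(product(AGES_DAYS, PRESERVATION, STORAGE_WEEKS))


def train_test_pairs(levels):
    """Every (train condition, test condition) pairing, matched or not."""
    return list(product(levels, repeat=2))


if __name__ == "__main__":
    print(len(design_cells()))                    # number of design cells
    for train, test in train_test_pairs(PRESERVATION):
        tag = "matched" if train == test else "cross"
        print(f"{tag:8s} train: {train} | test: {test}")
```

Listing the pairings makes explicit that each trained model can be scored once on its own condition and once on each of the other conditions.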
The model trained using silica-preserved mosquitoes achieved 95% accuracy when predicting the ages of other silica-preserved mosquitoes, but its accuracy declined to 72% and 66% when age-classifying mosquitoes preserved using ethanol and freezing, respectively. Prediction accuracies of models trained on samples preserved in ethanol or by freezing also decreased when these models were applied to samples preserved by other methods. Similarly, models trained on 1-week stored samples had declining accuracies of 97%, 83%, and 72% when predicting the ages of mosquitoes stored for 1, 4, or 8 weeks, respectively.
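The cross-condition evaluation reported above can be illustrated with a toy example. The sketch below is not the authors' pipeline: it uses a nearest-centroid classifier as a simple stand-in for the SVM, and the "spectra" are synthetic three-feature vectors in which each preservation method adds an invented offset (`METHOD_SHIFT`). All names and numbers in the code are assumptions chosen only to show why a model trained under one handling condition can lose accuracy under another.

```python
import random

AGES = (5, 17)  # days post-emergence, as in the study
METHODS = ("silica", "ethanol", "freezing")

# Invented per-method offsets on a toy 3-feature "spectrum" (assumption).
METHOD_SHIFT = {
    "silica": (0.0, 0.0, 0.0),
    "ethanol": (1.5, 0.0, 0.0),
    "freezing": (-1.5, 0.0, 0.0),
}


def make_samples(method, n_per_age, rng):
    """Synthetic (features, age) pairs for one preservation method."""
    samples = []
    for age in AGES:
        # Age separates the classes along the first feature only.
        base = (0.0, 0.0, 0.0) if age == 5 else (2.0, 0.0, 0.0)
        for _ in range(n_per_age):
            x = tuple(
                b + s + rng.gauss(0.0, 0.5)
                for b, s in zip(base, METHOD_SHIFT[method])
            )
            samples.append((x, age))
    return samples


def fit_centroids(samples):
    """Mean feature vector per age class (nearest-centroid 'training')."""
    sums, counts = {}, {}
    for x, age in samples:
        acc = sums.setdefault(age, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[age] = counts.get(age, 0) + 1
    return {age: [v / counts[age] for v in acc] for age, acc in sums.items()}


def predict(centroids, x):
    """Assign the age whose centroid is closest in squared Euclidean distance."""
    return min(
        centroids,
        key=lambda age: sum((a - b) ** 2 for a, b in zip(x, centroids[age])),
    )


def accuracy(centroids, samples):
    hits = sum(predict(centroids, x) == age for x, age in samples)
    return hits / len(samples)


def cross_condition_matrix(seed=0, n_per_age=200):
    """Accuracy of each train-method model on held-out data from every method."""
    rng = random.Random(seed)
    train = {m: make_samples(m, n_per_age, rng) for m in METHODS}
    test = {m: make_samples(m, n_per_age, rng) for m in METHODS}
    matrix = {}
    for train_method in METHODS:
        model = fit_centroids(train[train_method])
        matrix[train_method] = {m: accuracy(model, test[m]) for m in METHODS}
    return matrix


if __name__ == "__main__":
    for train_method, row in cross_condition_matrix().items():
        print(train_method, {m: round(a, 2) for m, a in row.items()})
```

In this toy setting the matched train/test cells score highest because the decision boundary learned under one offset is misplaced under another. The real spectral differences between preservation methods are, of course, more complex than a constant shift.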
When using mid-infrared spectroscopy and supervised machine learning to age-grade mosquitoes, the highest accuracies are achieved when the training and test samples are preserved in the same way and stored for similar durations. When the test and training samples are handled differently, classification accuracies decline significantly. Protocols for infrared-based entomological studies should therefore emphasize standardized sample-handling procedures and, possibly, additional statistical procedures such as transfer learning for greater accuracy.