Department of Plant Agriculture, Gosling Research Institute for Plant Preservation, University of Guelph, Guelph, ON, Canada.
Department of Botany, University of British Columbia, Vancouver, BC, Canada.
PLoS One. 2020 Sep 30;15(9):e0239901. doi: 10.1371/journal.pone.0239901. eCollection 2020.
Optimizing gene transformation factors is the first and foremost step in successful genetic engineering and genome editing studies. However, an optimized gene transformation protocol is usually difficult to achieve because the process is costly, time-consuming, and complex. It is therefore necessary to use novel computational approaches, such as machine learning models, to analyze gene transformation data. In the current study, three individual machine learning models, Multi-Layer Perceptron (MLP), Adaptive Neuro-Fuzzy Inference System (ANFIS), and Radial Basis Function (RBF), were developed to forecast Agrobacterium-mediated gene transformation in chrysanthemum from eleven input variables: Agrobacterium strain, optical density (OD), co-culture period (CCP), and the antibiotics kanamycin (K), vancomycin (VA), cefotaxime (CF), hygromycin (H), carbenicillin (CA), geneticin (G), ticarcillin (TI), and paromomycin (P). The best results obtained from the individual models were then fused using the bagging method. In the validation set, the ensemble model had the highest R² (0.83) and outperformed all individual models (MLP: 0.63, RBF: 0.69, and ANFIS: 0.74). The ensemble model was also linked to the fruit fly optimization algorithm (FOA) to optimize gene transformation. The results showed that the maximum gene transformation efficiency (37.54%) can be achieved with the EHA105 strain at 0.9 OD600 and a 3.8-day CCP, with 46.43 mg/l P, 9.54 mg/l K, 18.62 mg/l H, and 4.79 mg/l G as selection antibiotics and 109.74 μg/ml VA, 287.63 μg/ml CF, 334.07 μg/ml CA, and 87.36 μg/ml TI as antibiotics in the selection medium. Moreover, sensitivity analysis demonstrated that the input variables differ in their importance to the gene transformation system, in the order Agrobacterium strain > CCP > K > CF > VA > P > OD > CA > H > TI > G. Overall, the hybrid model developed in this study (ensemble model-FOA) can be employed as an accurate and reliable approach in future genetic engineering and genome editing studies.
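The fusion and evaluation steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a simple least-squares learner as a stand-in for the MLP, RBF, and ANFIS models, uses synthetic data rather than the study's dataset, and all function names (`r_squared`, `fit_linear`, `predict_linear`, `bagging_predict`) are hypothetical. It shows only the general idea of bagging (averaging predictions from learners fitted on bootstrap resamples) and how R² is computed on a held-out validation set.

```python
import numpy as np


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def fit_linear(X, y):
    """Stand-in base learner (assumption): least squares with an intercept."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict_linear(coef, X):
    """Predict with the coefficients returned by fit_linear."""
    A = np.column_stack([np.ones(len(X)), X])
    return A @ coef


def bagging_predict(X_train, y_train, X_new, n_models=25, seed=0):
    """Average the predictions of learners fitted on bootstrap resamples."""
    rng = np.random.default_rng(seed)
    n = len(X_train)
    preds = []
    for _ in range(n_models):
        idx = rng.integers(0, n, size=n)  # draw n rows with replacement
        coef = fit_linear(X_train[idx], y_train[idx])
        preds.append(predict_linear(coef, X_new))
    return np.mean(preds, axis=0)


# Synthetic data for illustration only (not taken from the study).
rng = np.random.default_rng(42)
X = rng.uniform(0, 1, size=(80, 3))
y = 2.0 * X[:, 0] - 1.0 * X[:, 1] + 0.5 * X[:, 2] + rng.normal(0, 0.1, 80)

# Hold out the last 20 rows as a validation set.
X_train, y_train, X_val, y_val = X[:60], y[:60], X[60:], y[60:]
y_hat = bagging_predict(X_train, y_train, X_val)
print(f"validation R^2: {r_squared(y_val, y_hat):.2f}")
```

In the study itself, the fused predictors are the trained MLP, RBF, and ANFIS models, so the stand-in learner here would be replaced by those models; the averaging and R² steps would stay conceptually the same.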