Ferreira Leila Maria, Sáfadi Thelma, Ferreira Juliano Lino
Programa de Pós-Graduação em Estatística e Experimentação Agropecuária, Departamento de Estatística, Universidade Federal de Lavras (UFLA), Lavras, MG, Brazil.
Departamento de Estatística, Universidade Federal de Lavras (UFLA), Lavras, MG, Brazil.
Genet Mol Biol. 2018 Oct-Dec;41(4):884-892. doi: 10.1590/1678-4685-GMB-2018-0035. Epub 2018 Nov 29.
We propose evaluating genome similarity by combining the discrete non-decimated wavelet transform (NDWT) with the elastic net. Wavelets represent a signal at several levels of detail: decomposing the signal exposes hidden components, and each level captures a different characteristic. The main feature of the elastic net is that it groups correlated variables, including in settings where the number of predictors exceeds the number of observations. Applied together to the cluster analysis of genomes of Mycobacterium tuberculosis strains, the two methods proved very effective and identified clusters at each level of decomposition.
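The two building blocks named in the abstract can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the Haar filter, the periodic boundary handling, the penalty parameterization (`lam`, `alpha`), and the function names `haar_ndwt` and `elastic_net` are all choices made here for the example, and the abstract does not specify how the genome sequences are encoded as numeric signals.

```python
def haar_ndwt(signal, levels):
    """Non-decimated Haar transform (assumed filter; periodic boundaries).

    Returns (details, approx). details[j] holds the level j+1 detail
    coefficients; no downsampling is done, so every level keeps the
    full length of the input signal.
    """
    n = len(signal)
    approx = [float(x) for x in signal]
    details = []
    for j in range(levels):
        step = 2 ** j  # spread the filter taps apart instead of decimating
        shifted = [approx[(t - step) % n] for t in range(n)]
        details.append([(a - b) / 2 for a, b in zip(approx, shifted)])
        approx = [(a + b) / 2 for a, b in zip(approx, shifted)]
    return details, approx


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the L1 part of the penalty)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def elastic_net(X, y, lam=0.1, alpha=0.5, sweeps=200):
    """Coordinate descent for the elastic net (no intercept), minimizing
    (1/2n)*||y - X b||^2 + lam*(alpha*||b||_1 + (1 - alpha)/2*||b||_2^2).
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = [float(v) for v in y]  # equals y - X @ beta throughout
    for _ in range(sweeps):
        for j in range(p):
            col = [X[i][j] for i in range(n)]
            z = sum(c * c for c in col) / n
            if z == 0.0:
                continue  # an all-zero predictor keeps a zero coefficient
            rho = sum(c * (r + c * beta[j]) for c, r in zip(col, residual)) / n
            new = soft_threshold(rho, lam * alpha) / (z + lam * (1 - alpha))
            residual = [r - c * (new - beta[j]) for c, r in zip(col, residual)]
            beta[j] = new
    return beta


# Toy 0/1 signal (for example a GC indicator; the encoding is an assumption).
details, approx = haar_ndwt([1, 0, 1, 1, 0, 0, 1, 0], levels=2)

# Grouping effect: two identical predictors receive the same coefficient.
x = [1.0, -1.0, 1.0, -1.0]
u = [1.0, 1.0, -1.0, -1.0]
beta = elastic_net([[x[i], x[i], u[i]] for i in range(4)], x)
```

With this scaling, the detail coefficients of all levels plus the final approximation add back to the original signal point by point, which is a convenient sanity check. In the regression demo the two identical columns share the weight equally, whereas a pure lasso penalty (`alpha=1`) would not be forced to do so; this is the grouping behavior the abstract refers to.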