Daoud Mosaab
Independent Research Scientist, Toronto, ON M1S1G2, Canada.
Genomics Inform. 2020 Mar;18(1):e2. doi: 10.5808/GI.2020.18.1.e2. Epub 2020 Mar 31.
In this paper, we propose a new approach to detecting outliers in a set of segmented genomes of the flu virus, a data set of heterogeneous sequences. The approach has three computational phases: feature extraction, which maps each segmented genome into feature space; an alignment-free distance measure, which gives the distance between any two segmented genomes; and a mapping into distance space, where the resulting distance values are analyzed. The approach is implemented in both supervised and unsupervised learning modes. The experiments show that it is robust in detecting outliers among segmented genomes of the flu virus.
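The three phases named in the abstract can be illustrated with a minimal sketch. The abstract does not specify the features, the distance measure, or the decision rule, so everything here is an assumption chosen for illustration: normalized k-mer frequencies as features, Euclidean distance as the alignment-free measure, and a median/MAD rule applied to each genome's distances to the others. The names `segment_features`, `genome_features`, `distance`, and `outlier_flags` are hypothetical, and only an unsupervised-style variant is shown.

```python
from collections import Counter
from itertools import product
from math import sqrt
from statistics import median

K = 2  # k-mer length; an illustrative choice, not taken from the paper
KMERS = ["".join(p) for p in product("ACGT", repeat=K)]


def segment_features(seq):
    """Phase 1 (assumed form): map one segment to normalized k-mer frequencies."""
    counts = Counter(seq[i:i + K] for i in range(len(seq) - K + 1))
    # k-mers containing symbols other than A, C, G, T are ignored.
    total = sum(counts[k] for k in KMERS) or 1
    return [counts[k] / total for k in KMERS]


def genome_features(segments):
    """Concatenate per-segment features so a segmented genome is one vector."""
    return [x for seg in segments for x in segment_features(seg)]


def distance(a, b):
    """Phase 2 (assumed form): Euclidean distance, computed without alignment."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def outlier_flags(genomes, threshold=3.0):
    """Phase 3 (assumed form): score each genome by its median distance to the
    others, then flag scores far above the typical score in MAD units."""
    feats = [genome_features(g) for g in genomes]
    n = len(feats)
    scores = [
        median(distance(feats[i], feats[j]) for j in range(n) if j != i)
        for i in range(n)
    ]
    center = median(scores)
    # Guard against a zero MAD when most scores are identical.
    mad = median(abs(s - center) for s in scores) or 1e-12
    return [(s - center) / mad > threshold for s in scores]
```

Each genome is passed as a list of segment strings, so `outlier_flags` takes a list of such lists and returns one boolean per genome. The median-based score is used here because it stays stable when a few genomes are far from the rest; a supervised variant would instead compare each genome against labeled reference groups.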