Imtiaz Tashifa, Nanayakkara Jina, Fang Alexis, Jomaa Danny, Mayotte Harrison, Damiani Simona, Javed Fiza, Jones Tristan, Kaczmarek Emily, Adebayo Flourish Omolara, Imtiaz Uroosa, Li Yiheng, Zhang Richard, Mousavi Parvin, Renwick Neil, Tyryshkin Kathrin
Laboratory of Translational RNA Biology, Department of Pathology and Molecular Medicine, Queen's University, 88 Stuart St, Kingston, ON K7L 3N6, Canada.
STAR Protoc. 2023 Oct 27;4(4):102661. doi: 10.1016/j.xpro.2023.102661.
RNA-based sample discrimination and classification can be used to provide biological insights and/or distinguish between clinical groups. However, finding informative differences between sample groups can be challenging due to the multidimensional and noisy nature of sequencing data. Here, we apply a machine learning approach for hierarchical discrimination and classification of samples with high-dimensional miRNA expression data. Our protocol comprises data preprocessing, unsupervised learning, feature selection, and machine-learning-based hierarchical classification, alongside open-source MATLAB code.
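The four stages named above (preprocessing, unsupervised learning, feature selection, hierarchical classification) can be sketched as follows. This is a minimal illustration and not the protocol's released MATLAB code: it is written in Python with NumPy, and every specific choice in it is an assumption made for the example, including the counts-per-million log transform, the PCA step, the between/within variance ratio used for ranking features, the nearest-centroid classifier, the two-level label hierarchy in `PARENT_OF`, and the simulated counts.

```python
import numpy as np

# Hypothetical label hierarchy for illustration: leaf class -> parent group.
PARENT_OF = {"A1": "A", "A2": "A", "B1": "B"}

def simulate_counts(rng, n_per_class, n_features=50):
    """Synthetic miRNA-like read counts; each class has its own elevated features."""
    elevated = {"A1": [0, 1, 2, 3, 4, 10, 11, 12, 13, 14],
                "A2": [0, 1, 2, 3, 4, 15, 16, 17, 18, 19],
                "B1": [5, 6, 7, 8, 9]}
    X, y = [], []
    for label, idx in elevated.items():
        rate = np.full(n_features, 100.0)
        rate[idx] = 1000.0
        X.append(rng.poisson(rate, size=(n_per_class, n_features)))
        y += [label] * n_per_class
    return np.vstack(X), np.array(y)

def preprocess(counts):
    """Scale each sample to counts per million, then apply log2(x + 1)."""
    counts = np.asarray(counts, dtype=float)
    cpm = counts / counts.sum(axis=1, keepdims=True) * 1e6
    return np.log2(cpm + 1.0)

def pca_scores(X, n_components):
    """Unsupervised view of the data: project samples onto the top principal components."""
    Xc = X - X.mean(axis=0)
    U, S, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :n_components] * S[:n_components]

def select_features(X, y, k):
    """Return indices of the k features with the largest between/within variance ratio."""
    classes = np.unique(y)
    overall = X.mean(axis=0)
    between = sum((y == c).sum() * (X[y == c].mean(axis=0) - overall) ** 2 for c in classes)
    within = sum(((X[y == c] - X[y == c].mean(axis=0)) ** 2).sum(axis=0) for c in classes)
    return np.argsort(between / (within + 1e-12))[::-1][:k]

class NearestCentroid:
    """Assign each sample to the class whose mean profile is closest (Euclidean)."""
    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.centroids_ = np.stack([X[y == c].mean(axis=0) for c in self.classes_])
        return self

    def predict(self, X):
        d = ((X[:, None, :] - self.centroids_[None, :, :]) ** 2).sum(axis=2)
        return self.classes_[d.argmin(axis=1)]

class HierarchicalClassifier:
    """Two-level classifier: predict the parent group, then the leaf class within it."""
    def __init__(self, parent_of, k=5):
        self.parent_of = parent_of
        self.k = k

    def fit(self, X, y):
        y = np.asarray(y)
        parents = np.array([self.parent_of[label] for label in y])
        self.top_feats_ = select_features(X, parents, self.k)
        self.top_ = NearestCentroid().fit(X[:, self.top_feats_], parents)
        self.sub_ = {}
        for p in np.unique(parents):
            mask = parents == p
            leaves = np.unique(y[mask])
            if len(leaves) == 1:
                self.sub_[p] = leaves[0]  # only one leaf: no second-level model needed
            else:
                feats = select_features(X[mask], y[mask], self.k)
                self.sub_[p] = (feats, NearestCentroid().fit(X[mask][:, feats], y[mask]))
        return self

    def predict(self, X):
        parents = self.top_.predict(X[:, self.top_feats_])
        out = []
        for x, p in zip(X, parents):
            node = self.sub_[p]
            if isinstance(node, tuple):
                feats, clf = node
                out.append(clf.predict(x[None, feats])[0])
            else:
                out.append(node)
        return np.array(out)

rng = np.random.default_rng(0)
X_train, y_train = simulate_counts(rng, 20)
X_test, y_test = simulate_counts(rng, 10)
X_train, X_test = preprocess(X_train), preprocess(X_test)

scores = pca_scores(X_train, 2)  # 2-D coordinates for exploratory plotting
model = HierarchicalClassifier(PARENT_OF, k=5).fit(X_train, y_train)
pred = model.predict(X_test)
accuracy = (pred == y_test).mean()
print(f"held-out accuracy: {accuracy:.2f}")
```

Selecting features separately at each node of the hierarchy lets the markers that split parent groups differ from those that split leaf classes within a group, which is the main reason to classify hierarchically rather than in a single flat step.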