BMC Med Genomics. 2013;6 Suppl 3(Suppl 3):S2. doi: 10.1186/1755-8794-6-S3-S2. Epub 2013 Nov 11.
In recent years, both single-nucleotide polymorphism (SNP) arrays and functional magnetic resonance imaging (fMRI) have been widely used to study schizophrenia (SCZ). In addition, a few studies have integrated SNP data and fMRI data for comprehensive analysis.
In this study, we propose a novel sparse representation based variable selection (SRVS) method and test it on a simulated data set to demonstrate its multi-resolution properties. We then applied the SRVS method to an integrative analysis of two SCZ data sets, an SNP data set and an fMRI data set, comprising 92 cases and 116 controls. Biomarkers for the disease were identified and validated with a multivariate classification approach evaluated by leave-one-out (LOO) cross-validation. Finally, we compared the results with those of a previously reported sparse representation based feature selection method.
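The validation scheme described above (select variables, classify, score by LOO cross-validation) can be sketched as follows. This is a minimal illustration under stated assumptions, not the SRVS algorithm, which the abstract does not specify: the selector is a simple stand-in that ranks features by standardized mean difference, the classifier is a nearest-centroid rule, and the data are synthetic. All function names and parameters (`select_features`, `loo_accuracy`, `make_toy_data`, `k`, `shift`) are hypothetical. The one point it is meant to show is that feature selection is repeated inside every fold, so the held-out subject never influences which features are chosen.

```python
import random
from statistics import mean, pstdev


def select_features(X, y, k):
    """Stand-in selector (NOT SRVS): keep the k features with the largest
    absolute case/control mean difference, scaled by the overall spread."""
    scores = []
    for j in range(len(X[0])):
        case = [row[j] for row, label in zip(X, y) if label == 1]
        ctrl = [row[j] for row, label in zip(X, y) if label == 0]
        spread = pstdev(case + ctrl) or 1.0  # guard against constant features
        scores.append((abs(mean(case) - mean(ctrl)) / spread, j))
    return [j for _, j in sorted(scores, reverse=True)[:k]]


def nearest_centroid_predict(X_train, y_train, x, features):
    """Assign x to the class whose centroid is closest on the selected features."""
    best_label, best_dist = None, float("inf")
    for label in (0, 1):
        rows = [r for r, l in zip(X_train, y_train) if l == label]
        centroid = [mean(r[j] for r in rows) for j in features]
        dist = sum((x[j] - c) ** 2 for j, c in zip(features, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def loo_accuracy(X, y, k):
    """Leave-one-out accuracy; selection is redone on each training fold."""
    correct = 0
    for i in range(len(X)):
        X_train, y_train = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        features = select_features(X_train, y_train, k)
        correct += nearest_centroid_predict(X_train, y_train, X[i], features) == y[i]
    return correct / len(X)


def make_toy_data(n_per_class=20, n_features=30, n_informative=3, shift=2.0, seed=0):
    """Synthetic Gaussian data: only the first n_informative features differ by class."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
            if label == 1:
                for j in range(n_informative):
                    row[j] += shift
            X.append(row)
            y.append(label)
    return X, y


X, y = make_toy_data()
accuracy = loo_accuracy(X, y, k=3)
print(f"LOO accuracy on toy data: {accuracy:.2f}")
```

In an integrative setting like the one described, the rows of `X` would presumably hold SNP and fMRI features side by side for each subject, but how the two data types are actually combined is not stated in the abstract.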
Results showed that biomarkers selected by the proposed SRVS method gave significantly higher classification accuracy in discriminating SCZ patients from healthy controls than those selected by the previously reported sparse representation method. Furthermore, using biomarkers from both data sets led to better classification accuracy than using a single type of biomarker, which suggests an advantage of integrative analysis of different types of data.
The proposed SRVS algorithm is effective in identifying significant biomarkers for complex diseases such as SCZ. Integrating different types of data (e.g., SNP and fMRI data) may identify complementary biomarkers that improve the diagnostic accuracy for the disease.