Harris Guy M, Abbas Shahroze, Miles Michael F
VCU Pharmacology and Toxicology, Richmond, Virginia, 23298, USA.
VCU Center for the Study of Biological Complexity, Richmond, Virginia, 23298, USA.
BMC Genomics. 2021 Feb 1;22(1):96. doi: 10.1186/s12864-021-07370-2.
Despite the increasing use of RNAseq for transcriptome analysis, microarrays remain a widely used methodology for genomic studies. The latest generation of Affymetrix/Thermo-Fisher microarrays, the ClariomD/XTA and ClariomS arrays, provide a sensitive and facile method for complex transcriptome expression analysis. However, existing analysis methods for these high-density arrays summarize the probes into a single expression value and therefore do not leverage the statistical power of having multiple oligonucleotides represent each gene/exon. We previously developed a methodology, the Sscore algorithm, for probe-level identification of differentially expressed genes (DEGs) between treatment and control samples with oligonucleotide microarrays. The Sscore algorithm was validated for sensitive detection of DEGs by comparison with existing methods. However, the prior version of the Sscore algorithm and its R-based implementation, sscore, do not function with the latest generations of Affymetrix/Thermo-Fisher microarrays, because changes in microarray design eliminated the probes previously used to estimate non-specific binding.
Here we describe the GCSscore algorithm, which uses the GC content of a given oligonucleotide probe to estimate non-specific binding from the antigenomic background probes found on new generations of arrays. We implemented this algorithm in an improved R package, GCSscore, for analysis of modern oligonucleotide microarrays. GCSscore has multiple methods for grouping individual probes on the ClariomD/XTA chips, providing the user with differential expression analysis at the gene level and the exon level. By using probe-level intensities directly, the GCSscore algorithm was able to detect DEGs under stringent statistical criteria for all Clariom-based arrays. We demonstrate that for older 3'-IVT arrays, GCSscore produced differential gene expression results very similar to those of the original Sscore method. GCSscore also functioned well for the newer ClariomS and ClariomD/XTA microarrays and outperformed existing analysis approaches in the number of DEGs and associated biological functions identified. This was particularly striking for analysis of the highly complex ClariomD/XTA-based arrays.
The GCSscore package is a powerful new tool for analyzing the newest generation of oligonucleotide microarrays, such as the ClariomS and ClariomD/XTA arrays produced by Affymetrix/Thermo-Fisher.
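To make the idea of GC-content-based background estimation concrete, the following is a minimal illustrative sketch in Python, not the GCSscore R package or its API. It assumes that background for a probe can be approximated by the median intensity of antigenomic background probes with the same GC count, and it combines background-corrected treatment-versus-control differences across a probe group with a simple sum scaled by the square root of the number of probes. All function names, the nearest-bin fallback, and the constant `noise` term are hypothetical placeholders; the published error model is not reproduced here.

```python
from collections import defaultdict
from math import sqrt
from statistics import median


def gc_count(seq):
    """Number of G or C bases in a probe sequence."""
    return sum(base in "GC" for base in seq.upper())


def background_by_gc(bg_probes):
    """Median intensity of background probes, keyed by GC count.

    bg_probes: iterable of (sequence, intensity) pairs.
    """
    bins = defaultdict(list)
    for seq, intensity in bg_probes:
        bins[gc_count(seq)].append(intensity)
    return {gc: median(vals) for gc, vals in bins.items()}


def corrected(seq, intensity, bg_table):
    """Subtract the background estimate for the probe's GC count.

    Uses the nearest available GC bin when the exact count is missing
    (an assumption made for this sketch).
    """
    gc = gc_count(seq)
    nearest = min(bg_table, key=lambda k: abs(k - gc))
    return intensity - bg_table[nearest]


def group_score(probes, bg_treat, bg_ctrl, noise=1.0):
    """Toy probe-level score for one probe group (gene or exon).

    probes: list of (sequence, treatment_intensity, control_intensity).
    noise: placeholder per-probe error term (hypothetical, not the
           published error model).
    """
    diffs = [
        corrected(seq, t, bg_treat) - corrected(seq, c, bg_ctrl)
        for seq, t, c in probes
    ]
    # Sum of scaled per-probe differences, normalized by sqrt(n).
    return sum(d / noise for d in diffs) / sqrt(len(diffs))
```

In this sketch, one background table would be built per array from its own background probes, and `group_score` would be called once per gene-level or exon-level probe group; a larger absolute score indicates a more consistent probe-level difference between the two arrays.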