Xie Weibo, Chen Ying, Zhou Gang, Wang Lei, Zhang Chengjun, Zhang Jianwei, Xiao Jinghua, Zhu Tong, Zhang Qifa
National Key Laboratory of Crop Genetic Improvement and National Center of Plant Gene Research (Wuhan), Huazhong Agricultural University, Wuhan, China.
Theor Appl Genet. 2009 Jun;119(1):151-64. doi: 10.1007/s00122-009-1025-2. Epub 2009 Apr 16.
Expression levels measured on oligonucleotide probe microarrays have been adapted as a high-throughput approach for identifying DNA sequence variation between genotypes, referred to as single feature polymorphisms (SFPs). Although interest in this approach is growing, the algorithms still need improvement to achieve high sensitivity and specificity, especially with complex genomes and large datasets, while maintaining good computational performance. We obtained microarray datasets for the expression profiles of two rice cultivars and adapted a median polish method to detect SFPs. The analysis identified 6,655 SFPs between the two rice varieties, representing 3,131 unique rice genes. We showed that the median polish method avoids fitting complex linear models and can therefore be used to analyze complex transcriptome datasets such as the ones in this study. The method is also superior in sensitivity, accuracy and computing time to two previously used methods. A comparison with data from a resequencing project indicated that 75.6% of the SFPs were supported by SNPs in the probe regions. Further comparison revealed that SNPs in the sequences immediately flanking the probes also contributed to the detection of SFPs in cases where the probes and the targets had perfectly matched sequences. Differences in minimum free energy caused by flanking SNPs, which may change the stability of RNA secondary structure, may partly explain these SFPs. The SFPs identified here may facilitate gene discovery in future studies.
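As an illustration of the median polish idea mentioned above, the following is a minimal sketch of Tukey's median polish applied to one probe set. It is not the authors' implementation: the function names `median_polish` and `sfp_score`, the toy log2 intensities, and the scoring rule (difference between the mean residuals of the two genotypes) are all assumptions made for this example. The sketch fits an additive model (overall level + probe effect + array effect) by alternately sweeping out row and column medians; a probe whose residuals differ consistently between genotypes is a candidate SFP.

```python
from statistics import mean, median


def median_polish(table, max_iter=10, tol=1e-6):
    """Tukey's median polish on a probes x arrays table of log intensities.

    Returns (overall, row_effects, col_effects, residuals) such that
    table[i][j] == overall + row_effects[i] + col_effects[j] + residuals[i][j].
    """
    n_rows, n_cols = len(table), len(table[0])
    resid = [list(row) for row in table]
    overall = 0.0
    row_eff = [0.0] * n_rows
    col_eff = [0.0] * n_cols
    for _ in range(max_iter):
        # Sweep the row (probe) medians out of the residuals.
        row_med = [median(r) for r in resid]
        for i in range(n_rows):
            row_eff[i] += row_med[i]
            for j in range(n_cols):
                resid[i][j] -= row_med[i]
        # Move the median of the column effects into the overall term.
        shift = median(col_eff)
        overall += shift
        col_eff = [c - shift for c in col_eff]
        # Sweep the column (array) medians out of the residuals.
        col_med = [median(resid[i][j] for i in range(n_rows))
                   for j in range(n_cols)]
        for j in range(n_cols):
            col_eff[j] += col_med[j]
            for i in range(n_rows):
                resid[i][j] -= col_med[j]
        # Move the median of the row effects into the overall term.
        shift = median(row_eff)
        overall += shift
        row_eff = [r - shift for r in row_eff]
        # Stop once a full sweep no longer changes anything.
        if max(abs(m) for m in row_med + col_med) < tol:
            break
    return overall, row_eff, col_eff, resid


def sfp_score(resid_row, group_a, group_b):
    """Hypothetical score: mean residual in genotype A minus genotype B."""
    return (mean(resid_row[j] for j in group_a)
            - mean(resid_row[j] for j in group_b))


# Toy probe set (assumed data): 4 probes x 4 arrays, columns 0-1 are
# cultivar A and columns 2-3 are cultivar B. The gene is expressed one
# log2 unit higher in B, but the last probe hybridizes poorly in B.
table = [
    [8.0, 8.1, 9.0, 9.1],
    [7.0, 7.1, 8.0, 8.1],
    [9.0, 9.1, 10.0, 10.1],
    [8.5, 8.6, 7.5, 7.6],
]
overall, row_eff, col_eff, resid = median_polish(table)
scores = [sfp_score(r, [0, 1], [2, 3]) for r in resid]
candidate = max(range(len(scores)), key=lambda i: abs(scores[i]))
print("per-probe scores:", [round(s, 3) for s in scores])
print("candidate SFP probe index:", candidate)
```

Because the fit uses only medians, no linear model has to be estimated per probe set, which is the property the abstract points to; a real analysis would also need a significance threshold, which this sketch leaves out.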