Supervised Lowess normalization of comparative genome hybridization data--application to lactococcal strain comparisons.

Authors

van Hijum Sacha A F T, Baerends Richard J S, Zomer Aldert L, Karsens Harma A, Martin-Requena Victoria, Trelles Oswaldo, Kok Jan, Kuipers Oscar P

Affiliation

Molecular Genetics, Groningen Biomolecular Sciences and Biotechnology Institute, University of Groningen, Kerklaan 30, 9751 NN Haren, The Netherlands.

Publication

BMC Bioinformatics. 2008 Feb 11;9:93. doi: 10.1186/1471-2105-9-93.

Abstract

BACKGROUND

Array-based comparative genome hybridization (aCGH) is commonly used to determine the genomic content of bacterial strains. Because prokaryotic genome sequences are in general less conserved than eukaryotic ones, sequence divergence between the genes of the genomes compared in an aCGH experiment obstructs the determination of genome variations (e.g. deletions). Current normalization methods do not take into account sequence divergence between target and microarray features and therefore cannot distinguish whether a difference in signal is caused by systematic errors in the data or by sequence divergence.

RESULTS

We present supervised Lowess, or S-Lowess, an application of the subset Lowess normalization method. The normalization procedure uses a predicted subset of array features with minimal sequence divergence between the analyzed strains, and removes systematic errors from dual-dye aCGH data in two steps: (1) determining a subset of likely conserved genes (LCG); and (2) using the LCG for subset Lowess normalization. Subset Lowess determines the correction factors for systematic errors from the subset of array features and then normalizes all array features using these correction factors. The performance of S-Lowess was assessed on aCGH experiments in which differentially labeled genomic DNA fragments of Lactococcus lactis IL1403 and L. lactis MG1363 were hybridized to IL1403 DNA microarrays. Since both genomes have been sequenced and the gene deletions identified, the success rate of different aCGH normalization methods in detecting these deletions in the MG1363 genome was determined. S-Lowess detects 97% of the deletions, whereas other aCGH normalization methods detect at most 60% of the deletions.
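
The second step can be illustrated with a minimal sketch; it is not the authors' implementation. It assumes each array feature is described by a log-ratio M and a mean log-intensity A (the usual MA representation for dual-dye data, which the abstract does not spell out) and that step 1 has already produced a flag marking the LCG features. The function names, the `frac` span, the clamping at the edges of the subset, and the omission of the robustness iterations that Lowess normally includes are all illustrative assumptions.

```python
import math

def tricube(u):
    """Tricube kernel: weight falls smoothly to 0 at |u| = 1."""
    u = abs(u)
    return (1.0 - u ** 3) ** 3 if u < 1.0 else 0.0

def local_fit(x0, xs, ys, frac):
    """Locally weighted linear fit of ys on xs, evaluated at x0."""
    k = min(len(xs), max(2, math.ceil(frac * len(xs))))
    # Bandwidth: distance to the k-th nearest neighbour (floor avoids /0).
    h = max(sorted(abs(x - x0) for x in xs)[k - 1], 1e-12)
    w = [tricube((x - x0) / h) for x in xs]
    sw = sum(w)
    if sw == 0.0:
        # Degenerate neighbourhood: fall back to the plain mean.
        return sum(ys) / len(ys)
    mx = sum(wi * x for wi, x in zip(w, xs)) / sw
    my = sum(wi * y for wi, y in zip(w, ys)) / sw
    sxx = sum(wi * (x - mx) ** 2 for wi, x in zip(w, xs))
    sxy = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, xs, ys))
    slope = sxy / sxx if sxx > 0.0 else 0.0
    return my + slope * (x0 - mx)

def s_lowess_normalize(a_vals, m_vals, is_lcg, frac=0.5):
    """Fit the bias curve on LCG features only, then correct every feature."""
    xs = [a for a, keep in zip(a_vals, is_lcg) if keep]
    ys = [m for m, keep in zip(m_vals, is_lcg) if keep]
    if not xs:
        raise ValueError("the LCG subset is empty")
    lo, hi = min(xs), max(xs)
    # Clamp A to the range covered by the subset instead of extrapolating.
    return [m - local_fit(min(max(a, lo), hi), xs, ys, frac)
            for a, m in zip(a_vals, m_vals)]

# Toy data (invented): an intensity-dependent bias of 0.5 + 0.1*A on every
# feature, plus a -3 log-ratio drop on two features standing in for deletions.
a = [8 + 0.5 * i for i in range(20)]
deleted = {3, 11}
m = [0.5 + 0.1 * ai - (3.0 if i in deleted else 0.0) for i, ai in enumerate(a)]
lcg = [i not in deleted for i in range(20)]
normalized = s_lowess_normalize(a, m, lcg)
print([round(v, 2) for v in normalized])
```

In this toy case the bias curve is estimated without the two simulated deletions, so the conserved features end up close to 0 after correction while the deleted ones stay close to -3. Fitting the curve on all features instead would let divergent or absent genes pull the curve toward themselves, which is the confounding the abstract describes.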

CONCLUSION

S-Lowess is implemented in a user-friendly web tool available at http://bioinformatics.biol.rug.nl/websoftware/s-lowess. We demonstrate that it outperforms existing normalization methods and maximizes detection of genomic variation (e.g. deletions) from microbial aCGH data.
