Integrative classification and analysis of multiple arrayCGH datasets with probe alignment.

Affiliation

Department of Computer Science and Engineering, University of Minnesota Twin Cities, Minneapolis, MN, USA.

Publication information

Bioinformatics. 2010 Sep 15;26(18):2313-20. doi: 10.1093/bioinformatics/btq428. Epub 2010 Jul 21.

Abstract

MOTIVATION

Array comparative genomic hybridization (arrayCGH) is widely used to measure DNA copy numbers in cancer research. ArrayCGH data report log-ratio intensities of thousands of probes sampled along the chromosomes. Typically, the choices of the locations and the lengths of the probes vary in different experiments. This discrepancy in choosing probes poses a challenge in integrated classification or analysis across multiple arrayCGH datasets. We propose an alignment-based framework to integrate arrayCGH samples generated from different probe sets. The alignment framework seeks an optimal alignment between the probe series of one arrayCGH sample and the probe series of another sample, intended to find the maximum possible overlap of DNA copy number variations between the two measured chromosomes. An alignment kernel is introduced for integrative patient sample classification and a multiple alignment algorithm is also introduced for identifying common regions with copy number aberrations.
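As an illustration of the general idea of aligning two probe series, the following is a minimal sketch and not the authors' algorithm. It assumes a plain global dynamic-programming alignment in which a matched probe pair scores the product of the two log-ratios (so concordant gains or losses add to the score) and unmatched probes incur a constant penalty. The function name `align_score` and the `gap` parameter are hypothetical.

```python
from typing import List


def align_score(x: List[float], y: List[float], gap: float = 0.0) -> float:
    """Best global alignment score between two log-ratio series.

    Assumption (not from the paper): a matched pair scores x[i] * y[j],
    and skipping a probe in either series costs `gap`.
    """
    n, m = len(x), len(y)
    # dp[i][j] holds the best score for aligning x[:i] with y[:j].
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = dp[i - 1][0] - gap
    for j in range(1, m + 1):
        dp[0][j] = dp[0][j - 1] - gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            match = dp[i - 1][j - 1] + x[i - 1] * y[j - 1]
            skip_x = dp[i - 1][j] - gap  # leave probe x[i-1] unmatched
            skip_y = dp[i][j - 1] - gap  # leave probe y[j-1] unmatched
            dp[i][j] = max(match, skip_x, skip_y)
    return dp[n][m]


# Two samples measured with different numbers of probes.
sample_a = [1.0, 0.0, -1.0]
sample_b = [1.0, -1.0]
score = align_score(sample_a, sample_b)
```

With a zero gap penalty, the gain in `sample_a` pairs with the gain in `sample_b` and the loss pairs with the loss, while the neutral middle probe is left unmatched. A score of this kind could in principle serve as a similarity between samples, but whether it yields a valid kernel depends on details the abstract does not give.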

RESULTS

The probe alignment kernel and the multiple probe alignment (MPA) algorithm were applied to integrate three bladder cancer datasets as well as artificial datasets. In the experiments, by integrating arrayCGH samples from multiple datasets, the probe alignment kernel used with support vector machines significantly improved patient sample classification accuracy over other baseline kernels. The experiments also demonstrated that the MPA algorithm can find common DNA aberrations that cannot be identified with the standard interpolation method. Furthermore, the MPA algorithm identified many known bladder cancer DNA aberrations containing four known bladder cancer genes, three of which cannot be detected by interpolation.
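For context on the interpolation baseline mentioned above, the following is a minimal sketch of one common way to put samples from different probe sets on a shared grid: linear interpolation of log-ratios at target genomic positions. The abstract does not specify which interpolation variant was used, so the function `interpolate` and its edge handling (holding the nearest value outside the probed range) are assumptions.

```python
from bisect import bisect_left
from typing import List


def interpolate(positions: List[float], values: List[float],
                targets: List[float]) -> List[float]:
    """Linearly interpolate log-ratios at the target positions.

    Assumes `positions` is strictly increasing and has the same
    length as `values`. Targets outside the probed range take the
    nearest end value (an assumption, not from the paper).
    """
    out = []
    for t in targets:
        k = bisect_left(positions, t)
        if k == 0:
            out.append(values[0])
        elif k == len(positions):
            out.append(values[-1])
        else:
            x0, x1 = positions[k - 1], positions[k]
            y0, y1 = values[k - 1], values[k]
            out.append(y0 + (y1 - y0) * (t - x0) / (x1 - x0))
    return out


# Probes at 0, 10 and 20 (arbitrary units), resampled at 5, 10 and 25.
resampled = interpolate([0, 10, 20], [0.0, 1.0, 0.0], [5, 10, 25])
```

Interpolation of this kind smooths over narrow aberrations that fall between probes, which is one plausible reason a method that aligns probe series directly could recover regions that interpolation misses.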

AVAILABILITY

http://www.cs.umn.edu/compbio/ProbeAlign.
