U87MG decoded: the genomic sequence of a cytogenetically aberrant human cancer cell line.

Affiliation

Department of Human Genetics, University of California Los Angeles, Los Angeles, California, United States of America.

Publication information

PLoS Genet. 2010 Jan 29;6(1):e1000832. doi: 10.1371/journal.pgen.1000832.

Abstract

U87MG is a commonly studied grade IV glioma cell line that has been analyzed in at least 1,700 publications over four decades. To comprehensively characterize the genome of this cell line and to serve as a model of broad cancer genome sequencing, we have generated greater than 30x genomic sequence coverage using a novel 50-base mate-paired strategy with a 1.4 kb mean insert library. A total of 1,014,984,286 mate-end and 120,691,623 single-end two-base encoded reads were generated from five slides. All data were aligned using a custom-designed tool called BFAST, allowing optimal color-space read alignment and accurate identification of DNA variants. The aligned sequence reads and mate-pair information identified 35 interchromosomal translocation events, 1,315 structural variations (>100 bp), 191,743 small (<21 bp) insertions and deletions (indels), and 2,384,470 single nucleotide variations (SNVs). Among these observations, the known homozygous mutation in PTEN was robustly identified, and genes involved in cell adhesion were overrepresented in the mutated gene list. Data were compared to 219,187 heterozygous single nucleotide polymorphisms assayed by Illumina 1M Duo genotyping array to assess accuracy: 93.83% of all SNPs were reliably detected at filtering thresholds that yield greater than 99.99% sequence accuracy. Protein-coding sequences in this cancer cell line were disrupted predominantly by small indels, large deletions, and translocations. In total, 512 genes were homozygously mutated, including 154 by SNVs, 178 by small indels, 145 by large microdeletions, and 35 by interchromosomal translocations, revealing a highly mutated cell line genome. Of the small homozygously mutated variants, 8 SNVs and 99 indels were novel events not present in dbSNP. These data demonstrate that routine generation of broad cancer genome sequence is possible outside of genome centers. The sequence analysis of U87MG provides an unparalleled level of mutational resolution compared to any cell line to date.
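
The tallies quoted in the abstract can be cross-checked with a few lines of arithmetic. The sketch below is not part of the paper: the variable names are illustrative, and the number of array SNPs recovered is only an approximation derived from the rounded 93.83% figure.

```python
# Homozygously mutated genes by variant class, as quoted in the abstract.
genes_by_class = {
    "SNVs": 154,
    "small indels": 178,
    "large microdeletions": 145,
    "interchromosomal translocations": 35,
}
reported_total = 512  # total stated in the abstract

class_sum = sum(genes_by_class.values())

# Genotyping-array comparison: heterozygous SNPs assayed and the share detected.
array_het_snps = 219_187
detection_rate = 0.9383  # rounded percentage from the abstract
approx_detected = round(array_het_snps * detection_rate)  # approximate only

# Small homozygous variants not present in dbSNP (8 SNVs and 99 indels).
novel_small_variants = 8 + 99

print(f"sum over classes: {class_sum} (reported: {reported_total})")
print(f"approx. array SNPs detected: {approx_detected:,}")
print(f"novel small homozygous variants: {novel_small_variants}")
```

The four class counts add up to the reported 512, which is consistent with each gene being counted under a single class, although the abstract does not state this explicitly.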

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3ecf/2813426/901c5fef9ca3/pgen.1000832.g001.jpg
