Sequencing and annotated analysis of an Estonian human genome.

Affiliation

Department of Reproductive Biology, Estonian University of Life Sciences, Estonia.

Publication information

Gene. 2012 Feb 1;493(1):69-76. doi: 10.1016/j.gene.2011.11.022. Epub 2011 Nov 27.

Abstract

In the present study we describe the sequencing and annotated analysis of the genome of an Estonian individual. Using SOLiD technology we generated 2,449,441,916 reads of 50 bp each. Bioscope version 1.3 was used to map and pair the reads against the NCBI human genome reference (build 36, hg18). Bioscope also annotates the results of variant (tertiary) analysis. On average, 75.5% of reads mapped, giving a total coverage of 107.72 Gb and a mean fold coverage of 34.6. We found 3,482,975 SNPs, of which 352,492 were novel. Of these SNPs, 21,222 were in coding regions: 10,649 were synonymous SNPs, 10,360 were nonsynonymous missense SNPs, 155 were nonsynonymous nonsense SNPs and 58 were nonsynonymous frameshifts. We identified 219 CNVs with a total base pair coverage of 37,326,300 bp, and 87,451 large insertion/deletion polymorphisms covering 10,152,256 bp of the genome. In addition, we found 285,864 small insertion/deletion polymorphisms, of which 133,969 were novel. Finally, we identified 53 inversions, 19 overlapped genes and 2 overlapped exons. Interestingly, we found a region on chromosome 6 to be enriched in coding SNPs and CNVs. This study confirms previous findings that our genomes are more complex and variable than previously thought. Sequencing of personal genomes followed by annotation would therefore improve the analysis of the heritability of phenotypes and our understanding of genome function.
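The figures in the abstract can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only and is not part of the study's pipeline. It assumes that mean fold coverage is total coverage divided by reference length. The abstract does not state the reference length used, so the value computed here is only the one implied by the reported numbers, and all variable names are hypothetical.

```python
# Figures copied from the abstract; variable names are illustrative.
reads = 2_449_441_916
read_length_bp = 50
total_coverage_bp = 107.72e9   # "total coverage of 107.72 Gb"
mean_fold_coverage = 34.6

# Raw sequence output before mapping.
raw_bases = reads * read_length_bp

# Assumption: fold coverage = total coverage / reference length,
# so the reference length implied by the abstract is the quotient.
implied_reference_bp = total_coverage_bp / mean_fold_coverage

# The four coding SNP categories should sum to the stated coding total.
coding_snps = {
    "synonymous": 10_649,
    "missense": 10_360,
    "nonsense": 155,
    "frameshift": 58,
}
coding_total = sum(coding_snps.values())

# Share of variants reported as novel.
novel_snp_fraction = 352_492 / 3_482_975
novel_indel_fraction = 133_969 / 285_864

print(f"raw bases sequenced:      {raw_bases:,}")
print(f"implied reference length: {implied_reference_bp / 1e9:.2f} Gb")
print(f"coding SNP total:         {coding_total:,}")
print(f"novel SNPs:               {novel_snp_fraction:.1%}")
print(f"novel small indels:       {novel_indel_fraction:.1%}")
```

The category counts sum to the stated 21,222 coding SNPs. Roughly one in ten SNPs, but nearly half of the small insertion/deletion polymorphisms, are reported as novel.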
