Diploid genome reconstruction of Ciona intestinalis and comparative analysis with Ciona savignyi.

Author information

Kim Jong Hyun, Waterman Michael S, Li Lei M

Affiliation

Department of Computer Science, Yonsei University, Seoul, Republic of Korea.

Publication information

Genome Res. 2007 Jul;17(7):1101-10. doi: 10.1101/gr.5894107. Epub 2007 Jun 13.

Abstract

One of the main goals in genome sequencing projects is to determine a haploid consensus sequence even when clone libraries are constructed from homologous chromosomes. However, it has been noticed that haplotypes can be inferred from genome assemblies by investigating phase conservation in sequenced reads. In this study, we seek to infer haplotypes, a diploid consensus sequence, from the genome assembly of an organism, Ciona intestinalis. The Ciona intestinalis genome is an ideal resource from which haplotypes can be inferred because of the high polymorphism rate (1.2%). The haplotype estimation scheme consists of polymorphism detection and phase estimation. The core step of our method is a Gibbs sampling procedure. The mate-pair information from two-end sequenced clone inserts is exploited to provide long-range continuity. We estimate the polymorphism rate of Ciona intestinalis to be 1.2% and 1.5%, according to two different polymorphism counting schemes. The distribution of heterozygosity number is well fit by a compound Poisson distribution. The N50 length of haplotype segments is 37.9 kb in our assembly, while the N50 scaffold length of the Ciona intestinalis assembly is 190 kb. We also infer diploid gene sequences from haplotype segments. According to our reconstruction, 85.4% of predicted gene sequences are continuously covered by single haplotype segments. Our results indicate 97% accuracy in haplotype estimation, based on a simulated data set. We conduct a comparative analysis with Ciona savignyi, and discover interesting patterns of conserved DNA elements in chordates.
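The N50 figures quoted above (37.9 kb for haplotype segments, 190 kb for scaffolds) follow the usual definition: the length L such that segments of length at least L account for at least half of the total assembled length. A minimal sketch of that calculation is given here; the function name `n50` and the toy segment lengths are illustrative assumptions, not data or code from the paper.

```python
def n50(lengths):
    """Return the N50 of a collection of segment lengths.

    N50 is the largest length L such that segments of length >= L
    together cover at least half of the total length.
    """
    if not lengths:
        raise ValueError("lengths must be non-empty")
    ordered = sorted(lengths, reverse=True)  # longest segments first
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length


# Hypothetical segment lengths in kb (made up for illustration only).
segments_kb = [80, 70, 50, 40, 30, 20, 10]
print(n50(segments_kb))  # total is 300; 80 + 70 = 150 reaches half, so N50 = 70
```

Because N50 is weighted by length, a few long segments dominate it, which is why a haplotype-segment N50 and a scaffold N50 computed over the same assembly can differ by a large factor.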

