The sequence of the human genome.

Authors

Venter J C, Adams M D, Myers E W, Li P W, Mural R J, Sutton G G, Smith H O, Yandell M, Evans C A, Holt R A, Gocayne J D, Amanatides P, Ballew R M, Huson D H, Wortman J R, Zhang Q, Kodira C D, Zheng X H, Chen L, Skupski M, Subramanian G, Thomas P D, Zhang J, Gabor Miklos G L, Nelson C, Broder S, Clark A G, Nadeau J, McKusick V A, Zinder N, Levine A J, Roberts R J, Simon M, Slayman C, Hunkapiller M, Bolanos R, Delcher A, Dew I, Fasulo D, Flanigan M, Florea L, Halpern A, Hannenhalli S, Kravitz S, Levy S, Mobarry C, Reinert K, Remington K, Abu-Threideh J, Beasley E, Biddick K, Bonazzi V, Brandon R, Cargill M, Chandramouliswaran I, Charlab R, Chaturvedi K, Deng Z, Di Francesco V, Dunn P, Eilbeck K, Evangelista C, Gabrielian A E, Gan W, Ge W, Gong F, Gu Z, Guan P, Heiman T J, Higgins M E, Ji R R, Ke Z, Ketchum K A, Lai Z, Lei Y, Li Z, Li J, Liang Y, Lin X, Lu F, Merkulov G V, Milshina N, Moore H M, Naik A K, Narayan V A, Neelam B, Nusskern D, Rusch D B, Salzberg S, Shao W, Shue B, Sun J, Wang Z, Wang A, Wang X, Wang J, Wei M, Wides R, Xiao C, Yan C, Yao A, Ye J, Zhan M, Zhang W, Zhang H, Zhao Q, Zheng L, Zhong F, Zhong W, Zhu S, Zhao S, Gilbert D, Baumhueter S, Spier G, Carter C, Cravchik A, Woodage T, Ali F, An H, Awe A, Baldwin D, Baden H, Barnstead M, Barrow I, Beeson K, Busam D, Carver A, Center A, Cheng M L, Curry L, Danaher S, Davenport L, Desilets R, Dietz S, Dodson K, Doup L, Ferriera S, Garg N, Gluecksmann A, Hart B, Haynes J, Haynes C, Heiner C, Hladun S, Hostin D, Houck J, Howland T, Ibegwam C, Johnson J, Kalush F, Kline L, Koduru S, Love A, Mann F, May D, McCawley S, McIntosh T, McMullen I, Moy M, Moy L, Murphy B, Nelson K, Pfannkoch C, Pratts E, Puri V, Qureshi H, Reardon M, Rodriguez R, Rogers Y H, Romblad D, Ruhfel B, Scott R, Sitter C, Smallwood M, Stewart E, Strong R, Suh E, Thomas R, Tint N N, Tse S, Vech C, Wang G, Wetter J, Williams S, Williams M, Windsor S, Winn-Deen E, Wolfe K, Zaveri J, Zaveri K, Abril J F, Guigó R, Campbell M J, Sjolander K V, 
Karlak B, Kejariwal A, Mi H, Lazareva B, Hatton T, Narechania A, Diemer K, Muruganujan A, Guo N, Sato S, Bafna V, Istrail S, Lippert R, Schwartz R, Walenz B, Yooseph S, Allen D, Basu A, Baxendale J, Blick L, Caminha M, Carnes-Stine J, Caulk P, Chiang Y H, Coyne M, Dahlke C, Deslattes Mays A, Dombroski M, Donnelly M, Ely D, Esparham S, Fosler C, Gire H, Glanowski S, Glasser K, Glodek A, Gorokhov M, Graham K, Gropman B, Harris M, Heil J, Henderson S, Hoover J, Jennings D, Jordan C, Jordan J, Kasha J, Kagan L, Kraft C, Levitsky A, Lewis M, Liu X, Lopez J, Ma D, Majoros W, McDaniel J, Murphy S, Newman M, Nguyen T, Nguyen N, Nodell M, Pan S, Peck J, Peterson M, Rowe W, Sanders R, Scott J, Simpson M, Smith T, Sprague A, Stockwell T, Turner R, Venter E, Wang M, Wen M, Wu D, Wu M, Xia A, Zandieh A, Zhu X

Affiliation

Celera Genomics, 45 West Gude Drive, Rockville, MD 20850, USA.

Publication information

Science. 2001 Feb 16;291(5507):1304-51. doi: 10.1126/science.1058040.

DOI:10.1126/science.1058040
PMID:11181995
Abstract

A 2.91-billion base pair (bp) consensus sequence of the euchromatic portion of the human genome was generated by the whole-genome shotgun sequencing method. The 14.8-billion bp DNA sequence was generated over 9 months from 27,271,853 high-quality sequence reads (5.11-fold coverage of the genome) from both ends of plasmid clones made from the DNA of five individuals. Two assembly strategies-a whole-genome assembly and a regional chromosome assembly-were used, each combining sequence data from Celera and the publicly funded genome effort. The public data were shredded into 550-bp segments to create a 2.9-fold coverage of those genome regions that had been sequenced, without including biases inherent in the cloning and assembly procedure used by the publicly funded group. This brought the effective coverage in the assemblies to eightfold, reducing the number and size of gaps in the final assembly over what would be obtained with 5.11-fold coverage. The two assembly strategies yielded very similar results that largely agree with independent mapping data. The assemblies effectively cover the euchromatic regions of the human chromosomes. More than 90% of the genome is in scaffold assemblies of 100,000 bp or more, and 25% of the genome is in scaffolds of 10 million bp or larger. Analysis of the genome sequence revealed 26,588 protein-encoding transcripts for which there was strong corroborating evidence and an additional approximately 12,000 computationally derived genes with mouse matches or other weak supporting evidence. Although gene-dense clusters are obvious, almost half the genes are dispersed in low G+C sequence separated by large tracts of apparently noncoding sequence. Only 1.1% of the genome is spanned by exons, whereas 24% is in introns, with 75% of the genome being intergenic DNA. Duplications of segmental blocks, ranging in size up to chromosomal lengths, are abundant throughout the genome and reveal a complex evolutionary history. 
Comparative genomic analysis indicates vertebrate expansions of genes associated with neuronal function, with tissue-specific developmental regulation, and with the hemostasis and immune systems. DNA sequence comparisons between the consensus sequence and publicly funded genome data provided locations of 2.1 million single-nucleotide polymorphisms (SNPs). A random pair of human haploid genomes differed at a rate of 1 bp per 1250 on average, but there was marked heterogeneity in the level of polymorphism across the genome. Less than 1% of all SNPs resulted in variation in proteins, but the task of determining which SNPs have functional consequences remains an open challenge.
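
The coverage figures quoted in the abstract can be cross-checked with simple arithmetic. The Python sketch below uses only numbers stated in the abstract; the constant and function names are illustrative and do not come from the paper, and treating the two coverage figures as additive is a simplifying reading of the "eightfold" statement, not the authors' documented calculation.

```python
# Figures quoted in the abstract (Venter et al., Science 2001).
TOTAL_SEQUENCED_BP = 14.8e9       # raw sequence generated by Celera
NUM_READS = 27_271_853            # high-quality sequence reads
CONSENSUS_BP = 2.91e9             # euchromatic consensus sequence length
CELERA_COVERAGE = 5.11            # stated fold coverage from Celera reads
SHREDDED_PUBLIC_COVERAGE = 2.9    # stated coverage from shredded public data
SNP_SPACING_BP = 1250             # one difference per 1250 bp on average


def mean_read_length(total_bp: float, num_reads: int) -> float:
    """Average number of bases per read."""
    return total_bp / num_reads


def fold_coverage(total_bp: float, genome_bp: float) -> float:
    """Fold coverage = total bases sequenced / genome length."""
    return total_bp / genome_bp


def expected_pairwise_differences(genome_bp: float, spacing_bp: float) -> float:
    """Expected differing positions between two haploid genomes,
    assuming (hypothetically) a uniform rate across the sequence."""
    return genome_bp / spacing_bp


if __name__ == "__main__":
    print(f"mean read length: {mean_read_length(TOTAL_SEQUENCED_BP, NUM_READS):.0f} bp")
    print(f"coverage vs. consensus: {fold_coverage(TOTAL_SEQUENCED_BP, CONSENSUS_BP):.2f}x")
    # Assumption: the two coverage figures simply add.
    print(f"combined coverage: {CELERA_COVERAGE + SHREDDED_PUBLIC_COVERAGE:.2f}x")
    diffs = expected_pairwise_differences(CONSENSUS_BP, SNP_SPACING_BP)
    print(f"expected pairwise differences: {diffs / 1e6:.2f} million")
```

Dividing total bases by read count gives a mean read length of roughly 540 bp. Dividing by the 2.91-billion-bp consensus gives about 5.09-fold, close to the stated 5.11-fold, which suggests a slightly smaller genome-size estimate was used for the quoted figure. Adding 5.11 and 2.9 gives approximately the eightfold effective coverage reported. The pairwise-difference estimate ignores the heterogeneity in polymorphism levels that the abstract itself notes.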

Similar articles

1. The sequence of the human genome.
   Science. 2001 Feb 16;291(5507):1304-51. doi: 10.1126/science.1058040.
2. Whole-genome shotgun assembly and analysis of the genome of Fugu rubripes.
   Science. 2002 Aug 23;297(5585):1301-10. doi: 10.1126/science.1072104. Epub 2002 Jul 25.
3. The human genome.
   Science. 2001 Feb 16;291(5507):1177-80. doi: 10.1126/science.291.5507.1177.
4. Single haplotype assembly of the human genome from a hydatidiform mole.
   Genome Res. 2014 Dec;24(12):2066-76. doi: 10.1101/gr.180893.114. Epub 2014 Nov 4.
5. The diploid genome sequence of an individual human.
   PLoS Biol. 2007 Sep 4;5(10):e254. doi: 10.1371/journal.pbio.0050254.
6. Finishing the euchromatic sequence of the human genome.
   Nature. 2004 Oct 21;431(7011):931-45. doi: 10.1038/nature03001.
7. Re-annotation of the physical map of Glycine max for polyploid-like regions by BAC end sequence driven whole genome shotgun read assembly.
   BMC Genomics. 2008 Jul 7;9:323. doi: 10.1186/1471-2164-9-323.
8. Structural analysis of a Lotus japonicus genome. II. Sequence features and mapping of sixty-five TAC clones which cover the 6.5-mb regions of the genome.
   DNA Res. 2002 Apr 30;9(2):63-70. doi: 10.1093/dnares/9.2.63.
9. Polymorphic segmental duplications at 8p23.1 challenge the determination of individual defensin gene repertoires and the assembly of a contiguous human reference sequence.
   BMC Genomics. 2004 Dec 10;5(1):92. doi: 10.1186/1471-2164-5-92.
10. Assessing the feasibility of GS FLX Pyrosequencing for sequencing the Atlantic salmon genome.
    BMC Genomics. 2008 Aug 28;9:404. doi: 10.1186/1471-2164-9-404.

Cited by

1. Survey of genetic testing, community involvement, and vision care in Albinism.
   J Med Access. 2025 Sep 1;9:27550834251371501. doi: 10.1177/27550834251371501. eCollection 2025 Jan-Dec.
2. Pre-training Genomic Language Model with Variants for Better Modeling Functional Genomics.
   bioRxiv. 2025 Aug 23:2025.02.26.640468. doi: 10.1101/2025.02.26.640468.
3. Pharmacogenomics of steroid-induced ocular hypertension: relationship to high-tension glaucomas and new pathophysiologic insight.
   medRxiv. 2025 Aug 13:2025.08.11.25333245. doi: 10.1101/2025.08.11.25333245.
4. Precision Oncology Guided by Genomic Profiling in Breast Cancer: Real-World Data from a Molecular Tumor Board.
   Cancers (Basel). 2025 Jul 23;17(15):2435. doi: 10.3390/cancers17152435.
5. A multi-omics analysis of human fibroblasts overexpressing an transposon reveals widespread disruptions in aging-associated pathways.
   bioRxiv. 2025 Jul 17:2025.07.11.664466. doi: 10.1101/2025.07.11.664466.
6. Late steps of allelic break-induced replication suppress tandem duplication associated with BRCA1 deficiency.
   Nucleic Acids Res. 2025 Jul 19;53(14). doi: 10.1093/nar/gkaf729.
7. Single-Molecule Enzyme Activity Analysis for Illuminating Pathological Proteoforms.
   ACS Cent Sci. 2025 Jun 17;11(7):1041-1051. doi: 10.1021/acscentsci.5c00100. eCollection 2025 Jul 23.
8. Different Species of Bats: Genomics, Transcriptome, and Immune Repertoire.
   Curr Issues Mol Biol. 2025 Apr 7;47(4):252. doi: 10.3390/cimb47040252.
9. A Notch signal required for a morphological novelty in has antecedent functions in genital disc eversion.
   Sci Adv. 2025 Jul 18;11(29):eadt7825. doi: 10.1126/sciadv.adt7825. Epub 2025 Jul 16.
10. HiC4D-SPOT: a spatiotemporal outlier detection tool for Hi-C data.
    Brief Bioinform. 2025 Jul 2;26(4). doi: 10.1093/bib/bbaf341.