Department of Veterinary Science, College of Agriculture, Food, and Environment, University of Kentucky, Lexington, KY (T.S.K., N.A.H., K.L.).
Texas A&M Institute for Genome Sciences and Society, Texas A&M University, College Station, TX (W.A.B., K.J.K., A.E.H.).
Hypertension. 2023 Jan;80(1):138-146. doi: 10.1161/HYPERTENSIONAHA.122.20140. Epub 2022 Nov 4.
We report the creation and evaluation of a de novo assembly of the genome of the spontaneously hypertensive rat, the most widely used model of human cardiovascular disease.
The genome was assembled from long-read sequencing data (PacBio HiFi and continuous long reads [CLR]) and scaffolded with long-range structural information obtained from Bionano optical maps and proximity ligation sequencing. The assembly was polished with Illumina short reads. Completeness of the assembly was assessed using Benchmarking Universal Single-Copy Orthologs (BUSCO) analysis. The assembly was also evaluated against the rat reference gene set using NCBI automated protocols. We also generated orthogonal single-molecule transcript sequence reads (Iso-Seq) from 8 tissues and used them to validate the coding portion of the assembly, to annotate the assembly with RNA transcripts representing unique full-length transcript isoforms for each gene, and to determine whether divergences between RefSeq sequences and the assembly were attributable to assembly errors or polymorphisms.
The analysis indicates that this assembly is comparable in contiguity and completeness to the current rat reference assembly, while the use of HiFi sequencing yields an assembly that is more accurate at the single-base level. Synteny analysis was performed to determine the extent of synteny and the presence and distribution of chromosomal rearrangements between the reference and this assembly.
The resulting genome assembly is reference quality and captures significant structural variation.
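As an illustration of the contiguity comparison mentioned above, the sketch below computes N50, a statistic commonly used to summarize assembly contiguity. The abstract does not state which contiguity metric was used, so the choice of N50, the function name `n50`, and the toy lengths are all assumptions for illustration rather than the authors' method or data.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths.

    N50 is the length L such that sequences of length >= L together
    cover at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("lengths must be non-empty")
    ordered = sorted(lengths, reverse=True)  # longest first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical scaffold lengths (arbitrary units), not taken from the paper.
toy_lengths = [80, 70, 50, 40, 30, 20]
print(n50(toy_lengths))  # prints 70
```

Under this definition, two assemblies of similar total size can be compared directly: a higher N50 means that fewer, longer sequences account for half of the assembled bases.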