Department of Biological Sciences, Auburn University, Auburn, AL, United States of America.
Department of Biological Sciences, University of Alabama - Tuscaloosa, Tuscaloosa, AL, United States of America.
PeerJ. 2024 Sep 13;12:e18100. doi: 10.7717/peerj.18100. eCollection 2024.
Genetically modified organisms are commonly used in disease research and agriculture, but the precise genomic alterations underlying transgenic mutations are often unknown. The position and characteristics of transgenes, including the number of independent insertions, influence the expression of both transgenic and wild-type sequences. We used long-read Oxford Nanopore Technologies (ONT) sequencing to sequence and assemble the genomes of two transgenic strains of Caenorhabditis elegans commonly used in research on neurodegenerative diseases: BY250 (pPdat-1::GFP) and UA44 (GFP and human α-synuclein), a model for Parkinson's research. After scaffolding to the reference, the final assembled sequences were ∼102 Mb with N50s of 17.9 Mb and 18.0 Mb, respectively, and L90s of six contiguous sequences, representing chromosome-level assemblies. Each of the assembled sequences contained more than 99.2% of the Nematoda BUSCO genes found in the reference and 99.5% of the annotated reference protein-coding genes. We identified the locations of the transgene insertions and confirmed that all transgene sequences were inserted in intergenic regions, leaving the organismal gene content intact. The transgenic genomes presented here will be a valuable resource for research on Parkinson's and other neurodegenerative diseases. Our work demonstrates that long-read sequencing is a fast, cost-effective way to assemble genome sequences and characterize mutant lines and strains.
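The N50 and L90 statistics quoted above follow the usual assembly-contiguity definitions, which a short sketch can make concrete. The function name `nx_lx` and the toy sequence lengths below are hypothetical illustrations; they are not the tool or the data used in the study.

```python
from typing import Iterable, Tuple


def nx_lx(lengths: Iterable[int], fraction: float) -> Tuple[int, int]:
    """Return (Nx, Lx) for a collection of sequence lengths.

    Lx is the smallest number of longest sequences whose combined length
    reaches `fraction` of the total assembly length; Nx is the length of
    the shortest sequence in that set.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("no sequence lengths given")
    threshold = fraction * sum(ordered)
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= threshold:
            return length, count
    # Reached only if floating-point rounding pushes the threshold
    # slightly above the total length.
    return ordered[-1], len(ordered)


# Hypothetical lengths in Mb for a six-sequence toy assembly (made-up values).
toy_lengths = [15, 15, 14, 17, 21, 18]
n50, l50 = nx_lx(toy_lengths, 0.5)
n90, l90 = nx_lx(toy_lengths, 0.9)
print(f"N50 = {n50} Mb (L50 = {l50}); N90 = {n90} Mb (L90 = {l90})")
```

In this toy case all six sequences are needed to cover 90% of the total length, so L90 is six. An L90 of six is consistent with a chromosome-level assembly for an organism with six chromosomes, as C. elegans has.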