Yadav Shambhavi, Meena Rajendra Kumar, Godara Shruti, Shamoon Aarzoo, Kumar Kishan, Garg Rimpee, Thakur Ajay
Division of Genetics & Tree Improvement, ICFRE-Forest Research Institute, Dehradun, Uttarakhand 248195 India.
Department of Hematology and Stem Cell Transplantation, University Hospital Essen, University Duisburg-Essen, Essen, Germany.
3 Biotech. 2025 Feb;15(2):43. doi: 10.1007/s13205-025-04212-w. Epub 2025 Jan 16.
Natural populations of [species name] have not been genetically characterized because neither a genome sequence nor robust species-specific molecular markers have been available. The present study developed and validated genome-wide de novo simple sequence repeat (SSR) markers in [species name] through shallow-pass genome sequencing. About 13 Gb of genome sequence data was generated using Illumina technology, and the high-quality reads were assembled de novo into 1,390,995 contigs with a GC content of 42.34% and a contig N50 of 1,047 bp. Benchmarking Universal Single-Copy Orthologs (BUSCO) analysis indicated that 75.29% of the orthologs were complete and single-copy in the assembly. Scanning the assembly identified 73,468 SSRs, for which 44,383 primer pairs were designed. Dinucleotide and trinucleotide repeats were the most abundant, accounting for 52.95% and 41.17% of repeats, respectively. A subset of 33 SSRs was randomly selected and tested for PCR amplification and polymorphism in 16 randomly chosen individuals. Of these, 29 amplified successfully with the expected product size and 20 showed polymorphic banding patterns. The polymorphic SSRs had high expected heterozygosity (He = 0.72) and polymorphism information content (PIC = 0.68). In a neighbor-joining (NJ) dendrogram, the genotypes clustered according to their geographical locations. The genomic and marker information generated in this study is novel and will be useful for future studies on the genetic improvement and conservation of [species name].
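The summary statistics quoted in the abstract (contig N50, expected heterozygosity, PIC) follow standard textbook definitions, which can be sketched as below. This is a minimal illustration, not the authors' pipeline: the function names are hypothetical, the heterozygosity is the simple uncorrected form (1 minus the sum of squared allele frequencies), and PIC uses the Botstein et al. (1980) formula. The software used in the study may apply sample-size corrections that are not shown here.

```python
from typing import Sequence


def n50(contig_lengths: Sequence[int]) -> int:
    """Length L such that contigs of length >= L hold at least half of all bases."""
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("contig_lengths must contain at least one positive length")


def expected_heterozygosity(allele_freqs: Sequence[float]) -> float:
    """Uncorrected expected heterozygosity at one locus: 1 - sum(p_i^2)."""
    return 1.0 - sum(p * p for p in allele_freqs)


def polymorphism_information_content(allele_freqs: Sequence[float]) -> float:
    """PIC at one locus: 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2."""
    squares = [p * p for p in allele_freqs]
    cross = 0.0
    for i in range(len(squares) - 1):
        for j in range(i + 1, len(squares)):
            cross += 2.0 * squares[i] * squares[j]
    return 1.0 - sum(squares) - cross


if __name__ == "__main__":
    # Hypothetical inputs for illustration only; not data from the study.
    print(n50([2, 3, 4, 5, 6]))
    freqs = [0.25, 0.25, 0.25, 0.25]
    print(expected_heterozygosity(freqs))
    print(polymorphism_information_content(freqs))
```

For intuition, a locus with four equally frequent alleles gives an expected heterozygosity of 0.75 and a PIC of about 0.70 under these formulas, the same order as the mean values reported for the polymorphic SSRs.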