Zhu Ziyan, Liu Yuping, Zhang Shufei, Wang Sige, Yang Tianyan
Zhejiang Ocean University, Zhoushan, China.
Guangdong Provincial Key Laboratory of Fishery Ecology and Environment, South China Sea Fisheries Research Institute, Guangzhou, China.
Biodivers Data J. 2023 Apr 7;11:e100068. doi: 10.3897/BDJ.11.e100068. eCollection 2023.
Microsatellite loci were screened from the genomic data of [species name], and their composition and distribution were analysed by bioinformatics for the first time. High-throughput sequencing yielded 4,060,742 scaffolds with a total length of 1,562 Mb, and MISA screening identified 1,160,104 microsatellite loci distributed across 770,294 scaffolds. The occurrence frequency and relative abundance were 28.57% and 743/Mb, respectively. Amongst the six complete microsatellite types, dinucleotide repeats accounted for the largest proportion (592,234, 51.05%) and had the highest occurrence frequency (14.58%) and the largest relative abundance (379.27/Mb). A total of 1,488 distinct microsatellite repeat motifs were detected in the genome of [species name], amongst which hexanucleotide motifs were the most numerous (608), followed by pentanucleotide (574), tetranucleotide (232), trinucleotide (59), dinucleotide (11) and mononucleotide (4) motifs. Within each repeat type, the abundance of microsatellites decreased as the copy number increased. The predominant motifs of the six repeat types were A (191,390, 43.77%), CA (150,240, 25.37%), AAT (13,168, 14.05%), CACG (2,649, 8.14%), TAATG (119, 19.16%) and CCCTAA (190, 19.16%, 7.65%), respectively. These data on the number, distribution and abundance of the different microsatellite types in the genome of [species name] lay a foundation for the future development of high-quality microsatellite markers for this species.
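The summary statistics quoted in the abstract can be reproduced from its raw counts. The sketch below is illustrative only: the abstract does not define its metrics, so the formulas (occurrence frequency as loci per scaffold, relative abundance as loci per Mb of assembly) are assumptions inferred from the reported numbers, and the function names are hypothetical.

```python
# Counts quoted in the abstract.
TOTAL_LOCI = 1_160_104        # microsatellite loci found by MISA
TOTAL_SCAFFOLDS = 4_060_742   # scaffolds in the assembly
ASSEMBLY_MB = 1_562           # total assembly length in Mb (rounded in the abstract)
DINUCLEOTIDE_LOCI = 592_234   # loci with a dinucleotide repeat unit

# Number of distinct repeat motifs per repeat-unit length, as reported.
MOTIF_TYPES = {"mono": 4, "di": 11, "tri": 59, "tetra": 232, "penta": 574, "hexa": 608}


def occurrence_frequency(loci: int, scaffolds: int) -> float:
    """Assumed definition: loci as a percentage of the scaffold count."""
    return 100.0 * loci / scaffolds


def relative_abundance(loci: int, length_mb: float) -> float:
    """Assumed definition: loci per megabase of assembled sequence."""
    return loci / length_mb


def proportion(part: int, whole: int) -> float:
    """Share of one repeat type among all loci, in percent."""
    return 100.0 * part / whole


if __name__ == "__main__":
    print(round(occurrence_frequency(TOTAL_LOCI, TOTAL_SCAFFOLDS), 2))         # 28.57
    print(round(relative_abundance(TOTAL_LOCI, ASSEMBLY_MB)))                  # 743
    print(round(proportion(DINUCLEOTIDE_LOCI, TOTAL_LOCI), 2))                 # 51.05
    print(round(occurrence_frequency(DINUCLEOTIDE_LOCI, TOTAL_SCAFFOLDS), 2))  # 14.58
    print(sum(MOTIF_TYPES.values()))                                           # 1488
```

Under these assumed definitions the dinucleotide relative abundance comes out slightly below the reported 379.27/Mb, presumably because the abstract rounds the assembly length to 1,562 Mb.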