Liaocheng Research Institute of Donkey High-Efficiency Breeding and Ecological Feeding, Liaocheng University, Liaocheng, China.
City University of Hong Kong, 83 Tat Chee Ave, Kowloon Tong, Hong Kong, China.
BMC Bioinformatics. 2019 Dec 20;20(Suppl 24):671. doi: 10.1186/s12859-019-3246-y.
Short tandem repeats (STRs) serve as genetic markers in forensic casework because they are highly polymorphic in eukaryotic genomes. A variety of STR profiling systems have been developed for species including human, dog, cat, and cattle. Maintaining these systems simultaneously can be costly. These mammals share many highly similar regions across their genomes. With the large amount of whole-genome data now available for these species, it is possible to develop a unified STR profiling system. In this study, our objective is to propose and develop a unified set of STR loci that can be applied simultaneously to multiple species.
To find a unified STR set, we collected whole-genome sequence data for the species of interest and mapped them to the human reference genome. We then extracted the STR loci shared across the species. From these loci, an algorithm we propose selects a subset by optimizing the combined power of discrimination. Our results show that the unified set of loci has a high combined power of discrimination, >1-10, for both the individual species and the mixed population, and a low random-match probability, <10, for all the species involved, indicating that the identified set of STR loci can be applied to multiple species.
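The abstract does not state how the two forensic parameters are computed. As a point of reference only, the conventional textbook definitions are sketched below, assuming independent loci; the symbols $p_{l,g}$ (population frequency of genotype $g$ at locus $l$), $g_l$ (the genotype observed at locus $l$ in a given profile), and $L$ (the number of selected loci) are notation introduced here, not taken from the paper.

```latex
% Power of discrimination at a single locus l
PD_l = 1 - \sum_{g} p_{l,g}^{2}

% Combined power of discrimination over L independent loci
CPD = 1 - \prod_{l=1}^{L} \left( 1 - PD_l \right)

% Random-match probability of one observed profile (g_1, ..., g_L)
RMP = \prod_{l=1}^{L} p_{l,g_l}
```

Under these definitions, adding a locus can only increase CPD and decrease RMP, which is what makes an incremental selection of loci natural.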
We identified a set of STR loci that are shared by multiple species. This implies that a unified STR profiling system is feasible for these species in forensic settings. The system can be applied to individual identification or paternity testing in each of ten common species: Sus scrofa (pig), Bos taurus (cattle), Capra hircus (goat), Equus caballus (horse), Canis lupus familiaris (dog), Felis catus (cat), Ovis aries (sheep), Oryctolagus cuniculus (rabbit), Bos grunniens (yak), and Homo sapiens (human). Our loci selection algorithm employs a greedy approach. It can generate loci sets under different forensic parameters and for a specific combination of species.
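To illustrate what a greedy loci selection of this kind might look like, here is a minimal Python sketch. It is not the authors' implementation: the selection criterion (at each step, add the locus that most improves the worst-off species), the function name `greedy_select`, the parameter `max_match_prob`, and the per-locus values in the usage example are all assumptions made for illustration. The sketch tracks the residual 1 - CPD for each species rather than CPD itself, which avoids losing floating-point precision when CPD is very close to 1.

```python
def greedy_select(locus_pd, max_match_prob, max_loci=None):
    """Greedily pick loci until every species is discriminated well enough.

    locus_pd: dict mapping locus name -> {species: power of discrimination}.
              A species missing from a locus is treated as PD = 0 there.
    max_match_prob: stop once (1 - CPD) <= max_match_prob for every species.
    max_loci: optional cap on the number of selected loci.
    Returns (chosen loci in selection order, per-species residual 1 - CPD).
    """
    species = sorted({s for pds in locus_pd.values() for s in pds})
    # residual[s] is the running product of (1 - PD) over chosen loci,
    # i.e. 1 - CPD for species s, assuming independent loci.
    residual = {s: 1.0 for s in species}
    remaining = set(locus_pd)
    chosen = []
    limit = len(locus_pd) if max_loci is None else max_loci

    while (remaining and len(chosen) < limit
           and max(residual.values()) > max_match_prob):

        def worst_after(locus):
            # Largest residual over all species if this locus were added.
            return max(residual[s] * (1.0 - locus_pd[locus].get(s, 0.0))
                       for s in species)

        # Sorting makes tie-breaking deterministic (alphabetical).
        best = min(sorted(remaining), key=worst_after)
        chosen.append(best)
        remaining.remove(best)
        for s in species:
            residual[s] *= 1.0 - locus_pd[best].get(s, 0.0)

    return chosen, residual


if __name__ == "__main__":
    # Made-up PD values for three hypothetical loci and two species.
    example = {
        "locusA": {"human": 0.9, "dog": 0.5},
        "locusB": {"human": 0.6, "dog": 0.9},
        "locusC": {"human": 0.8, "dog": 0.8},
    }
    loci, residual = greedy_select(example, max_match_prob=0.05)
    print(loci)
    print({s: 1.0 - r for s, r in residual.items()})  # per-species CPD
```

Restricting `locus_pd` to a subset of species, or changing `max_match_prob`, corresponds to generating loci sets for a specific combination of species or under different forensic parameters. A greedy pass like this does not guarantee the smallest possible set.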