Institute for Advanced Study, Chengdu University, Chengdu, China.
Key Laboratory of Bio-resources and Eco-environment, Ministry of Education, College of Life Science, Sichuan University, Chengdu, China.
Mol Ecol Resour. 2020 Jan;20(1):283-291. doi: 10.1111/1755-0998.13098. Epub 2019 Oct 28.
Microsatellites are distributed throughout nearly all genomes and, because of their high polymorphism, have been extensively exploited as powerful genetic markers for diverse applications. Their length variations are involved in gene regulation and are implicated in numerous genetic diseases, including cancers. Although much effort has been devoted to constructing microsatellite databases, existing databases still have drawbacks, such as a limited number of species, unfriendly export formats, no support for marker development, a lack of compound microsatellites and an absence of gene annotation, all of which seriously restrict downstream analysis. To overcome these limitations, we developed PSMD (Pan-Species Microsatellite Database, http://big.cdu.edu.cn/psmd/), a web-based database that helps researchers identify microsatellites, develop reliable molecular markers and compare microsatellite distribution patterns on a genome-wide scale. In the current release, PSMD comprises 678,106,741 perfect microsatellites and 43,848,943 compound microsatellites from 18,408 organisms, covering almost all species with available genomic data. In addition to an interactive browsing interface, PSMD offers a flexible filter function that lets users quickly retrieve desired microsatellites from large data sets. Users can export GFF3-formatted files and CSV-formatted statistical files for downstream analysis. We also implemented an online tool for analysing the occurrence of microsatellites with user-defined parameters. Furthermore, Primer3 is embedded to help users design high-quality primers with customizable settings. To our knowledge, PSMD is the most extensive microsatellite resource, and it is likely to be adopted by scientists engaged in biological, medical, environmental and agricultural research.
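To make "perfect microsatellites" and "user-defined parameters" concrete, the following is a minimal Python sketch of how uninterrupted tandem repeats might be located in a DNA sequence. It is not PSMD's implementation: the regular-expression approach, the function names `find_perfect_ssrs` and `is_reducible`, and the minimum repeat thresholds in `MIN_REPEATS` are all assumptions chosen for illustration.

```python
import re

# Hypothetical minimum repeat counts for mono- to hexanucleotide motifs.
# These thresholds are illustrative assumptions, not PSMD's defaults.
MIN_REPEATS = {1: 12, 2: 7, 3: 5, 4: 4, 5: 4, 6: 4}


def is_reducible(motif):
    """Return True if the motif is itself a repeat of a shorter unit (e.g. 'ATAT' = 'AT' x 2)."""
    n = len(motif)
    return any(n % k == 0 and motif == motif[:k] * (n // k) for k in range(1, n))


def find_perfect_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (start, end, motif, repeat_count) tuples with 1-based inclusive coordinates."""
    seq = seq.upper()
    hits = []
    for k, min_rep in sorted(min_repeats.items()):
        # A k-base motif followed by at least (min_rep - 1) exact copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if is_reducible(motif):
                continue  # already covered by a shorter motif length
            hits.append((m.start() + 1, m.end(), motif, len(m.group(0)) // k))
    return sorted(hits)


if __name__ == "__main__":
    demo = "GGC" + "AT" * 8 + "CCG" + "A" * 12 + "T"
    for start, end, motif, count in find_perfect_ssrs(demo):
        print(start, end, motif, count)
```

Changing the values in `MIN_REPEATS` plays the role of the user-defined parameters mentioned above. Compound microsatellites (adjacent repeats separated by a short gap) would need an additional merging step that this sketch does not attempt.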