SSR_VibraProfiler: a Python package for accurate classification of varieties using SSRs with intra-variety specificity and inter-variety polymorphism.

Authors

Jiang Chenhao, Dong Chuan, Wu Zhenzhen, Shi Chenyi, Ye Qiannan, Wu Xiaopei, Ma Siyi, Wen Yuming, Yu Guoping, Wu Jiasheng, Zhang Chengjun

Affiliations

National Key Laboratory for Development and Utilization of Forest Food Resources, Zhejiang A & F University, Hangzhou, Zhejiang, 311300, China.

Germplasm Bank of Wild Species & Yunnan Key Laboratory of Crop Wild Relatives Omics, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming, Yunnan, 650201, China.

Publication information

Plant Methods. 2025 May 16;21(1):61. doi: 10.1186/s13007-025-01380-x.

DOI: 10.1186/s13007-025-01380-x
PMID: 40380148
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12082954/

Abstract

BACKGROUND

Simple sequence repeats (SSRs) are widely used as molecular markers; however, traditional development of SSR molecular markers relies heavily on experimental methods. Advances in modern sequencing technology make it possible to extract SSR characteristics directly from sequencing data and use them for variety identification.

RESULTS

We have developed a computational framework for variety identification that treats the presence or absence of each SSR in sequencing data as a numerical characteristic, ignoring specific loci, flanking sequences, and occurrence counts. Subsequent variety identification therefore does not rely on experimental validation but is performed directly on the numerical characteristic matrix. Using a formula, we measure the variance of these numerical characteristics both within and among varieties, and select SSRs that exhibit intra-variety specificity and inter-variety polymorphism, forming a binary (0/1) matrix. We use t-SNE (t-distributed Stochastic Neighbor Embedding) to project the matrix onto a two-dimensional plane, followed by K-means clustering of the individuals. Comparing the cluster labels with the true labels gives a preliminary assessment of how well the matrix separates varieties. Finally, we construct a recognition model based on the SSR matrix and apply it to variety identification. The process has been encapsulated in the package SSR_VibraProfiler, which can serve as a tool for constructing an SSR variety DNA fingerprint database. We tested this package on a Rhododendron dataset of 40 individuals from 8 varieties. The accuracy achieved through t-SNE dimensionality reduction and K-means clustering was 100%. Furthermore, we used the leave-one-out method to validate the accuracy of our method in predicting variety, which confirmed its reliability in detecting varieties. The package is freely available at https://github.com/Olcat35412/SSR_VibraProfiler.
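
The selection and validation steps described above can be illustrated with a minimal, standard-library-only sketch. It is not the SSR_VibraProfiler code or API, and several parts are assumptions made for illustration: the abstract does not give the selection formula, so the sketch substitutes a simple stand-in (mean within-variety Bernoulli variance versus the variance of per-variety presence frequencies); the recognition model is replaced by a nearest-neighbour rule on Hamming distance; SSRs are identified by hypothetical labels such as `(AT)8`; and all function names, the `max_within` parameter, and the toy data are invented. The t-SNE and K-means step is omitted.

```python
from collections import defaultdict
from statistics import pvariance


def presence_matrix(ssr_sets):
    """Return (sorted SSR list, {individual: 0/1 row}) from per-individual SSR sets."""
    ssrs = sorted(set().union(*ssr_sets.values()))
    rows = {ind: [1 if s in found else 0 for s in ssrs]
            for ind, found in ssr_sets.items()}
    return ssrs, rows


def select_ssrs(ssrs, rows, variety_of, max_within=0.0):
    """Keep column indices of SSRs that agree within varieties but differ among them.

    Stand-in criterion (an assumption, not the published formula).
    """
    groups = defaultdict(list)
    for ind, row in rows.items():
        groups[variety_of[ind]].append(row)
    keep = []
    for j in range(len(ssrs)):
        # Presence frequency of SSR j in each variety.
        freqs = [sum(r[j] for r in g) / len(g) for g in groups.values()]
        within = sum(p * (1 - p) for p in freqs) / len(freqs)  # mean Bernoulli variance
        between = pvariance(freqs)  # spread of frequencies among varieties
        if within <= max_within and between > 0:
            keep.append(j)
    return keep


def hamming(a, b):
    """Number of positions at which two 0/1 vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def leave_one_out_accuracy(rows, variety_of, keep):
    """Predict each individual's variety from its nearest neighbour among the rest."""
    # For brevity the SSRs are selected once on all individuals; a stricter
    # protocol would repeat the selection inside every fold.
    reduced = {ind: [row[j] for j in keep] for ind, row in rows.items()}
    correct = 0
    for ind, vec in reduced.items():
        nearest = min((o for o in reduced if o != ind),
                      key=lambda o: hamming(vec, reduced[o]))
        correct += variety_of[nearest] == variety_of[ind]
    return correct / len(reduced)


# Toy input: SSRs detected per individual (hypothetical labels and varieties).
ssr_sets = {
    "A1": {"(AT)8", "(GA)6", "(CTT)5"},
    "A2": {"(AT)8", "(GA)6"},
    "B1": {"(GA)6", "(AAG)7", "(CTT)5"},
    "B2": {"(GA)6", "(AAG)7"},
}
variety_of = {"A1": "A", "A2": "A", "B1": "B", "B2": "B"}

ssrs, rows = presence_matrix(ssr_sets)
keep = select_ssrs(ssrs, rows, variety_of)
selected = [ssrs[j] for j in keep]
accuracy = leave_one_out_accuracy(rows, variety_of, keep)
print(selected, accuracy)
```

In this toy data `(GA)6` is present in every individual and `(CTT)5` varies inside both varieties, so neither is selected; only SSRs that are uniform within a variety and differ between varieties remain in the reduced matrix.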

CONCLUSION

We introduced SSR_VibraProfiler, a Python package for distinguishing and predicting individual varieties without a reference genome by extracting SSR numerical characteristics from next-generation sequencing data. This tool will contribute to the development, identification, and protection of new varieties.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d3d4/12082954/a26cea7c5586/13007_2025_1380_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d3d4/12082954/71dd92321f13/13007_2025_1380_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d3d4/12082954/8a4b55aa12ce/13007_2025_1380_Fig2_HTML.jpg