Machine Learning as an Effective Method for Identifying True Single Nucleotide Polymorphisms in Polyploid Plants.

Publication information

Plant Genome. 2019 Mar;12(1). doi: 10.3835/plantgenome2018.05.0023.

DOI:10.3835/plantgenome2018.05.0023
PMID:30951095
Abstract

Single nucleotide polymorphisms (SNPs) have many advantages as molecular markers since they are ubiquitous and codominant. However, the discovery of true SNPs in polyploid species is difficult. Peanut (Arachis hypogaea L.) is an allopolyploid, which has a very low rate of true SNP calling. A large set of true and false SNPs identified from the Axiom_Arachis 58k array was leveraged to train machine-learning models to enable identification of true SNPs directly from sequence data to reduce ascertainment bias. These models achieved accuracy rates above 80% using real peanut RNA sequencing (RNA-seq) and whole-genome shotgun (WGS) resequencing data, which is higher than previously reported for polyploids and at least a twofold improvement for peanut. A 48K SNP array, Axiom_Arachis2, was designed using this approach resulting in 75% accuracy of calling SNPs from different tetraploid peanut genotypes. Using the method to simulate SNP variation in several polyploids, models achieved >98% accuracy in selecting true SNPs. Additionally, models built with simulated genotypes were able to select true SNPs at >80% accuracy using real peanut data. This work accomplished the objective to create an effective approach for calling highly reliable SNPs from polyploids using machine learning. A novel tool was developed for predicting true SNPs from sequence data, designated as SNP machine learning (SNP-ML), using the described models. The SNP-ML additionally provides functionality to train new models not included in this study for customized use, designated SNP machine learner (SNP-MLer). The SNP-ML is publicly available.
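
The general idea in the abstract, training a classifier on candidate SNPs already labeled true or false and then applying it to new candidates, can be illustrated with a small sketch. This is not SNP-ML and does not reproduce its models or features. The abstract does not say which learning algorithms were used, so plain logistic regression is swapped in here as a simple stand-in. The two features (`spread`, the variation in alternate-allele fraction across genotypes, and `depth`, a normalized read depth) and the synthetic data generator are assumptions made up for illustration. The toy generator assumes that false calls caused by differences between subgenomes look nearly identical in every genotype, while true SNPs vary among genotypes.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic(X, y, lr=0.5, epochs=500):
    # Full-batch gradient descent on the logistic (cross-entropy) loss.
    n, n_features = len(X), len(X[0])
    w, b = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(n_features):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(w, b, x):
    # 1 = predicted true SNP, 0 = predicted false SNP.
    return 1 if sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) >= 0.5 else 0


def make_synthetic(n, rng):
    # Hypothetical toy data: true SNPs get a larger across-genotype spread
    # of alternate-allele fraction than false ones. Not real peanut data.
    X, y = [], []
    for _ in range(n):
        is_true = rng.random() < 0.5
        spread = max(0.0, rng.gauss(0.35 if is_true else 0.05, 0.05))
        depth = rng.gauss(1.0, 0.2)  # uninformative feature in this toy setup
        X.append([spread, depth])
        y.append(1 if is_true else 0)
    return X, y


rng = random.Random(0)  # fixed seed so the run is repeatable
X_train, y_train = make_synthetic(400, rng)
X_test, y_test = make_synthetic(200, rng)

w, b = train_logistic(X_train, y_train)
accuracy = sum(predict(w, b, x) == t for x, t in zip(X_test, y_test)) / len(y_test)
print(f"held-out accuracy on synthetic data: {accuracy:.2f}")
```

The accuracy printed here reflects only how cleanly the synthetic classes were separated and says nothing about the figures reported in the abstract. With real data, the labels would come from array-validated SNPs and the features from the sequence alignments.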

Similar articles

1. Machine Learning as an Effective Method for Identifying True Single Nucleotide Polymorphisms in Polyploid Plants.
Plant Genome. 2019 Mar;12(1). doi: 10.3835/plantgenome2018.05.0023.
2. Single Nucleotide Polymorphism Identification in Polyploids: A Review, Example, and Recommendations.
Mol Plant. 2015 Jun;8(6):831-46. doi: 10.1016/j.molp.2015.02.002. Epub 2015 Feb 10.
3. Haplotype-Based Genotyping in Polyploids.
Front Plant Sci. 2018 Apr 26;9:564. doi: 10.3389/fpls.2018.00564. eCollection 2018.
4. Molecular marker development from transcript sequences and germplasm evaluation for cultivated peanut (Arachis hypogaea L.).
Mol Genet Genomics. 2016 Feb;291(1):363-81. doi: 10.1007/s00438-015-1115-6. Epub 2015 Sep 11.
5. Next-generation transcriptome sequencing, SNP discovery and validation in four market classes of peanut, Arachis hypogaea L.
Mol Genet Genomics. 2015 Jun;290(3):1169-80. doi: 10.1007/s00438-014-0976-4. Epub 2015 Feb 7.
6. Target Amplicon Sequencing for Genotyping Genome-Wide Single Nucleotide Polymorphisms Identified by Whole-Genome Resequencing in Peanut.
Plant Genome. 2016 Nov;9(3). doi: 10.3835/plantgenome2016.06.0052.
7. Target enrichment sequencing in cultivated peanut (Arachis hypogaea L.) using probes designed from transcript sequences.
Mol Genet Genomics. 2017 Oct;292(5):955-965. doi: 10.1007/s00438-017-1327-z. Epub 2017 May 10.
8. Comparison of SNP Calling Pipelines and NGS Platforms to Predict the Genomic Regions Harboring Candidate Genes for Nodulation in Cultivated Peanut.
Front Genet. 2020 Mar 24;11:222. doi: 10.3389/fgene.2020.00222. eCollection 2020.
9. SWEEP: A Tool for Filtering High-Quality SNPs in Polyploid Crops.
G3 (Bethesda). 2015 Jul 6;5(9):1797-803. doi: 10.1534/g3.115.019703.
10. Development and Evaluation of a High Density Genotyping 'Axiom_Arachis' Array with 58 K SNPs for Accelerating Genetics and Breeding in Groundnut.
Sci Rep. 2017 Jan 16;7:40577. doi: 10.1038/srep40577.

Cited by

1. Mapping QTLs for early leaf spot resistance and yield component traits using an interspecific AB-QTL population in peanut.
Front Plant Sci. 2025 Jan 16;15:1488166. doi: 10.3389/fpls.2024.1488166. eCollection 2024.
2. Understanding the impacts of drought on peanuts (Arachis hypogaea L.): exploring physio-genetic mechanisms to develop drought-resilient peanut cultivars.
Front Genet. 2025 Jan 8;15:1492434. doi: 10.3389/fgene.2024.1492434. eCollection 2024.
3. AutoXAI4Omics: an automated explainable AI tool for omics and tabular data.
Brief Bioinform. 2024 Nov 22;26(1). doi: 10.1093/bib/bbae593.
4. Genetic characterization and mapping of the shell-strength trait in peanut.
BMC Plant Biol. 2024 Nov 5;24(1):1047. doi: 10.1186/s12870-024-05727-9.
5. Using cross-country datasets for association mapping in Arachis hypogaea L.
Plant Genome. 2024 Dec;17(4):e20515. doi: 10.1002/tpg2.20515. Epub 2024 Oct 15.
6. Harnessing the power of machine learning for crop improvement and sustainable production.
Front Plant Sci. 2024 Aug 12;15:1417912. doi: 10.3389/fpls.2024.1417912. eCollection 2024.
7. The groundnut improvement network for Africa (GINA) germplasm collection: a unique genetic resource for breeding and gene discovery.
G3 (Bethesda). 2023 Dec 29;14(1). doi: 10.1093/g3journal/jkad244.
8. Demographic history inference and the polyploid continuum.
Genetics. 2023 Aug 9;224(4). doi: 10.1093/genetics/iyad107.
9. Marker-assisted introgression of wild chromosome segments conferring resistance to fungal foliar diseases into peanut (Arachis hypogaea L.).
Front Plant Sci. 2023 Mar 17;14:1139361. doi: 10.3389/fpls.2023.1139361. eCollection 2023.
10. Genome-wide association studies reveal novel loci for resistance to groundnut rosette disease in the African core groundnut collection.
Theor Appl Genet. 2023 Mar 10;136(3):35. doi: 10.1007/s00122-023-04259-4.