
Detecting SNP markers discriminating horse breeds by deep learning.

Affiliation

Department of Animal Science, Faculty of Agriculture and Natural Resources, Arak University, Arak, Iran.

Publication information

Sci Rep. 2023 Jul 18;13(1):11592. doi: 10.1038/s41598-023-38601-z.

DOI:10.1038/s41598-023-38601-z
PMID:37464049
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10354035/
Abstract

Assigning an individual to its true population of origin with a small panel of discriminant SNP markers is one of the most important practical applications of genomic data. This study evaluated the potential of several Artificial Neural Network (ANN) approaches, namely Deep Neural Networks (DNN) and the Garson and Olden methods, for selecting informative SNP markers from high-throughput genotyping data that can trace the true breed of unknown samples. A total of 795 animals from 37 breeds, genotyped with the Illumina SNP 50k BeadChip, were used, and principal component analysis (PCA), log-likelihood ratios (LLR), and Neighbor-Joining (NJ) were applied to assess the performance of the assignment methods. The DNN, Garson, and Olden methods were able to assign individuals to their true populations with 4270, 4937, and 7999 SNP markers, respectively. PCA was used to examine how the animals were allocated to groups, both with all genotyped markers on the 50k BeadChip and with the marker subsets identified by each method; all SNP panels were able to assign individuals to their true breeds. When assignment success was assessed at different LLR levels, a success rate of 70% was reached with 110, 208, and 178 markers for the DNN, Garson, and Olden methods, respectively. DNN also outperformed the other two approaches, achieving 93% accuracy at the most stringent threshold. Finally, the identified SNPs were applied to an independent out-group of 120 individuals from eight breeds, and these markers correctly allocated all unknown samples to their true population of origin. Furthermore, the NJ tree of allele-sharing distances on the validation dataset showed that DNN has high potential for feature selection. Overall, the results indicate that the DNN technique is an efficient strategy for selecting a reduced pool of highly discriminant markers for assigning individuals to their true population of origin.
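
The Garson and Olden methods named in the abstract rank input features by the connection weights of a trained neural network. The sketch below illustrates both scores for a network with one hidden layer and a single output, which is a simplification: the study classifies 37 breeds, and the abstract does not describe the network architecture or software used. The function names, the weight layout (`w_in` as inputs by hidden units, `w_out` as one weight per hidden unit), and the `top_k_markers` helper are assumptions made for illustration only.

```python
import numpy as np


def garson_importance(w_in, w_out):
    """Garson relative importance (unsigned, sums to 1).

    w_in:  array (n_inputs, n_hidden), input-to-hidden weights.
    w_out: array (n_hidden,), hidden-to-output weights (single output assumed).
    Assumes every hidden unit has at least one non-zero contribution.
    """
    w_in = np.asarray(w_in, dtype=float)
    w_out = np.asarray(w_out, dtype=float)
    # Absolute contribution of each input through each hidden unit.
    contrib = np.abs(w_in) * np.abs(w_out)
    # Share of each input within a hidden unit (columns sum to 1).
    share = contrib / contrib.sum(axis=0, keepdims=True)
    # Sum the shares over hidden units, then normalise over inputs.
    per_input = share.sum(axis=1)
    return per_input / per_input.sum()


def olden_importance(w_in, w_out):
    """Olden connection-weight importance (signed, not normalised)."""
    w_in = np.asarray(w_in, dtype=float)
    w_out = np.asarray(w_out, dtype=float)
    # Sum over hidden units of (input-hidden weight * hidden-output weight).
    return w_in @ w_out


def top_k_markers(scores, k):
    """Indices of the k markers with the largest absolute score (hypothetical helper)."""
    order = np.argsort(-np.abs(np.asarray(scores, dtype=float)), kind="stable")
    return order[:k].tolist()


if __name__ == "__main__":
    # Toy weights: 3 SNP inputs, 2 hidden units (made-up numbers, not study data).
    w_in = [[1.0, -2.0], [3.0, 0.5], [0.0, 0.0]]
    w_out = [2.0, -1.0]
    print(garson_importance(w_in, w_out))
    print(olden_importance(w_in, w_out))
    print(top_k_markers(olden_importance(w_in, w_out), 2))
```

The two scores can rank markers differently: Garson discards the sign of each weight, while Olden keeps it, so opposing contributions through different hidden units can cancel. A reduced marker panel would then be formed by keeping the highest-ranked SNPs, with the panel size chosen by a separate criterion such as the LLR thresholds mentioned in the abstract.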


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9a37/10354035/0c6d0a69efa1/41598_2023_38601_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9a37/10354035/cf4b41597389/41598_2023_38601_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9a37/10354035/b516269028df/41598_2023_38601_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9a37/10354035/86844f006a5e/41598_2023_38601_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9a37/10354035/c7f5dac69530/41598_2023_38601_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9a37/10354035/4a4af654ffb3/41598_2023_38601_Fig6_HTML.jpg
