A deep learning strategy for accurate identification of purebred and hybrid pigs across SNP chips.

Authors

Zhang Zipeng, Fang Zhengwen, Du Yongwang, He Yilin, Qian Changsong, Ye Weijian, Zhang Ning, Zhang Jianan, Ding Xiangdong

Affiliations

State Key Laboratory of Animal Biotech Breeding, Key Laboratory of Animal Genetics and Breeding of Ministry of Agriculture and Rural Affairs, National Engineering Laboratory of Animal Breeding, College of Animal Science and Technology, China Agricultural University, Beijing, 100193, China.

MolBreeding Biotechnology Ltd., Shijiazhuang, 050035, China.

Publication information

J Anim Sci Biotechnol. 2025 Aug 14;16(1):116. doi: 10.1186/s40104-025-01249-y.

DOI:10.1186/s40104-025-01249-y
PMID:40813701
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12351870/
Abstract

BACKGROUND

Breed identification plays an important role in conserving indigenous breeds, managing genetic resources, and developing effective breeding strategies. However, research on breed identification in livestock has mainly focused on purebreds and has yielded lower prediction accuracy for hybrids. In this study, we present a Multi-Layer Perceptron (MLP) model with a multi-output regression framework specifically designed for genomic breed composition prediction of purebred and hybrid pigs.
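A multi-output regression framing means each animal is labelled with a vector of breed proportions rather than a single class. The sketch below illustrates one way such targets could be built. The breed order, the function name `composition_target`, and the 0.5/0.5 target for a Yorkshire × Landrace cross are illustrative assumptions, not details given in the abstract.

```python
from typing import Dict, List

# Assumed breed order for the output vector (hypothetical, for illustration).
BREEDS: List[str] = ["Yorkshire", "Landrace", "Duroc"]


def composition_target(proportions: Dict[str, float]) -> List[float]:
    """Return one expected ancestry proportion per breed, in BREEDS order."""
    unknown = set(proportions) - set(BREEDS)
    if unknown:
        raise ValueError(f"unknown breed(s): {sorted(unknown)}")
    if abs(sum(proportions.values()) - 1.0) > 1e-9:
        raise ValueError("proportions must sum to 1")
    return [proportions.get(breed, 0.0) for breed in BREEDS]


# A purebred gets a one-hot style target.
purebred_duroc = composition_target({"Duroc": 1.0})

# Assumption: a first-generation Yorkshire x Landrace cross is labelled 0.5/0.5.
yl_hybrid = composition_target({"Yorkshire": 0.5, "Landrace": 0.5})
```

A regressor with one output per breed can then be trained against these vectors, which lets a hybrid be represented directly instead of being forced into a single class.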

RESULTS

We used 8,199 pigs from breeding farms in eight provinces of China, comprising Yorkshire, Landrace, Duroc, and Yorkshire × Landrace hybrids. All animals were genotyped with 1K, 50K and 100K SNP chips. Compared with random forest (RF), support vector regression (SVR) and Admixture, five replicates of fivefold cross-validation showed that MLP achieved a breed identification accuracy of 100% for both hybrids and purebreds on the 50K and 100K SNP chips. SVR performed comparably to MLP, and both outperformed RF and Admixture. In independent testing, MLP yielded 100% accuracy for all three pure breeds and the hybrid across all SNP chips and panels, while SVR yielded 0.026%-0.121% lower accuracy than MLP. Compared with a classification-based framework, the multi-output regression framework improved prediction accuracy: MLP, RF and SVR all achieved consistent improvements across all six SNP chips/panels, especially in hybrid identification. The determination threshold for purebreds affected the methods differently. SVR, RF and Admixture were very sensitive to the threshold value, and their optimal thresholds fluctuated across scenarios, whereas the optimal threshold for MLP remained 0.75 in all cases. A threshold of 0.65-0.75 is ideal for accurate breed identification. Among SNP chips of different densities, the 1K SNP chip was the most cost-effective, as it reached 100% accuracy when the training set was enlarged. Including hybrid individuals in the training set benefited both purebred and hybrid identification.
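The role of the purebred determination threshold can be illustrated with a small decision rule. The abstract does not spell out how the threshold is applied, so the rule below (purebred if the largest predicted proportion reaches the threshold, otherwise a cross of the two largest components) is an assumption, and `call_breed` is a hypothetical helper name.

```python
from typing import Dict


def call_breed(predicted: Dict[str, float], threshold: float = 0.75) -> str:
    """Turn predicted breed proportions into a breed call.

    Assumed rule: call the animal purebred when its largest predicted
    proportion is at least `threshold`; otherwise call it a hybrid of
    the two breeds with the largest proportions.
    """
    if len(predicted) < 2:
        raise ValueError("need predicted proportions for at least two breeds")
    ranked = sorted(predicted.items(), key=lambda item: item[1], reverse=True)
    (top_breed, top_share), (second_breed, _) = ranked[0], ranked[1]
    if top_share >= threshold:
        return top_breed
    return f"{top_breed} × {second_breed}"


likely_purebred = call_breed({"Yorkshire": 0.97, "Landrace": 0.02, "Duroc": 0.01})
likely_hybrid = call_breed({"Yorkshire": 0.55, "Landrace": 0.42, "Duroc": 0.03})
```

Under this rule, an animal predicted at 0.70 Yorkshire would be called a hybrid at a threshold of 0.75 but purebred at 0.65, which is the kind of sensitivity the reported threshold comparison is concerned with.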

CONCLUSIONS

Our new MLP strategy demonstrated high accuracy and robust applicability across low-, medium-, and high-density SNP chips. The multi-output regression framework could universally enhance prediction accuracy for ML methods. Our new strategy is also helpful for breed identification in other livestock.

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d837/12351870/0b0a4ffff2a6/40104_2025_1249_Fig1_HTML.jpg
Figure 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d837/12351870/6c055285ec08/40104_2025_1249_Fig2_HTML.jpg
Figure 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d837/12351870/2e2491ad2f22/40104_2025_1249_Fig3_HTML.jpg
Figure 4: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d837/12351870/78934db7617b/40104_2025_1249_Fig4_HTML.jpg
