In silico prediction of variant effects: promises and limitations for precision plant breeding.

Authors

Sendrowski Janek, Bataillon Thomas, Ramstein Guillaume P

Affiliations

Bioinformatics Research Center, Aarhus University, 8000, Aarhus, Denmark.

Center for Quantitative Genetics and Genomics, Aarhus University, 8000, Aarhus, Denmark.

Publication information

Theor Appl Genet. 2025 Jul 28;138(8):193. doi: 10.1007/s00122-025-04973-1.

DOI:10.1007/s00122-025-04973-1
PMID:40719915
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12304032/
Abstract

Sequence-based AI models show great potential for prediction of variant effects at high resolution, but their practical value in plant breeding remains to be confirmed through rigorous validation studies. Plant breeding has traditionally relied on phenotyping to select individuals with desirable traits, a process that is both costly and time-consuming. Increasingly, breeding strategies are shifting toward precision breeding, where causal variants are directly targeted based on their effects. To predict the effects of causal variants, in silico methods are emerging as efficient alternatives or complements to mutagenesis screens. Here, we review state-of-the-art machine learning methods for predicting variant effects in plants across both coding and noncoding regions, contrasting supervised approaches in functional genomics with unsupervised methods in comparative genomics. We discuss challenges in validating predictions, and compare these methods with traditional association and comparative genomics techniques. We argue that modern sequence models extend traditional methods by generalizing across genomic contexts, fitting a unified model across loci rather than a separate model for each locus. In doing so, they address inherent limitations of traditional quantitative and evolutionary comparative genetics techniques. However, the accuracy and generalizability of sequence models heavily depend on the training data, highlighting the need for validation experiments. We point to successful applications of sequence models, especially with protein sequences, and identify areas for further improvement, especially in modeling regulatory sequences. While not yet mature for in silico-driven precision breeding, sequence models show strong potential to become an integral part of the breeder's toolbox.
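
The contrast the abstract draws between a separate model for each locus and a unified model across loci can be illustrated with a toy simulation. The sketch below is not the authors' method. It assumes that each variant's effect is a linear function of a few hypothetical sequence-context features, and that the per-locus approach only has a noisy estimate of each effect. All names (`simulate`, `fit_unified`, `compare`), the feature count, the noise level, and the ridge penalty are illustrative assumptions.

```python
import numpy as np


def simulate(n_loci=200, n_features=5, noise=1.0, seed=0):
    """Simulate variant effects that depend on shared sequence-context features."""
    rng = np.random.default_rng(seed)
    # Hypothetical numeric features describing the sequence context of each variant.
    X = rng.normal(size=(n_loci, n_features))
    w = rng.normal(size=n_features)          # shared feature weights (assumed linear)
    true_effect = X @ w
    # Per-locus estimate: the true effect plus independent estimation noise.
    observed = true_effect + rng.normal(scale=noise, size=n_loci)
    return X, true_effect, observed


def fit_unified(X, y, ridge=1.0):
    """Ridge regression fitted jointly across loci (one model for all loci)."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + ridge * np.eye(p), X.T @ y)


def compare(seed=0, n_train=150):
    X, true_effect, observed = simulate(seed=seed)
    train = np.arange(len(X)) < n_train
    w_hat = fit_unified(X[train], observed[train])

    # On training loci: per-locus estimates keep all their noise,
    # while the unified model pools information across loci.
    mse_per_locus = np.mean((observed[train] - true_effect[train]) ** 2)
    mse_unified = np.mean((X[train] @ w_hat - true_effect[train]) ** 2)

    # On held-out loci: the per-locus approach has no estimate at all,
    # whereas the unified model can still predict from features.
    pred_new = X[~train] @ w_hat
    r_new = np.corrcoef(pred_new, true_effect[~train])[0, 1]
    return mse_per_locus, mse_unified, r_new


if __name__ == "__main__":
    mse_pl, mse_un, r_new = compare()
    print(f"per-locus MSE: {mse_pl:.3f}")
    print(f"unified MSE:   {mse_un:.3f}")
    print(f"held-out correlation (unified): {r_new:.3f}")
```

In this toy setting the unified model benefits only because the simulated effects truly share structure across loci. As the abstract stresses, whether real sequence models generalize this way depends heavily on the training data and must be checked by validation experiments.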

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3fb5/12304032/c6e7aba48c38/122_2025_4973_Fig1_HTML.jpg
Figure 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3fb5/12304032/d6160b7f104c/122_2025_4973_Fig2_HTML.jpg
Figure 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3fb5/12304032/0edf0d75357f/122_2025_4973_Fig3_HTML.jpg
