Deciphering the Gene Expression and Alternative Splicing Basis of Muscle Development Through Interpretable Machine Learning Models.

Author information

Tan Xiaodong, Huang Minjie, Jin Yuting, Li Jiahua, Dong Jie, Wang Deqian

Affiliations

Institute of Animal Husbandry and Veterinary Science, Zhejiang Academy of Agricultural Sciences, Hangzhou 310021, China.

Key Laboratory of Livestock and Poultry Resources (Poultry) Evaluation and Utilization, Ministry of Agriculture and Rural Affairs, Hangzhou 310021, China.

Publication information

Biology (Basel). 2025 Aug 15;14(8):1059. doi: 10.3390/biology14081059.

Abstract

In chickens, meat yield is a crucial trait in breeding programs. Identifying key molecular markers associated with increased muscle yield is essential for breeding strategies. This study applied transcriptome sequencing and machine learning methods to examine gene expression and alternative splicing (AS) events in muscle tissues of commercial broilers and local chickens. On the basis of differentially expressed genes (DEGs) and differentially spliced transcripts (DSTs) significantly related to breast muscle weight percentage (BrP), high-accuracy prediction models were developed by evaluating 10 machine learning models (e.g., eXtreme Gradient Boosting (XGBoost), Generalized Linear Model Network (Glmnet)). Feature importance was assessed using the Shapley Additive exPlanations (SHAP) method. The results revealed that 50 DEGs and 95 DSTs contributed significantly to BrP prediction. The XGBoost model achieved over 90% accuracy when using DEGs, and the Glmnet model reached 95% accuracy when using DSTs. Through Shapley evaluation, genes and AS events (e.g., ENSGALG00010012060, HINTW, and VIPR2-201) were identified as having the highest contributions to BrP prediction. Additionally, the breed effect was effectively mitigated. This study introduces new candidate genes and AS targets for the molecular breeding of poultry breast muscle traits, offering a paradigm shift from traditional gene mining approaches to artificial intelligence-driven predictive methods.
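The abstract ranks features by their Shapley contributions to the prediction. As a rough illustration of what such a contribution is, the sketch below computes exact Shapley values for a single sample of a tiny linear model. It is not the authors' pipeline: the model, its weights, the feature values, and the baseline are all hypothetical toy numbers, and "absent" features are assumed to be replaced by a baseline value (one common convention). Real analyses with many genes rely on approximations such as those in the SHAP library rather than this exhaustive enumeration.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one sample.

    Features outside a coalition are set to their baseline value
    (an assumption of this sketch, not something stated in the abstract).
    """
    n = len(x)

    def value(coalition):
        # Model output when only the features in `coalition` take the sample's values.
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for k in range(n):
            # Shapley weight for coalitions of size k that exclude feature i.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


# Hypothetical toy model: three made-up "expression" features.
WEIGHTS = [2.0, -1.0, 0.5]
INTERCEPT = 10.0


def toy_predict(z):
    return INTERCEPT + sum(w * v for w, v in zip(WEIGHTS, z))


sample = [3.0, 1.0, 4.0]    # hypothetical feature values for one bird
baseline = [1.0, 1.0, 2.0]  # hypothetical reference (e.g. cohort mean)

phi = shapley_values(toy_predict, sample, baseline)
for name, contribution in zip(["feature_a", "feature_b", "feature_c"], phi):
    print(f"{name}: {contribution:+.3f}")
```

The contributions sum to the difference between the prediction for the sample and the prediction for the baseline, which is the additivity property that lets per-feature values be read as shares of a single prediction.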

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/471f/12383657/bbbcaf160464/biology-14-01059-g001.jpg