Deciphering the Gene Expression and Alternative Splicing Basis of Muscle Development Through Interpretable Machine Learning Models.

Authors

Tan Xiaodong, Huang Minjie, Jin Yuting, Li Jiahua, Dong Jie, Wang Deqian

Affiliations

Institute of Animal Husbandry and Veterinary Science, Zhejiang Academy of Agricultural Sciences, Hangzhou 310021, China.

Key Laboratory of Livestock and Poultry Resources (Poultry) Evaluation and Utilization, Ministry of Agriculture and Rural Affairs, Hangzhou 310021, China.

Publication information

Biology (Basel). 2025 Aug 15;14(8):1059. doi: 10.3390/biology14081059.

Abstract

In chickens, meat yield is a crucial trait in breeding programs. Identifying key molecular markers associated with increased muscle yield is essential for breeding strategies. This study applied transcriptome sequencing and machine learning methods to examine gene expression and alternative splicing (AS) events in muscle tissues of commercial broilers and local chickens. On the basis of differentially expressed genes (DEGs) and differentially spliced transcripts (DSTs) significantly related to breast muscle weight percentage (BrP), high-accuracy prediction models were developed by evaluating 10 machine learning models (e.g., eXtreme Gradient Boosting (XGBoost), Generalized Linear Model Network (Glmnet)). Feature importance was assessed using the Shapley Additive exPlanations (SHAP) method. The results revealed that 50 DEGs and 95 DSTs contributed significantly to BrP prediction. The XGBoost model achieved over 90% accuracy when using DEGs, and the Glmnet model reached 95% accuracy when using DSTs. Through Shapley evaluation, genes and AS events (e.g., ENSGALG00010012060, HINTW, and VIPR2-201) were identified as having the highest contributions to BrP prediction. Additionally, the breed effect was effectively mitigated. This study introduces new candidate genes and AS targets for the molecular breeding of poultry breast muscle traits, offering a paradigm shift from traditional gene mining approaches to artificial intelligence-driven predictive methods.
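
The abstract ranks features by their Shapley contributions to the BrP prediction. As a rough illustration of the idea only, and not the authors' pipeline, the sketch below computes exact Shapley values for one sample by enumerating every feature subset. The scoring function `toy_predict`, the feature names `gene_a`, `gene_b`, and `splice_c`, and all numeric values are hypothetical. Production SHAP tooling typically uses faster model-specific or sampling-based approximations rather than this brute-force enumeration.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one sample by enumerating all feature subsets.

    Features outside a subset are set to their baseline value, which is one
    common (assumed) way to represent a "missing" feature.
    """
    names = list(x)
    n = len(names)

    def value(subset):
        # Features in `subset` take the sample's value; the rest take the baseline.
        mixed = {k: (x[k] if k in subset else baseline[k]) for k in names}
        return predict(mixed)

    phi = {}
    for name in names:
        others = [k for k in names if k != name]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size: |S|! (n-|S|-1)! / n!
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                # Marginal contribution of `name` when added to coalition s.
                total += weight * (value(s | {name}) - value(s))
        phi[name] = total
    return phi


def toy_predict(f):
    # Hypothetical score: two linear terms plus one interaction term.
    return 2.0 * f["gene_a"] + 1.0 * f["gene_b"] + 0.5 * f["gene_a"] * f["splice_c"]


# Hypothetical sample and reference (baseline) feature values.
sample = {"gene_a": 3.0, "gene_b": 1.0, "splice_c": 2.0}
baseline = {"gene_a": 1.0, "gene_b": 1.0, "splice_c": 0.0}

phi = shapley_values(toy_predict, sample, baseline)
for feature, contribution in sorted(phi.items(), key=lambda kv: -abs(kv[1])):
    print(f"{feature}: {contribution:+.3f}")
```

The contributions sum to the difference between the prediction for the sample and the prediction for the baseline, and a feature whose value equals its baseline (here `gene_b`) receives zero. Sorting by absolute contribution gives the kind of ranking the abstract describes. The enumeration grows exponentially with the number of features, so it is only practical for very small toy cases like this one.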

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/471f/12383657/bbbcaf160464/biology-14-01059-g001.jpg
