Predicting Fitness-Related Traits Using Gene Expression and Machine Learning.

Author information

Henry Georgia A, Stinchcombe John R

Affiliations

Department of Ecology and Evolutionary Biology, University of Toronto, Toronto, ON, Canada.

Koffler Scientific Reserve at Joker's Hill, University of Toronto, King, ON, Canada.

Publication information

Genome Biol Evol. 2025 Feb 3;17(2). doi: 10.1093/gbe/evae275.

Abstract

Evolution by natural selection occurs at its most basic through the change in frequencies of alleles; connecting those genomic targets to phenotypic selection is an important goal for evolutionary biology in the genomics era. The relative abundance of gene products expressed in a tissue can be considered a phenotype intermediate to the genes and genomic regulatory elements themselves and more traditionally measured macroscopic phenotypic traits such as flowering time, size, or growth. The high dimensionality, low sample size nature of transcriptomic sequence data is a double-edged sword, however, as it provides abundant information but makes traditional statistics difficult. Machine learning (ML) has many features which handle high-dimensional data well and is thus useful in genetic sequence applications. Here, we examined the association of fitness components with gene expression data in Ipomoea hederacea (Ivyleaf morning glory) grown under field conditions. We combine the results of two different ML approaches and find evidence that expression of photosynthesis-related genes is likely under selection. We also find that genes related to stress and light responses were overall important in predicting fitness. With this study, we demonstrate the utility of ML models for smaller samples and their potential application for understanding natural selection.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0356/11844753/e0b3615eff35/evae275f1.jpg
