Expression-based machine learning models for predicting plant tissue identity.

Author information

Palande Sourabh, Arsenault Jeremy, Basurto-Lozada Patricia, Bleich Andrew, Brown Brianna N I, Buysse Sophia F, Connors Noelle A, Das Adhikari Sikta, Dobson Kara C, Guerra-Castillo Francisco Xavier, Guerrero-Carrillo Maria F, Harlow Sophia, Herrera-Orozco Héctor, Hightower Asia T, Izquierdo Paulo, Jacobs MacKenzie, Johnson Nicholas A, Leuenberger Wendy, Lopez-Hernandez Alessandro, Luckie-Duque Alicia, Martínez-Avila Camila, Mendoza-Galindo Eddy J, Plancarte David Cruz, Schuster Jenny M, Shomer Harry, Sitar Sidney C, Steensma Anne K, Thomson Joanne Elise, Villaseñor-Amador Damián, Waterman Robin, Webster Brandon M, Whyte Madison, Zorilla-Azcué Sofía, Montgomery Beronda L, Husbands Aman Y, Krishnan Arjun, Percival Sarah, Munch Elizabeth, VanBuren Robert, Chitwood Daniel H, Rougon-Cardoso Alejandra

Affiliations

Department of Computational Mathematics, Science and Engineering, Michigan State University, East Lansing, Michigan, USA.

Department of Computer Science and Engineering, Michigan State University, East Lansing, Michigan, USA.

Publication information

Appl Plant Sci. 2024 Oct 19;13(1):e11621. doi: 10.1002/aps3.11621. eCollection 2025 Jan-Feb.

Abstract

PREMISE

The selection of Arabidopsis as a model organism played a pivotal role in advancing genomic science. The competing frameworks to select an agricultural- or ecological-based model species were rejected in favor of building knowledge in a species that would facilitate genome-enabled research.

METHODS

Here, we examine the ability of models based on Arabidopsis gene expression data to predict tissue identity in other flowering plants. Comparing different machine learning algorithms, models trained and tested on Arabidopsis data achieved near-perfect precision and recall values, whereas when tissue identity is predicted across the flowering plants using models trained on Arabidopsis data, precision values range from 0.69 to 0.74 and recall from 0.54 to 0.64.

RESULTS

The identity of belowground tissue can be predicted more accurately than other tissue types, and the ability to predict tissue identity is not correlated with phylogenetic distance from Arabidopsis. k-nearest neighbors is the most successful algorithm, suggesting that gene expression signatures, rather than marker genes, are more valuable to create models for tissue and cell type prediction in plants.
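
As a rough illustration of the approach described above, the following sketch trains a k-nearest neighbors classifier on expression profiles from one species and scores its tissue predictions on another using per-tissue precision and recall. It is a minimal sketch under stated assumptions, not the authors' pipeline: the three-gene expression vectors, the tissue labels, the choice of k = 3, and the function names `knn_predict` and `precision_recall` are all invented for illustration.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, query, k=3):
    """Label a query profile by majority vote among its k nearest training profiles."""
    # Euclidean distance over the whole expression vector, not a single marker gene.
    nearest = sorted((math.dist(x, query), label) for x, label in zip(train_X, train_y))
    votes = Counter(label for _, label in nearest[:k])
    return votes.most_common(1)[0][0]

def precision_recall(y_true, y_pred, label):
    """Precision and recall for one tissue label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Hypothetical training profiles (reference species); values are made up.
train_X = [
    (9.0, 1.0, 0.5), (8.5, 1.5, 1.0),   # root
    (1.0, 9.0, 1.0), (1.5, 8.0, 2.0),   # leaf
    (1.0, 2.0, 9.0), (2.0, 1.5, 8.5),   # flower
]
train_y = ["root", "root", "leaf", "leaf", "flower", "flower"]

# Hypothetical test profiles from a different species; values are made up.
test_X = [(8.0, 2.0, 1.0), (2.0, 7.0, 3.0), (2.0, 5.0, 6.0), (1.5, 2.0, 8.0)]
y_true = ["root", "leaf", "leaf", "flower"]

y_pred = [knn_predict(train_X, train_y, q, k=3) for q in test_X]

for tissue in ("root", "leaf", "flower"):
    p, r = precision_recall(y_true, y_pred, tissue)
    print(f"{tissue}: precision={p:.2f} recall={r:.2f}")
```

In this toy data the third test profile is a leaf sample whose expression has drifted toward the flower profiles, so it is misclassified. That lowers recall for leaf and precision for flower, the kind of drop the abstract reports when models are applied across species.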

DISCUSSION

Our data-driven results highlight that the assertion that knowledge from Arabidopsis is translatable to other plants is not always true. Considering the current landscape of abundant sequencing data, we should reevaluate the scientific emphasis on Arabidopsis and prioritize plant diversity.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5499/11788907/6521567c321b/APS3-13-e11621-g001.jpg
