Using recurrent neural networks to detect supernumerary chromosomes in fungal strains causing blast diseases.

Authors

Gyawali Nikesh, Hao Yangfan, Lin Guifang, Huang Jun, Bika Ravi, Daza Lidia Calderon, Zheng Huakun, Cruppe Giovana, Caragea Doina, Cook David, Valent Barbara, Liu Sanzhen

Affiliations

Department of Computer Science, Kansas State University, Manhattan, KS 66506, USA.

Department of Plant Pathology, Kansas State University, Manhattan, KS 66506, USA.

Publication information

NAR Genom Bioinform. 2024 Aug 20;6(3):lqae108. doi: 10.1093/nargab/lqae108. eCollection 2024 Sep.

Abstract

The genomes of the fungus that causes blast diseases on diverse grass species, including major crops, have indispensable core-chromosomes and may contain supernumerary chromosomes, also known as mini-chromosomes. These mini-chromosomes are speculated to provide effector gene mobility, and may transfer between strains. To understand the biology of mini-chromosomes, it is valuable to be able to detect whether a strain possesses a mini-chromosome. Here, we applied recurrent neural network models for classifying DNA sequences as arising from core- or mini-chromosomes. The models were trained with sequences from available core- and mini-chromosome assemblies, and then used to predict the presence of mini-chromosomes in a global collection of isolates using short-read DNA sequences. The model predicted that mini-chromosomes were prevalent in the isolates. Interestingly, at least one mini-chromosome was present in all recent wheat isolates, but no mini-chromosomes were found in early isolates collected before 1991, indicating a preferential selection for strains carrying mini-chromosomes in recent years. The model was also used to identify assembled contigs derived from mini-chromosomes. In summary, our study has developed a reliable method for categorizing DNA sequences and showcases an application of recurrent neural networks in predictive genomics.
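The general idea of scoring individual sequences with a recurrent network and then summarizing the scores per strain can be illustrated with a minimal sketch. This is not the authors' model: the one-hot encoding, the single-layer Elman cell, the random untrained weights, the 0.5 threshold, and the names `one_hot`, `TinyRNN`, and `mini_fraction` are all assumptions made for illustration, and a real classifier would be trained on labeled core- and mini-chromosome sequences.

```python
import math
import random

BASES = "ACGT"


def one_hot(seq):
    """Encode a DNA string as 4-element vectors; unknown bases (e.g. N) become all zeros."""
    return [[1.0 if base == b else 0.0 for b in BASES] for base in seq.upper()]


class TinyRNN:
    """Hypothetical single-layer Elman RNN with a sigmoid output (random, untrained weights)."""

    def __init__(self, hidden_size=8, seed=0):
        rng = random.Random(seed)

        def mat(rows, cols):
            return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

        self.hidden_size = hidden_size
        self.w_in = mat(hidden_size, 4)             # input-to-hidden weights
        self.w_rec = mat(hidden_size, hidden_size)  # hidden-to-hidden weights
        self.w_out = [rng.uniform(-0.5, 0.5) for _ in range(hidden_size)]

    def predict_proba(self, seq):
        """Return a score in (0, 1), read here as P(sequence is from a mini-chromosome)."""
        h = [0.0] * self.hidden_size
        for x in one_hot(seq):
            # Recurrent update: the new state depends on the current base and the previous state.
            h = [
                math.tanh(
                    sum(w * xi for w, xi in zip(self.w_in[j], x))
                    + sum(w * hi for w, hi in zip(self.w_rec[j], h))
                )
                for j in range(self.hidden_size)
            ]
        logit = sum(w * hi for w, hi in zip(self.w_out, h))
        return 1.0 / (1.0 + math.exp(-logit))


def mini_fraction(model, reads, threshold=0.5):
    """Fraction of reads whose score exceeds the (assumed) threshold."""
    if not reads:
        return 0.0
    calls = [model.predict_proba(read) > threshold for read in reads]
    return sum(calls) / len(calls)


if __name__ == "__main__":
    model = TinyRNN()
    reads = ["ACGTACGTAC", "TTTTGGGGCC", "ACGNNNACGT"]  # made-up example reads
    print(mini_fraction(model, reads))
```

With trained weights, a per-strain summary such as the fraction of reads classified as mini-chromosome-derived could be compared against a baseline to decide whether the strain likely carries a mini-chromosome; how the published study makes that decision is not specified in the abstract.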

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7849/11333962/ad18965af14f/lqae108fig1.jpg
