An Effective Integrated Machine Learning Framework for Identifying Severity of Tomato Yellow Leaf Curl Virus and Their Experimental Validation.

Author information

Bupi Nattanong, Sangaraju Vinoth Kumar, Phan Le Thi, Lal Aamir, Vo Thuy Thi Bich, Ho Phuong Thi, Qureshi Muhammad Amir, Tabassum Marjia, Lee Sukchan, Manavalan Balachandran

Affiliations

Department of Integrative Biotechnology, College of Biotechnology and Bioengineering, Sungkyunkwan University, Suwon 16419, Gyeonggi-do, Republic of Korea.

Computational Biology and Bioinformatics Laboratory, Department of Integrative Biotechnology, College of Biotechnology and Bioengineering, Sungkyunkwan University, Suwon 16419, Gyeonggi-do, Republic of Korea.

Publication information

Research (Wash D C). 2023;6:0016. doi: 10.34133/research.0016. Epub 2023 Jan 10.

Abstract

Tomato yellow leaf curl virus (TYLCV) has dispersed across different countries, particularly to subtropical regions, and is associated with more severe symptoms. Since TYLCV was first isolated in 1931, it has been a menace to industrial tomato production worldwide. Three groups were newly isolated from TYLCV-resistant tomatoes in 2022; however, their functions are unknown. Developing machine learning (ML)-based models from characterized sequences and evaluating their blind predictions is one of the major challenges in interdisciplinary research. The purpose of this study was to develop an integrated computational framework for accurately identifying symptom severity (mild or severe) from TYLCV sequences isolated in Korea. To build the framework, we first extracted 11 different feature encodings and hybrid features from the training data, then explored 8 different classifiers and developed their respective prediction models using randomized 10-fold cross-validation. Subsequently, we carried out a systematic evaluation of these 96 developed models and selected the top 90 models, whose predicted class labels were combined and treated as reduced features. On the basis of these features, a multilayer perceptron was applied to develop the final prediction model (IML-TYLCVs). We conducted blind prediction on the 3 groups using IML-TYLCVs, and the results indicated that 2 groups were severe and 1 group was mild. Furthermore, we confirmed the predictions with virus-challenge experiments on tomato plant phenotypes using infectious clones from the 3 groups. Plant virologists and plant breeding professionals can access the user-friendly online IML-TYLCVs web server at https://balalab-skku.org/IML-TYLCVs, which can guide them in developing new protection strategies for newly emerging viruses.
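
The data flow described in the abstract (several feature encodings crossed with several classifiers, cross-validated predictions, selection of the top-ranked models, and reuse of their predicted labels as reduced features) can be illustrated with a minimal, standard-library-only sketch. Everything named here is an assumption made for illustration: the k-mer encoders, the nearest-centroid base learners, and the synthetic sequences are hypothetical stand-ins, not the 11 encodings, 8 classifiers, or Korean TYLCV data used by the authors. The count of 96 models would be consistent with (11 + 1) × 8 if the hybrid features are treated as one additional feature set, though the abstract does not state this explicitly.

```python
import itertools
import random
from collections import Counter

def kmer_freqs(seq, k):
    """Normalized k-mer frequency vector over the alphabet ACGT."""
    kmers = ["".join(p) for p in itertools.product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = max(1, len(seq) - k + 1)
    return [counts[m] / total for m in kmers]

# Hypothetical stand-ins for the feature encodings (not the paper's).
ENCODERS = {
    "mono": lambda s: kmer_freqs(s, 1),
    "di": lambda s: kmer_freqs(s, 2),
    "hybrid": lambda s: kmer_freqs(s, 1) + kmer_freqs(s, 2),
}

# Hypothetical stand-ins for the classifiers: nearest centroid, two metrics.
DISTANCES = {
    "euclid": lambda a, b: sum((p - q) ** 2 for p, q in zip(a, b)),
    "manhattan": lambda a, b: sum(abs(p - q) for p, q in zip(a, b)),
}

def centroid_fit_predict(train_X, train_y, test_X, dist):
    """Fit per-class centroids on the training rows, then label test rows."""
    centroids = {}
    for label in set(train_y):
        rows = [x for x, y in zip(train_X, train_y) if y == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return [min(sorted(centroids), key=lambda c: dist(x, centroids[c]))
            for x in test_X]

def kfold_indices(n, k, seed):
    """Randomized k-fold split: shuffle the indices, then deal into k folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def out_of_fold_predictions(X, y, dist, folds):
    """Predict each sample using a model that never saw it in training."""
    preds = [None] * len(y)
    for fold in folds:
        held = set(fold)
        train = [i for i in range(len(y)) if i not in held]
        fold_preds = centroid_fit_predict(
            [X[i] for i in train], [y[i] for i in train],
            [X[i] for i in fold], dist)
        for i, p in zip(fold, fold_preds):
            preds[i] = p
    return preds

def build_meta_features(seqs, labels, n_folds=10, top_n=4, seed=0):
    """Rank all encoder/classifier pairs by CV accuracy and stack the
    predicted labels of the top_n models as one meta-feature row per sample."""
    folds = kfold_indices(len(seqs), n_folds, seed)
    scored = []
    for (enc_name, enc), (clf_name, dist) in itertools.product(
            ENCODERS.items(), DISTANCES.items()):
        X = [enc(s) for s in seqs]
        preds = out_of_fold_predictions(X, labels, dist, folds)
        acc = sum(p == t for p, t in zip(preds, labels)) / len(labels)
        scored.append((acc, f"{enc_name}+{clf_name}", preds))
    scored.sort(key=lambda t: t[0], reverse=True)
    kept = scored[:top_n]
    names = [name for _, name, _ in kept]
    meta_X = [list(row) for row in zip(*(p for _, _, p in kept))]
    return names, meta_X

def toy_dataset(n_per_class=20, length=60, seed=1):
    """Synthetic sequences only: label 1 is GC-biased, label 0 is AT-biased."""
    rng = random.Random(seed)
    seqs, labels = [], []
    for label, alphabet in ((1, "GGCCAT"), (0, "AATTGC")):
        for _ in range(n_per_class):
            seqs.append("".join(rng.choice(alphabet) for _ in range(length)))
            labels.append(label)
    return seqs, labels

if __name__ == "__main__":
    seqs, labels = toy_dataset()
    names, meta_X = build_meta_features(seqs, labels)
    print("selected models:", names)
    print("meta-feature matrix:", len(meta_X), "rows x", len(meta_X[0]), "columns")
```

In the framework described above, a matrix like `meta_X` would then be the input to a multilayer perceptron. That final step is left out here to keep the sketch dependency-free; any MLP implementation could be trained on these rows against the original labels.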
