TISCalling: leveraging machine learning to identify translational initiation sites in plants and viruses.

Author information

Yen Ming-Ren, Li Ya-Ru, Cheng Chia-Yi, Wu Ting-Ying, Liu Ming-Jung

Affiliations

Institute of Plant and Microbial Biology, Academia Sinica, Taipei, 115201, Taiwan.

Biotechnology Center in Southern Taiwan, Academia Sinica, Tainan, 711, Taiwan.

Publication information

Plant Mol Biol. 2025 Aug 1;115(4):102. doi: 10.1007/s11103-025-01632-3.

Abstract

The recognition of translational initiation sites (TISs) offers complementary insights into identifying genes encoding novel proteins or small peptides. Conventional computational methods primarily identify Ribo-seq-supported TISs and lack the capacity for systematic, global identification of TISs, especially non-AUG sites in plants. Additionally, these methods are often unsuitable for evaluating the importance of mRNA sequence features for TIS determination. In this study, we present TISCalling, a robust framework that combines machine learning (ML) models and statistical analysis to identify and rank novel TISs across eukaryotes. TISCalling generalizes and ranks important features common to multiple plant and mammalian species while identifying kingdom-specific features such as mRNA secondary structures and "G"-nucleotide contents. Furthermore, TISCalling achieved high predictive power for identifying novel viral TISs. Importantly, TISCalling provides prediction scores for putative TISs along plant transcripts, enabling prioritization of those of interest for further validation. We offer TISCalling as a command-line-based package [https://github.com/yenmr/TISCalling], capable of generating prediction models and identifying key sequence features. Additionally, we provide web tools [https://predict.southerngenomics.org/TISCalling/] for visualizing pre-computed potential TISs, making it accessible to users without programming experience. The TISCalling framework offers a sequence-aware and interpretable approach for decoding genome sequences and exploring functional proteins in plants and viruses.
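To make the notion of "putative TISs along transcripts" concrete, the sketch below enumerates candidate start codons on a transcript: AUG plus the nine near-cognate codons that differ from AUG by a single nucleotide. It is only an illustration of the candidate-enumeration step that any such scoring would need. The function name `candidate_tis`, the `flank` parameter, and the choice of flanking window are assumptions made here and are not taken from the TISCalling package, and no scoring model is shown.

```python
# Illustrative sketch only: hypothetical helper, not the TISCalling API.
# Near-cognate codons differ from AUG at exactly one position.
NEAR_COGNATE = {"CUG", "GUG", "UUG",   # first position changed
                "AAG", "ACG", "AGG",   # second position changed
                "AUA", "AUC", "AUU"}   # third position changed


def candidate_tis(transcript, flank=3):
    """Yield (position, codon, context) for AUG and near-cognate codons.

    position is the 0-based index of the codon's first nucleotide;
    context is the codon plus up to `flank` nucleotides on each side
    (an assumed window size, chosen only for illustration).
    """
    rna = transcript.upper().replace("T", "U")  # accept DNA or RNA input
    for i in range(len(rna) - 2):
        codon = rna[i:i + 3]
        if codon == "AUG" or codon in NEAR_COGNATE:
            left = rna[max(0, i - flank):i]
            right = rna[i + 3:i + 3 + flank]
            yield i, codon, left + codon + right


if __name__ == "__main__":
    for pos, codon, context in candidate_tis("GGAUGCC"):
        print(pos, codon, context)
```

In a full workflow, each candidate's context would be turned into sequence features and passed to a trained model to obtain a prediction score; that step depends on the package itself and is deliberately left out here.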

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/621e/12316744/371f005358c5/11103_2025_1632_Fig1_HTML.jpg
