Fan Xiaojuan, Chang Tiangen, Chen Chuyun, Hafner Markus, Wang Zefeng
Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, 200031, China.
RNA Molecular Biology Laboratory, National Institute of Arthritis and Musculoskeletal and Skin Disease, Bethesda, MD 20814, United States.
Nucleic Acids Res. 2025 Apr 10;53(7). doi: 10.1093/nar/gkaf277.
Accurate annotation of coding regions in RNAs is essential for understanding gene translation. We developed a deep neural network to directly predict and analyze translation initiation and termination sites from RNA sequences. Trained with human transcripts, our model learned hidden rules of translation control and achieved a near perfect prediction of canonical translation sites across entire human transcriptome. Surprisingly, this model revealed a new role of codon usage in regulating translation termination, which was experimentally validated. We also identified thousands of new open reading frames in mRNAs or lncRNAs, some of which were confirmed experimentally. The model trained with human mRNAs achieved high prediction accuracy of canonical translation sites in all eukaryotes and good prediction in polycistronic transcripts from prokaryotes or RNA viruses, suggesting a high degree of conservation in translation control. Collectively, we present TranslationAI (https://www.biosino.org/TranslationAI/), a general and efficient deep learning model for RNA translation that generates new insights into the complexity of translation regulation.
准确注释RNA中的编码区域对于理解基因翻译至关重要。我们开发了一种深度神经网络,用于直接从RNA序列预测和分析翻译起始和终止位点。通过人类转录本进行训练,我们的模型学习到了翻译控制的隐藏规则,并在整个人类转录组中对经典翻译位点实现了近乎完美的预测。令人惊讶的是,该模型揭示了密码子使用在调节翻译终止中的新作用,这一作用已通过实验验证。我们还在mRNA或lncRNA中鉴定出数千个新的开放阅读框,其中一些已通过实验得到证实。用人类mRNA训练的模型在所有真核生物中对经典翻译位点都具有很高的预测准确性,并且对来自原核生物或RNA病毒的多顺反子转录本也有良好的预测,这表明翻译控制具有高度的保守性。我们共同推出了TranslationAI(https://www.biosino.org/TranslationAI/),这是一个用于RNA翻译的通用且高效的深度学习模型,它为翻译调控的复杂性带来了新的见解。