Meguerditchian Caroline, Baux David, Ludwig Thomas E, Genin Emmanuelle, Trégouët David-Alexandre, Soukarieh Omar
Université de Bordeaux, INSERM, Bordeaux Population Health Research Center, UMR 1219, F-33000 Bordeaux, France.
Molecular Genetics Laboratory, Université de Montpellier, CHU Montpellier, F-34000 Montpellier, France.
NAR Genom Bioinform. 2025 Mar 19;7(1):lqaf017. doi: 10.1093/nargab/lqaf017. eCollection 2025 Mar.
Non-canonical small open reading frames (sORFs) are among the main regulators of gene expression. The most studied of these are upstream ORFs (upORFs) located in the 5'-untranslated region (UTR) of coding genes. Internal ORFs (intORFs) in the coding sequence and downstream ORFs (dORFs) in the 3'UTR have received less attention. Several bioinformatics tools predict single nucleotide variants (SNVs) that alter upORFs, mainly those creating AUGs or deleting stop codons, but no tool predicts variants that alter non-canonical translation initiation sites, intORFs or dORFs. We propose an upgrade of our MORFEE bioinformatics tool to identify, from a VCF file, SNVs that may alter any type of sORF in coding transcripts. Moreover, we generate an exhaustive catalog, named MORFEEdb, reporting all possible SNVs that alter existing upORFs or create new ones in human transcripts, and provide an R script for visualizing the results. MORFEEdb has been implemented in the public platform Mobidetails. Finally, the annotation of ClinVar variants with MORFEE reveals that more than 45% of UTR-SNVs can alter upORFs or dORFs. In conclusion, MORFEE and MORFEEdb have the potential to improve the molecular diagnosis of rare human diseases and to facilitate the identification of functional variants from genome-wide association studies of complex traits.
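The two upORF-altering events named in the abstract, an SNV that creates an upstream AUG and an SNV that deletes an upstream stop codon, can be illustrated with a minimal Python sketch. This is not MORFEE's implementation: the function names (`apply_snv`, `start_positions`, `first_stop`, `classify_snv`), the effect labels, and the toy sequences are all hypothetical, and the sketch assumes a plus-strand 5'UTR given as a DNA string with 0-based positions and considers only the canonical ATG start codon.

```python
# Toy illustration only; names and labels are assumptions, not MORFEE's API.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def apply_snv(seq, pos, alt):
    """Return seq with the base at 0-based index pos replaced by alt."""
    return seq[:pos] + alt + seq[pos + 1:]


def start_positions(seq, start="ATG"):
    """Return the set of 0-based indices where the start codon occurs."""
    return {i for i in range(len(seq) - 2) if seq[i:i + 3] == start}


def first_stop(seq, start_idx):
    """Return the index of the first in-frame stop after start_idx, or None."""
    for i in range(start_idx + 3, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return i
    return None


def classify_snv(utr5, pos, alt):
    """Label the effect of one SNV on upstream ORFs in a 5'UTR sequence."""
    mutated = apply_snv(utr5, pos, alt)
    ref_starts = start_positions(utr5)
    alt_starts = start_positions(mutated)
    effects = []

    # A start codon present only after the substitution: new upstream AUG.
    if alt_starts - ref_starts:
        effects.append("uAUG_gained")

    # A start codon kept by the variant whose first in-frame stop covers
    # the substituted base and is no longer a stop: upstream stop lost.
    stop_lost = False
    for s in ref_starts & alt_starts:
        stop = first_stop(utr5, s)
        if stop is not None and stop <= pos < stop + 3:
            if mutated[stop:stop + 3] not in STOP_CODONS:
                stop_lost = True
    if stop_lost:
        effects.append("uSTOP_lost")

    return effects


gain = classify_snv("CCACGGCTAAGG", 3, "T")    # ACG -> ATG
loss = classify_snv("CCATGGCTTAAGG", 8, "C")   # TAA -> CAA
neutral = classify_snv("CCACGGCTAAGG", 0, "G")
```

Passing another triplet as the `start` argument of `start_positions` (for example `"CTG"`) would be one way to extend the sketch toward non-canonical initiation sites, again as an assumption rather than a description of the tool.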