Centre for Developmental Neurobiology, King's College London, London, UK.
Institute of Computer Science, University of Tartu, Tartu, Estonia.
Methods Mol Biol. 2022;2537:149-172. doi: 10.1007/978-1-0716-2521-7_9.
Many eukaryotic genes can give rise to alternative transcripts depending on the stage of development, cell type, and physiological cues. Current transcriptome-wide sequencing technologies highlight the remarkable extent of this regulation in metazoans and allow RNA isoforms to be profiled in increasingly small biological samples and with growing confidence. Understanding the biological functions of sample-specific transcripts is a major challenge in the genomics and RNA processing fields. Here we describe simple bioinformatics workflows that facilitate this task by streamlining reference-guided annotation of novel transcripts. A key part of our protocol is the R package factR, which rapidly matches custom-assembled transcripts to their likely host genes, deduces the sequence and domain structure of novel protein products, and predicts the sensitivity of newly identified RNA isoforms to nonsense-mediated decay.
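To make the last step more concrete, the sketch below illustrates the widely used "50-nucleotide rule" for flagging transcripts as likely targets of nonsense-mediated decay: a stop codon lying more than about 50 nt upstream of the last exon-exon junction is treated as premature. This is a minimal Python illustration under stated assumptions, not factR's actual code or API. The function name `is_nmd_sensitive`, its parameters, and the choice to measure distance from the last nucleotide of the stop codon are all hypothetical and chosen for clarity only.

```python
from typing import Sequence


def is_nmd_sensitive(exon_lengths: Sequence[int],
                     stop_end: int,
                     threshold: int = 50) -> bool:
    """Apply a simple 50-nt rule in transcript (spliced mRNA) coordinates.

    exon_lengths: exon lengths in 5'->3' transcript order (hypothetical input).
    stop_end:     1-based transcript position of the last nucleotide of the
                  stop codon (an assumed convention for this sketch).
    threshold:    minimum distance (nt) between the stop codon and the last
                  exon-exon junction for the transcript to be flagged.
    """
    transcript_length = sum(exon_lengths)
    if not 1 <= stop_end <= transcript_length:
        raise ValueError("stop_end must lie within the transcript")

    # A single-exon transcript has no exon-exon junction, so the rule
    # cannot be triggered.
    if len(exon_lengths) < 2:
        return False

    # Position of the last nucleotide before the final exon-exon junction.
    last_junction = sum(exon_lengths[:-1])

    # Flag the transcript only if the stop codon ends more than `threshold`
    # nt upstream of that junction. A stop codon in the last exon gives a
    # negative distance and is never flagged.
    return (last_junction - stop_end) > threshold


if __name__ == "__main__":
    exons = [200, 150, 300]  # last junction after transcript position 350
    print(is_nmd_sensitive(exons, stop_end=250))  # stop 100 nt upstream
    print(is_nmd_sensitive(exons, stop_end=320))  # stop 30 nt upstream
    print(is_nmd_sensitive(exons, stop_end=500))  # stop in the last exon
```

Working in transcript coordinates keeps the check independent of strand and intron length. A real annotation workflow would first need to derive the coding sequence and exon structure for each novel transcript from the assembled annotation.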