Comprehensive genome annotation of the model ciliate Tetrahymena thermophila by in-depth epigenetic and transcriptomic profiling.

Author information

Ye Fei, Chen Xiao, Li Yuan, Ju Aili, Sheng Yalan, Duan Lili, Zhang Jiachen, Zhang Zhe, Al-Rasheid Khaled A S, Stover Naomi A, Gao Shan

Affiliations

MOE Key Laboratory of Evolution & Marine Biodiversity and Institute of Evolution & Marine Biodiversity, Ocean University of China, Qingdao 266003, China.

Laboratory for Marine Biology and Biotechnology, Qingdao Marine Science and Technology Center, Qingdao 266237, China.

Publication information

Nucleic Acids Res. 2025 Jan 11;53(2). doi: 10.1093/nar/gkae1177.

Abstract

The ciliate Tetrahymena thermophila is a well-established unicellular model eukaryote, contributing significantly to foundational biological discoveries. Despite its acknowledged importance, current studies on Tetrahymena biology face challenges due to gene annotation inaccuracy, particularly the notable absence of untranslated regions (UTRs). To comprehensively annotate the Tetrahymena macronuclear genome, we collected extensive transcriptomic data spanning various cell stages. To ascertain transcript orientation and transcription start/end sites, we incorporated data on epigenetic marks displaying enrichment towards the 5' end of gene bodies, including H3 lysine 4 tri-methylation (H3K4me3), histone variant H2A.Z, nucleosome positioning and N6-methyldeoxyadenine (6mA). Cap-seq data was subsequently applied to validate the accuracy of identified transcription start sites. Additionally, we integrated Nanopore direct RNA sequencing (DRS), strand-specific RNA sequencing (RNA-seq) and assay for transposase-accessible chromatin with high-throughput sequencing (ATAC-seq) data. Using a newly developed bioinformatic pipeline, coupled with manual curation and experimental validation, our work yielded substantial improvements to the current gene models, including the addition of 2,481 new genes, updates to 23,936 existing genes, and the incorporation of 8,339 alternatively spliced isoforms. Furthermore, novel UTR information was annotated for 26,687 high-confidence genes. Intriguingly, 20% of protein-coding genes were identified to have natural antisense transcripts characterized by high diversity in alternative splicing, thus offering insights into understanding transcriptional regulation. Our work will enhance the utility of Tetrahymena as a robust genetic toolkit for advancing biological research, and provides a promising framework for genome annotation in other eukaryotes.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6517/11754650/2aea98590b2e/gkae1177figgra1.jpg
