Proteogenomics produces comprehensive and highly accurate protein-coding gene annotation in a complete genome assembly of Malassezia sympodialis.

Author information

Zhu Yafeng, Engström Pär G, Tellgren-Roth Christian, Baudo Charles D, Kennell John C, Sun Sheng, Billmyre R Blake, Schröder Markus S, Andersson Anna, Holm Tina, Sigurgeirsson Benjamin, Wu Guangxi, Sankaranarayanan Sundar Ram, Siddharthan Rahul, Sanyal Kaustuv, Lundeberg Joakim, Nystedt Björn, Boekhout Teun, Dawson Thomas L, Heitman Joseph, Scheynius Annika, Lehtiö Janne

Affiliations

Science for Life Laboratory, Department of Oncology-Pathology, Karolinska Institutet, 17121 Solna, Sweden.

Science for Life Laboratory, Department of Biochemistry and Biophysics, Stockholm University, 17121 Solna, Sweden.

Publication information

Nucleic Acids Res. 2017 Mar 17;45(5):2629-2643. doi: 10.1093/nar/gkx006.

PMID:28100699

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC5389616/

Abstract

Complete and accurate genome assembly and annotation is a crucial foundation for comparative and functional genomics. Despite this, few complete eukaryotic genomes are available, and genome annotation remains a major challenge. Here, we present a complete genome assembly of the skin commensal yeast Malassezia sympodialis and demonstrate how proteogenomics can substantially improve gene annotation. Through long-read DNA sequencing, we obtained a gap-free genome assembly for M. sympodialis (ATCC 42132), comprising eight nuclear and one mitochondrial chromosome. We also sequenced and assembled four M. sympodialis clinical isolates, and showed their value for understanding Malassezia reproduction by confirming four alternative allele combinations at the two mating-type loci. Importantly, we demonstrated how proteomics data could be readily integrated with transcriptomics data in standard annotation tools. This increased the number of annotated protein-coding genes by 14% (from 3612 to 4113), compared to using transcriptomics evidence alone. Manual curation further increased the number of protein-coding genes by 9% (to 4493). All of these genes have RNA-seq evidence and 87% were confirmed by proteomics. The M. sympodialis genome assembly and annotation presented here is at a quality yet achieved only for a few eukaryotic organisms, and constitutes an important reference for future host-microbe interaction studies.
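The percentages quoted in the abstract can be checked from the gene counts it reports. The short sketch below does that arithmetic; the function and variable names are illustrative only and are not taken from the paper.

```python
def percent_increase(before: int, after: int) -> float:
    """Relative increase from `before` to `after`, in percent."""
    return (after - before) / before * 100


# Gene counts as reported in the abstract.
rna_seq_only = 3612      # annotation from transcriptomics evidence alone
with_proteomics = 4113   # after integrating proteomics evidence
after_curation = 4493    # after manual curation

proteomics_gain = percent_increase(rna_seq_only, with_proteomics)
curation_gain = percent_increase(with_proteomics, after_curation)
overall_gain = percent_increase(rna_seq_only, after_curation)

print(f"Proteomics integration: +{proteomics_gain:.1f}%")
print(f"Manual curation:        +{curation_gain:.1f}%")
print(f"Overall:                +{overall_gain:.1f}%")
```

Rounded to whole numbers, the first two values match the 14% and 9% stated in the abstract. The overall figure (about 24% relative to the transcriptomics-only annotation) is derived here and is not stated in the source.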
