Li Wan-Chen, Huang Chien-Hao, Chen Chia-Ling, Chuang Yu-Chien, Tung Shu-Yun, Wang Ting-Fang
Taiwan International Graduate Program in Molecular and Cellular Biology, Academia Sinica, Taipei, 115 Taiwan.
Institute of Life Sciences, National Defense Medical Center, Taipei, 115 Taiwan.
Biotechnol Biofuels. 2017 Jul 3;10:170. doi: 10.1186/s13068-017-0825-x. eCollection 2017.
Trichoderma reesei QM6a is a model fungus for a broad spectrum of physiological phenomena, including plant cell wall degradation, industrial production of enzymes, light responses, conidiation, sexual development, polyketide biosynthesis, and plant-fungal interactions. The genomes of QM6a and its high enzyme-producing mutants have been sequenced by second-generation sequencing methods and are publicly available from the Joint Genome Institute. While these genome sequences have provided useful information for genomic and transcriptomic studies, their limitations, especially their short read lengths, make them poorly suited to certain biological problems, including assembly, genome-wide determination of chromosome architecture, and genetic modification or engineering.
We integrated the Pacific Biosciences and Illumina sequencing platforms to produce the highest-quality genome assembly yet achieved, revealing seven telomere-to-telomere chromosomes (34,922,528 bp; 10,877 genes) with 1630 newly predicted genes and >1.5 Mb of new sequences. Most new sequences are located in AT-rich blocks, including 7 centromeres, 14 subtelomeres, and 2329 interspersed AT-rich blocks. The seven QM6a centromeres separately consist of 24 conserved repeats and 37 putative centromere-encoded genes. These findings open up a new perspective for future studies of centromeres and chromosome architecture. Next, we demonstrate that sexual crossing readily induced cytosine-to-thymine point mutations in both tandem and unlinked duplicated sequences. We also show by bioinformatic analysis that QM6a has evolved a robust repeat-induced point mutation (RIP) system to accumulate AT-rich sequences, with longer AT-rich blocks having more RIP mutations. The widespread distribution of AT-rich blocks correlates genome-wide partitions with gene clusters, explaining why clustering of genes has been reported not to influence gene expression in QM6a.
Compartmentation of ancestral gene clusters by AT-rich blocks might promote flexibilities that are evolutionarily advantageous in this fungus' soil habitats and other natural environments. Our analyses, together with the complete genome sequence, provide a better blueprint for biotechnological and industrial applications.
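The idea of AT-rich blocks and RIP-driven AT enrichment can be illustrated with a minimal Python sketch. It is not the pipeline used in this study: the function names, the window size, the step, and the 0.7 AT-fraction cutoff are all hypothetical values chosen for illustration. The sketch scans a sequence in sliding windows, merges overlapping AT-rich windows into blocks, and computes the TpA/ApT dinucleotide ratio, which tends to be elevated in RIP-affected sequence because C-to-T mutation creates TpA dinucleotides.

```python
def at_fraction(seq):
    """Fraction of A/T bases in seq (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(base in "AT" for base in seq) / len(seq)


def find_at_rich_blocks(seq, window=100, step=50, threshold=0.7):
    """Return merged half-open (start, end) intervals of AT-rich windows.

    window, step, and threshold are illustrative assumptions,
    not parameters taken from the paper.
    """
    blocks = []
    last_start = max(len(seq) - window, 0)
    for start in range(0, last_start + 1, step):
        end = min(start + window, len(seq))
        if at_fraction(seq[start:end]) >= threshold:
            if blocks and start <= blocks[-1][1]:
                # Overlaps or touches the previous block: extend it.
                blocks[-1] = (blocks[-1][0], end)
            else:
                blocks.append((start, end))
    return blocks


def tpa_apt_ratio(seq):
    """TpA/ApT dinucleotide ratio; None if the sequence has no ApT."""
    seq = seq.upper()
    apt = seq.count("AT")
    if apt == 0:
        return None
    return seq.count("TA") / apt


if __name__ == "__main__":
    # Toy sequence: GC-rich flanks around a 200 bp AT-only core.
    toy = "GC" * 100 + "AT" * 100 + "GC" * 100
    for start, end in find_at_rich_blocks(toy):
        print(start, end, tpa_apt_ratio(toy[start:end]))
```

On a real assembly, the same functions could be applied per chromosome after reading the FASTA records, with the window and cutoff tuned to the genome in question.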