Wang Maojun, Yuan Daojun, Tu Lili, Gao Wenhui, He Yonghui, Hu Haiyan, Wang Pengcheng, Liu Nian, Lindsey Keith, Zhang Xianlong
National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan, 430070, China.
Integrative Cell Biology Laboratory, School of Biological and Biomedical Sciences, Durham University, South Road, Durham, DH1 3LE, UK.
New Phytol. 2015 Sep;207(4):1181-97. doi: 10.1111/nph.13429. Epub 2015 Apr 28.
Long noncoding RNAs (lncRNAs) are transcripts of at least 200 bp in length that possess no apparent coding capacity and are involved in various biological regulatory processes. Until now, no systematic identification of lncRNAs has been reported in cotton (Gossypium spp.). Here, we describe the identification of 30,550 long intergenic noncoding RNA (lincRNA) loci (50,566 transcripts) and 4,718 long noncoding natural antisense transcript (lncNAT) loci (5,826 transcripts). LncRNAs are rich in repetitive sequences and preferentially expressed in a tissue-specific manner. The detection of abundant genome-specific and/or lineage-specific lncRNAs indicated their weak evolutionary conservation. Approximately 76% of homoeologous lncRNAs exhibit biased expression patterns towards the At or Dt subgenomes. Compared with protein-coding genes, lncRNAs showed overall higher methylation levels and their expression was less affected by gene body methylation. Expression validation in different cotton accessions and coexpression network construction helped to identify several functional lncRNA candidates involved in cotton fibre initiation and elongation. Analysis of integrated expression from the subgenomes of lncRNAs generating miR397 and its targets as a result of genome polyploidization indicated their pivotal functions in regulating lignin metabolism in domesticated tetraploid cotton fibres. This study provides the first comprehensive identification of lncRNAs in Gossypium.