Department of Dairy Science, University of Wisconsin-Madison, Madison, WI 53706, USA.
Anim Genet. 2012 Dec;43(6):674-82. doi: 10.1111/j.1365-2052.2012.02325.x. Epub 2012 Feb 8.
It has become increasingly clear that mammalian genomes produce many long non-coding RNAs (lncRNAs). Accumulating evidence suggests important functions for lncRNAs in a variety of biological processes. However, little is known about the identity and characteristics of lncRNAs in cattle. Using public bovine-specific expressed sequence tag (EST) sequences, we reconstructed transcript assemblies, from which reference sequences were obtained for RNAs. Intergenic regions with evidence of transcription were screened for putative lncRNAs using a combination of a gene-finding program and a support vector machine-based tool that calculates protein-coding potential. A total of 449 putative lncRNAs located in 405 intergenic regions were identified. Characterization of these putative bovine lncRNAs suggests that they are generally expressed in a tissue-specific manner, that their GC contents are higher than those of randomly selected intergenic sequences but lower than those of protein-coding genes, and that they are moderately conserved among mammals. This is the first genome-wide catalogue of putative intergenic lncRNAs in cattle, and it provides important targets for functional studies.
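The two sequence-level ideas in the abstract, GC content and screening for low protein-coding potential, can be illustrated with a minimal sketch. This is not the authors' pipeline: the abstract names neither the gene-finding program nor the support vector machine-based tool, so a simple longest-open-reading-frame check stands in for them here. The function names and the `min_length` and `max_orf_nt` thresholds are hypothetical, chosen for illustration only.

```python
# Illustrative stand-in only: a longest-ORF heuristic replaces the
# gene finder and SVM-based tool mentioned in the abstract.
STOP_CODONS = {"TAA", "TAG", "TGA"}  # stop codons of the standard genetic code


def gc_content(seq: str) -> float:
    """Fraction of G and C among the unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / acgt


def longest_orf_length(seq: str) -> int:
    """Length in nucleotides of the longest ATG-to-stop ORF.

    Scans the three forward-strand reading frames only; the stop codon
    is included in the reported length.
    """
    seq = seq.upper()
    best = 0
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                best = max(best, i + 3 - start)
                start = None
    return best


def looks_noncoding(seq: str, min_length: int = 200, max_orf_nt: int = 300) -> bool:
    """Crude filter: long enough to count as 'long', with no long ORF.

    Both thresholds are assumptions for this sketch, not values taken
    from the study.
    """
    return len(seq) >= min_length and longest_orf_length(seq) <= max_orf_nt
```

In practice, a candidate transcript passing such a filter would only be a starting point. A trained classifier of coding potential, as described in the abstract, weighs many more features than ORF length alone.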