Burger G, Werner S
J Mol Biol. 1986 Oct 20;191(4):589-99. doi: 10.1016/0022-2836(86)90447-x.
The mitochondrial DNA of Neurospora crassa contains a long potential gene, designated URFN, which is located immediately downstream from the CO1 gene. The two genes are encoded in different reading frames and overlap by 13 codons. URFN is 633 triplets long and terminates at a UAG stop codon. Its codon usage is atypical for N. crassa mitochondrial exons and introns, and resembles that of the long open reading frame (ORF) of the mitochondrial plasmid present in N. crassa strain Mauriceville. Multiple sequence repetitions occur in the presumptive URFN polypeptide, most notably a motif of 16 to 18 amino acid residues in length that is reiterated seven times. The hydropathy pattern shows that the N-terminal third of the URFN polypeptide is predominantly apolar and includes several potentially membrane-spanning stretches; the remaining part is hydrophilic. Calculation of the secondary structure predicts a high proportion (47%) of alpha-helix conformation. The longest alpha-helix contains 40 residues. No similarities to other mitochondrial genes or reading frames have been found, except for a significant homology over a stretch of 16 amino acid residues between the N-terminal part of URFN and a well-conserved sequence in the C-terminal region of CO1. The repetitive region in URFN resembles a similarly repetitive stretch in an unassigned reading frame from bacteriophage lambda. Three arguments support the view that URFN is translated: (1) the open reading frame is of considerable length; (2) URFN is transcribed into an mRNA that also includes the overlapping CO1 gene; and (3) URFN is most probably conserved among all the Neurospora species examined thus far, strongly suggesting that it codes for an essential protein.
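As an illustration of the kind of hydropathy analysis mentioned above, the following is a minimal sketch of a sliding-window hydropathy profile. It assumes the Kyte-Doolittle scale, a 19-residue window, and a cutoff of 1.6 for candidate membrane-spanning stretches; the abstract does not state which scale or parameters the authors used, and the function names and demo sequence are hypothetical.

```python
# Kyte-Doolittle hydropathy index per amino acid (assumed scale;
# the abstract does not name the one actually used).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=19):
    """Return the mean hydropathy of every full window along seq."""
    seq = seq.upper()
    if window < 1 or window > len(seq):
        return []
    scores = [KD[aa] for aa in seq]
    # Running sum: add the residue entering the window, drop the one leaving.
    total = sum(scores[:window])
    profile = [total / window]
    for i in range(window, len(scores)):
        total += scores[i] - scores[i - window]
        profile.append(total / window)
    return profile


def apolar_windows(seq, window=19, threshold=1.6):
    """Return start indices of windows whose mean hydropathy meets the threshold."""
    profile = hydropathy_profile(seq, window)
    return [i for i, value in enumerate(profile) if value >= threshold]


if __name__ == "__main__":
    # Hypothetical toy sequence, not the URFN polypeptide:
    # an apolar stretch followed by a hydrophilic one.
    demo = "L" * 19 + "K" * 19
    print(apolar_windows(demo))
```

Windows that score above the cutoff are only candidates for membrane-spanning segments; a profile of this kind describes the apolar versus hydrophilic character of a region and does not by itself establish topology.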