Prediction of the coding sequences of unidentified human genes. I. The coding sequences of 40 new genes (KIAA0001-KIAA0040) deduced by analysis of randomly sampled cDNA clones from human immature myeloid cell line KG-1.

Author information

Nomura N, Miyajima N, Sazuka T, Tanaka A, Kawarabayasi Y, Sato S, Nagase T, Seki N, Ishikawa K, Tabata S

Affiliation

Institute of Gerontology, Nippon Medical School, Kanagawa, Japan.

Publication information

DNA Res. 1994;1(1):27-35. doi: 10.1093/dnares/1.1.27.

Abstract

We established a protocol for the prediction of the coding sequences of unidentified human genes based on the double selection and sequence analysis of cDNA clones with inserts carrying unreported 5'-terminal sequences and with insert sizes corresponding to nearly full-length transcripts. By applying the protocol, cDNA clones with inserts longer than 2 kb were isolated from a cDNA library of human immature myeloid cell line KG-1, and the coding sequences of 40 new genes were predicted. A computer search of the sequences indicated that 20 genes contained sequences similar to known genes in the GenBank/EMBL databases. The sequences of the remaining 20 genes were entirely new, and characteristic protein motifs or domains were identified in 32 genes. Other sequence features noted were that the coding sequences of 23 genes were followed by relatively long stretches of 3'-untranslated sequences and that 5 genes contained repetitive sequences in their 3'-untranslated regions. The chromosomal location of these genes has been determined. By increasing the scale of the above analysis, the coding sequences of many unidentified genes can be predicted.
