Makalowski W, Boguski MS
National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA.
Proc Natl Acad Sci U S A. 1998 Aug 4;95(16):9407-12. doi: 10.1073/pnas.95.16.9407.
We have rigorously defined 2,820 orthologous mRNA and protein sequence pairs from rats, mice, and humans. Evolutionary rate analyses indicate that mammalian genes are evolving 17-30% more slowly than previous textbook values. Data are presented on the average properties of mRNA and protein sequences, on variations in sequence conservation in coding and noncoding regions, and on the absolute and relative frequencies of repetitive elements and splice sites in untranslated regions of mRNAs. Our data set contains 1,880 unique human/rodent sequence pairs that represent about 2-4% of all mammalian genes. Of the 1,880 human orthologs, 70% are present on a new gene map of the human genome, thus providing a valuable resource for cross-referencing human and rodent genomes. In addition to comparative mapping, these results have practical applications in the interpretation of noncoding sequence conservation between syntenic regions of human and mouse genomic sequence, and in the design and calibration of gene expression arrays.
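The proportions quoted in the abstract can be turned into absolute counts with simple arithmetic. The sketch below is illustrative only: the function and constant names are hypothetical, and the implied total gene count is a back-calculation from the "about 2-4%" figure, not a number the authors state.

```python
# Figures quoted in the abstract.
UNIQUE_PAIRS = 1880                       # unique human/rodent sequence pairs
MAPPED_FRACTION = 0.70                    # share of human orthologs on the gene map
COVERAGE_LOW, COVERAGE_HIGH = 0.02, 0.04  # "about 2-4% of all mammalian genes"


def mapped_orthologs(pairs=UNIQUE_PAIRS, fraction=MAPPED_FRACTION):
    """Approximate number of human orthologs present on the gene map."""
    return round(pairs * fraction)


def implied_gene_count_range(pairs=UNIQUE_PAIRS, low=COVERAGE_LOW, high=COVERAGE_HIGH):
    """Total gene count implied by the coverage range (an inference, not a reported value)."""
    # Higher coverage implies a smaller total, so the lower bound uses `high`.
    return round(pairs / high), round(pairs / low)


if __name__ == "__main__":
    print(mapped_orthologs())
    print(implied_gene_count_range())
```

Under these assumptions, roughly 1,316 of the 1,880 human orthologs would be on the gene map, and the 2-4% coverage figure corresponds to an assumed total of roughly 47,000 to 94,000 mammalian genes.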