Newberg Lee A, Lawrence Charles E
New York State Department of Health Wadsworth Center & Rensselaer Polytechnic Institute, USA.
Stat Appl Genet Mol Biol. 2004;3:Article23. doi: 10.2202/1544-6115.1065. Epub 2004 Sep 30.
Under the assumption that a significant motivation for sequencing the genomes of mammals is the resulting ability to locate and characterize functional DNA segments shared with humans, we have developed a statistical analysis to quantify the expected advantage. Examining uncertainty in terms of the width of a confidence interval, we show that uncertainty in the rate of nucleotide mutation can be shrunk by a factor of nearly four when nine mammals (human, chimpanzee, baboon, cat, dog, cow, pig, rat, and mouse) are used instead of just two (human and mouse). In contrast, we show confidence interval shrinkage by a factor of only 1.5 for measurements of the distribution of nucleotides at an aligned sequence site. These additional genomes should greatly help in identifying conserved DNA sites, but would be much less effective at precisely describing the expected pattern of nucleotides at those sites.