Agüero F, Verdún R E, Frasch A C, Sánchez D O
Instituto de Investigaciones Biotecnológicas, Instituto Tecnológico de Chascomús, Universidad Nacional de General San Martín, Consejo Nacional de Investigaciones Científicas y Técnicas, San Martín, Provincia de Buenos Aires, Argentina.
Genome Res. 2000 Dec;10(12):1996-2005. doi: 10.1101/gr.gr-1463r.
A random sequence survey of the genome of Trypanosoma cruzi, the agent of Chagas disease, was performed and 11,459 genomic sequences were obtained, resulting in approximately 4.3 Mb of readable sequence, or approximately 10% of the parasite haploid genome. The estimated total GC content was 50.9%, with a high representation of A and T di- and trinucleotide repeats. Out of the estimated 5000 parasite genes, 947 putative new genes were identified. Another 1723 sequences corresponded to genes detected previously in T. cruzi through expressed sequence tag analysis. A further 7735 sequences had no matches in the database, but the presence of open reading frames that passed Fickett's test suggests that some might contain coding DNA. The survey was highly redundant, with approximately 35% of the sequences falling into a few large sequence families. Some of these families code for proteins present in dozens of copies, including proteins essential for parasite survival, and for retrotransposons. Other sequence families consist of repetitive DNA present in thousands of copies per haploid genome; some families in this latter group are new, parasite-specific repetitive DNAs. These results suggest that T. cruzi could constitute an interesting model for analyzing gene and genome evolution, owing to its plasticity in terms of sequence amplification and divergence. Additional information can be found at http://www.iib.unsam.edu.ar/tcruzi.gss.html.
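As a rough back-of-the-envelope check on the figures quoted in the abstract, the sketch below derives the implied mean read length, the implied haploid genome size, and the share of sequences without database matches. The names (`survey_summary` and the constants) are illustrative and do not come from the paper. The sketch also assumes that the three reported categories are disjoint and treats the approximate values (4.3 Mb, 10%) as exact purely for the arithmetic.

```python
# Figures quoted in the abstract (approximate in the source).
N_SEQUENCES = 11_459      # genomic survey sequences obtained
READABLE_BP = 4.3e6       # ~4.3 Mb of readable sequence
GENOME_FRACTION = 0.10    # ~10% of the haploid genome

# Sequence categories reported in the abstract.
NEW_GENES = 947           # putative new genes
EST_MATCHES = 1_723       # matches to previously detected T. cruzi genes
NO_MATCH = 7_735          # no database match


def survey_summary():
    """Hypothetical helper: derive rough quantities from the quoted numbers."""
    # Assumption: the three categories do not overlap.
    categorised = NEW_GENES + EST_MATCHES + NO_MATCH
    return {
        # Average readable length per sequence, in base pairs.
        "mean_read_bp": READABLE_BP / N_SEQUENCES,
        # Haploid genome size implied by "4.3 Mb is about 10%", in Mb.
        "implied_genome_mb": READABLE_BP / GENOME_FRACTION / 1e6,
        "categorised": categorised,
        # Sequences not covered by the three categories named in the abstract.
        "uncategorised": N_SEQUENCES - categorised,
        "no_match_fraction": NO_MATCH / N_SEQUENCES,
    }


if __name__ == "__main__":
    for key, value in survey_summary().items():
        print(f"{key}: {value}")
```

Under these assumptions the mean readable length comes out at roughly 375 bp per sequence and the implied haploid genome at roughly 43 Mb. The three named categories account for 10,405 of the 11,459 sequences; the abstract does not state how the remaining 1054 were classified.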