Flasch Diane A, Rebman Ellen K, Olfson Emily H, Nguyen Khanh K, Geirut Lucasz E, Garland Megan C, Lindorfer Christi M, Laten Howard M
Department of Biology, Loyola University Chicago, Chicago, IL 60626, USA
In Silico Biol. 2008;8(5-6):531-43.
SIRE1 is a 2000-copy member of the Ty1/copia retroelement family found in the soybean genome and is closely related to sireviruses found in the genomes of other legumes. Although these elements closely resemble typical plant members of the Ty1/copia family, they are unusual in that they possess an envelope-like coding region immediately downstream of the reverse transcriptase gene. Despite its high copy number, very few members of the SIRE1 family are currently present in publicly available genomic assemblies or draft contigs. However, fragments of family members are well represented as BAC-ends in the GenBank Genome Survey Sequence database. This database was queried using the 5' and 3' ends of SIRE1 to catalog the sequences into which SIRE1 members have integrated. Seven hundred and eighty-one unique SIRE1 insertions were identified, and the majority of insertion sites lay within other repetitive elements, including Class I and Class II transposable elements and satellite DNAs. Ninety-four insertions were in single- or low-copy-number sequences, and three of these were homologous to characterized protein-coding genes. Examination of the ten bases flanking either side of SIRE1 revealed no clear consensus sequence, but the distributions of A, C, G, and T at most of the positions were biased, with strong statistical significance.
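The abstract does not say which statistical test was used for the flanking-base analysis. As a minimal sketch of one way such positional bias could be assessed, the Python below applies a chi-square goodness-of-fit test to each flanking position. The function names, the uniform 25% background frequencies, and the p = 0.001 threshold are all illustrative assumptions, not the authors' method; a realistic analysis would likely use the measured soybean base composition as the background.

```python
from collections import Counter

BASES = "ACGT"
# Chi-square critical value for 3 degrees of freedom at p = 0.001.
# The threshold is an assumption; the abstract gives no significance level.
CHI2_CRIT_DF3_P001 = 16.266


def position_counts(flanks):
    """Count bases in each column of equal-length flanking sequences."""
    length = len(flanks[0])
    if any(len(seq) != length for seq in flanks):
        raise ValueError("all flanking sequences must have the same length")
    return [Counter(seq[i].upper() for seq in flanks) for i in range(length)]


def chi_square(counts, background=None):
    """Goodness-of-fit statistic for one position.

    `background` maps each base to its expected frequency. The uniform
    default is an assumption made for illustration only.
    """
    background = background or {b: 0.25 for b in BASES}
    total = sum(counts[b] for b in BASES)  # ambiguous bases (e.g. N) are ignored
    if total == 0:
        return 0.0
    stat = 0.0
    for b in BASES:
        expected = total * background[b]
        stat += (counts[b] - expected) ** 2 / expected
    return stat


def biased_positions(flanks, background=None, critical=CHI2_CRIT_DF3_P001):
    """Return 0-based indices of positions whose statistic exceeds `critical`."""
    return [
        i
        for i, counts in enumerate(position_counts(flanks))
        if chi_square(counts, background) > critical
    ]
```

In use, `flanks` would hold the ten bases on one side of each insertion, all oriented the same way relative to the element, and `biased_positions(flanks)` would list the positions where the base distribution departs from the chosen background. Because ten positions on each side are tested, a multiple-testing correction would be a sensible addition.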