Sonstegard Tad S, Capuco Anthony V, White Joseph, Van Tassell Curtis P, Connor Erin E, Cho Jennifer, Sultana Razvan, Shade Larry, Wray James E, Wells Kevin D, Quackenbush John
USDA, ARS, Beltsville Agricultural Research Center, Beltsville, Maryland 20705, USA.
Mamm Genome. 2002 Jul;13(7):373-9. doi: 10.1007/s00335-001-2145-4.
Functional genomic studies of the mammary gland require an appropriate collection of cDNA sequences to assess gene expression patterns from the different developmental and operational states of underlying cell types. To better capture the range of gene expression, a normalized cDNA library was constructed from pooled bovine mammary tissues, and 23,202 expressed sequence tags (EST) were produced and deposited into GenBank. Assembly of these EST with sequences in the Bos taurus Gene Index (BtGI) helped to form 5751 of the current 23,883 tentative consensus (TC) sequences. The majority (87%) of these 5751 assemblies contained only one to three mammary-derived EST. In contrast, 18% of the mammary EST assembled with TC sequences corresponding to 12 genes. These results suggest library normalization was only partially effective, because the reduction in EST for genes abundantly transcribed during lactation could be attributed to pooling. To better assess novel content in the mammary library and to add to existing annotation of all bovine sequence elements, gene ontology assignments were made and comparative sequence analyses were performed against human genome sequence, human and rodent gene indices, and an index of orthologous alignments of genes across eukaryotes (TOGA); the results were added to existing BtGI annotation. Over 35,000 of the bovine elements significantly matched human genome sequence, and the positions of some alignments (3%) were unique relative to those using human expressed sequences. Because 3445 TC sequences had no significant match with any data set, mammary-derived cDNA clones representing 23 of these elements were analyzed further for expression and novelty. Only one clone met criteria suggesting the corresponding gene was a divergent ortholog or expressed sequence unique to cattle. These results demonstrate that bovine sequence expression data serve as a resource for characterizing mammalian transcriptomes and identifying those genes potentially unique to ruminants.