Hwang D M, Dempsey A A, Wang R X, Rezvani M, Barrans J D, Dai K S, Wang H Y, Ma H, Cukerman E, Liu Y Q, Gu J R, Zhang J H, Tsui S K, Waye M M, Fung K P, Lee C Y, Liew C C
Department of Laboratory Medicine, Centre for Cardiovascular Research, The Toronto Hospital, University of Toronto, Ontario, Canada.
Circulation. 1997 Dec 16;96(12):4146-203. doi: 10.1161/01.cir.96.12.4146.
Large-scale partial sequencing of cDNA libraries to generate expressed sequence tags (ESTs) is an effective means of discovering novel genes and characterizing transcription patterns in different tissues. To catalogue the identities and expression levels of genes in the cardiovascular system, we initiated large-scale sequencing and analysis of human cardiac cDNA libraries.
Using automated DNA sequencing, we generated 43,285 ESTs from human heart cDNA libraries. An additional 41,619 ESTs were retrieved from public databases, for a total of 84,904 ESTs representing more than 26 million nucleotides of raw cDNA sequence data from 13 independent cardiovascular system-based cDNA libraries. Of these, 55% matched known genes in the GenBank/EMBL/DDBJ databases, 33% matched only other ESTs, and 12% did not match any known sequence (designated cardiovascular system-based ESTs, or CVbESTs). ESTs that matched known genes were classified according to function, which allowed detection of differences in general transcription patterns between various tissues and developmental stages of the cardiovascular system. In silico Northern analysis of known gene matches identified widely expressed cardiovascular genes as well as genes putatively exhibiting greater tissue specificity or developmental stage specificity. More detailed analysis identified 48 genes potentially overexpressed in cardiac hypertrophy, at least 10 of which were previously documented as differentially expressed. Computer-based chromosomal localizations of 1048 cardiac ESTs were performed to further assist in the search for disease-related genes.
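The abstract does not describe how the in silico Northern analysis was computed. As a rough illustration of the general idea only, the sketch below counts ESTs matching each gene in each library and normalizes by library size so that libraries of different depth can be compared. The function name `in_silico_northern`, the library labels, the gene names, and all counts are hypothetical and are not taken from the study.

```python
from collections import Counter


def in_silico_northern(est_hits, library_sizes, per=10_000):
    """Estimate relative expression from EST counts (illustrative sketch).

    est_hits: iterable of (library, gene) pairs, one pair per EST that
        matched a known gene.
    library_sizes: dict mapping library name to the total number of ESTs
        sequenced from that library.
    Returns a dict: gene -> {library: ESTs per `per` ESTs sequenced}.
    """
    counts = Counter(est_hits)
    profile = {}
    for (library, gene), n in counts.items():
        # Normalize by library size so deeper libraries do not dominate.
        profile.setdefault(gene, {})[library] = n * per / library_sizes[library]
    return profile


# Made-up toy data, not from the paper.
hits = (
    [("fetal_heart", "GENE_A")] * 3
    + [("adult_heart", "GENE_A")] * 1
    + [("adult_heart", "GENE_B")] * 2
)
sizes = {"fetal_heart": 1000, "adult_heart": 500}

profile = in_silico_northern(hits, sizes)
for gene, by_library in sorted(profile.items()):
    print(gene, by_library)
```

A gene that appears in only one library's profile, such as the hypothetical `GENE_B` here, would be a candidate for tissue or stage specificity, although with so few ESTs such a difference could easily arise by chance.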
These data represent the most extensive compilation of cardiovascular gene expression information to date. They further demonstrate the untapped potential of genome research for investigating questions related to cardiovascular biology and represent a first-generation genome-based resource for molecular cardiovascular medicine.