Luan Chi-Hao, Qiu Shihong, Finley James B, Carson Mike, Gray Rita J, Huang Wenying, Johnson David, Tsao Jun, Reboul Jérôme, Vaglio Philippe, Hill David E, Vidal Marc, Delucas Lawrence J, Luo Ming
Center for Biophysical Sciences and Engineering, Southeast Collaboratory for Structural Genomics, University of Alabama at Birmingham, Birmingham, Alabama 35294, USA.
Genome Res. 2004 Oct;14(10B):2102-10. doi: 10.1101/gr.2520504.
Proteome-scale studies of protein three-dimensional structures should provide valuable information for both investigating basic biology and developing therapeutics. Critical for these endeavors is the expression of recombinant proteins. We selected Caenorhabditis elegans as our model organism in a structural proteomics initiative because of the high quality of its genome sequence and the availability of its ORFeome, the set of protein-encoding open reading frames (ORFs), in a flexible recombinational cloning format. We developed a robotic pipeline for recombinant protein expression, applying the Gateway cloning/expression technology and utilizing a stepwise automation strategy on an integrated robotic platform. Using the pipeline, we have carried out heterologous protein expression experiments on 10,167 ORFs of C. elegans. With one expression vector and one Escherichia coli strain, protein expression was observed for 4,854 ORFs, of which 1,536 yielded soluble protein. Bioinformatics analysis of the data indicates that protein hydrophobicity is a key determining factor for an ORF to yield a soluble expression product. This protein expression effort has investigated the largest number of genes in any organism to date. The pipeline described here is applicable to high-throughput expression of recombinant proteins for other species, both prokaryotic and eukaryotic, provided that ORFeome resources become available.
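As a rough illustration of what the reported counts imply, the following Python sketch derives the expression and solubility rates from the three figures given in the abstract. It assumes that the soluble ORFs are a subset of the ORFs with observed expression; the variable and function names are illustrative and do not come from the original work.

```python
# Counts as reported in the abstract.
TOTAL_ORFS = 10_167   # ORFs taken through the expression pipeline
EXPRESSED = 4_854     # ORFs with observed protein expression
SOLUBLE = 1_536       # ORFs yielding soluble protein (assumed subset of EXPRESSED)


def fraction(part: int, whole: int) -> float:
    """Return part / whole, rejecting a non-positive denominator."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return part / whole


expressed_rate = fraction(EXPRESSED, TOTAL_ORFS)      # expressed / tested
soluble_of_tested = fraction(SOLUBLE, TOTAL_ORFS)     # soluble / tested
soluble_of_expressed = fraction(SOLUBLE, EXPRESSED)   # soluble / expressed

print(f"Expressed, of ORFs tested:     {expressed_rate:.1%}")
print(f"Soluble, of ORFs tested:       {soluble_of_tested:.1%}")
print(f"Soluble, of expressed ORFs:    {soluble_of_expressed:.1%}")
```

Under these assumptions, roughly 48% of the tested ORFs showed expression, about 15% of the tested ORFs gave soluble protein, and about 32% of the expressed ORFs were soluble.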