Automatic target selection for structural genomics on eukaryotes.

Authors

Liu Jinfeng, Hegyi Hedi, Acton Thomas B, Montelione Gaetano T, Rost Burkhard

Affiliation

CUBIC, Department of Biochemistry and Molecular Biophysics, Columbia University, New York, New York 10032, USA.

Publication information

Proteins. 2004 Aug 1;56(2):188-200. doi: 10.1002/prot.20012.

DOI: 10.1002/prot.20012
PMID: 15211504

Abstract

A central goal of structural genomics is to experimentally determine representative structures for all protein families. At least 14 structural genomics pilot projects are currently investigating the feasibility of high-throughput structure determination; the National Institutes of Health funded nine of these in the United States. Initiatives differ in the particular subset of "all families" on which they focus. At the NorthEast Structural Genomics consortium (NESG), we target eukaryotic protein domain families. The automatic target selection procedure has three aims: 1) identify all protein domain families from currently five entirely sequenced eukaryotic target organisms based on their sequence homology, 2) discard those families that can be modeled on the basis of structural information already present in the PDB, and 3) target representatives of the remaining families for structure determination. To guarantee that all members of one family share a common foldlike region, we had to begin by dissecting proteins into structural domain-like regions before clustering. Our hierarchical approach, CHOP, utilizing homology to PrISM, Pfam-A, and SWISS-PROT chopped the 103,796 eukaryotic proteins/ORFs into 247,222 fragments. Of these fragments, 122,999 appeared suitable targets that were grouped into >27,000 singletons and >18,000 multifragment clusters. Thus, our results suggested that it might be necessary to determine >40,000 structures to minimally cover the subset of five eukaryotic proteomes.
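
The three aims of the selection procedure can be illustrated with a small Python sketch. It is not the NESG pipeline or the CHOP method: the `Fragment` record, the precomputed `modelable_from_pdb` flag, the single-linkage clustering over a list of homologous pairs, and the rule of picking the longest member as representative are all assumptions made for illustration, since the abstract does not specify them.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Fragment:
    """Hypothetical record for one domain-like fragment."""
    frag_id: str
    length: int
    modelable_from_pdb: bool  # assumed to be computed elsewhere


def cluster_fragments(fragments, homologous_pairs):
    """Group fragments into families by single linkage (union-find).

    Single linkage is an illustrative choice, not the published method.
    """
    parent = {f.frag_id: f.frag_id for f in fragments}

    def find(x):
        # Follow parent links to the root, halving the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in homologous_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    families = defaultdict(list)
    for f in fragments:
        families[find(f.frag_id)].append(f)
    return list(families.values())


def select_targets(fragments, homologous_pairs):
    """Return one representative per family not modelable from the PDB."""
    targets = []
    for family in cluster_fragments(fragments, homologous_pairs):
        # Aim 2: discard families with any member that can be modeled.
        if any(f.modelable_from_pdb for f in family):
            continue
        # Aim 3: pick a representative (assumption: longest, ties by id).
        targets.append(min(family, key=lambda f: (-f.length, f.frag_id)))
    return targets


# Toy data: two multifragment families and one singleton.
demo_fragments = [
    Fragment("a", 100, False),
    Fragment("b", 120, False),
    Fragment("c", 80, True),
    Fragment("d", 90, False),
    Fragment("e", 60, False),
]
demo_pairs = [("a", "b"), ("c", "d")]
print(sorted(t.frag_id for t in select_targets(demo_fragments, demo_pairs)))
```

In this toy run the family containing `c` is dropped because one member can already be modeled, while the `a`/`b` family and the singleton `e` each contribute one target. Counting one target per singleton and per multifragment cluster in the same way is consistent with the abstract's estimate of more than 40,000 structures from more than 27,000 singletons and more than 18,000 clusters.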

Similar articles

1
Automatic target selection for structural genomics on eukaryotes.
Proteins. 2004 Aug 1;56(2):188-200. doi: 10.1002/prot.20012.
2
The protein target list of the Northeast Structural Genomics Consortium.
Proteins. 2004 Aug 1;56(2):181-7. doi: 10.1002/prot.20091.
3
Target space for structural genomics revisited.
Bioinformatics. 2002 Jul;18(7):922-33. doi: 10.1093/bioinformatics/18.7.922.
4
CHOP proteins into structural domain-like fragments.
Proteins. 2004 May 15;55(3):678-88. doi: 10.1002/prot.20095.
5
Coverage of protein sequence space by current structural genomics targets.
J Struct Funct Genomics. 2003;4(2-3):47-55. doi: 10.1023/a:1026156025612.
6
Comprehensive analysis of orthologous protein domains using the HOPS database.
Genome Res. 2003 Oct;13(10):2353-62. doi: 10.1101/gr1305203.
7
Implications of structural genomics target selection strategies: Pfam5000, whole genome, and random approaches.
Proteins. 2005 Jan 1;58(1):166-79. doi: 10.1002/prot.20298.
8
[Comparative genomics and proteomics of Drosophila, Brenner's nematode, and Arabidopsis: identification of functionally similar genes and proteins of meiotic chromosome synapsis].
Genetika. 2002 Aug;38(8):1078-89.
9
PSI-2: structural genomics to cover protein domain family space.
Structure. 2009 Jun 10;17(6):869-81. doi: 10.1016/j.str.2009.03.015.
10
Comprehensive genome analysis of 203 genomes provides structural genomics with new insights into protein family space.
Nucleic Acids Res. 2006 Feb 15;34(3):1066-80. doi: 10.1093/nar/gkj494. Print 2006.

Cited by

1
Nearest neighbor search on embeddings rapidly identifies distant protein relations.
Front Bioinform. 2022 Nov 17;2:1033775. doi: 10.3389/fbinf.2022.1033775. eCollection 2022.
2
Implementation of homology based and non-homology based computational methods for the identification and annotation of orphan enzymes: using Mycobacterium tuberculosis H37Rv as a case study.
BMC Bioinformatics. 2020 Oct 19;21(1):466. doi: 10.1186/s12859-020-03794-x.
3
The accurate assessment of small-angle X-ray scattering data.
Acta Crystallogr D Biol Crystallogr. 2015 Jan 1;71(Pt 1):45-56. doi: 10.1107/S1399004714010876.
4
FreeContact: fast and free software for protein contact prediction from residue co-evolution.
BMC Bioinformatics. 2014 Mar 26;15:85. doi: 10.1186/1471-2105-15-85.
5
Selecting targets from eukaryotic parasites for structural genomics and drug discovery.
Methods Mol Biol. 2014;1140:53-9. doi: 10.1007/978-1-4939-0354-2_4.
6
High throughput platforms for structural genomics of integral membrane proteins.
Curr Opin Struct Biol. 2011 Aug;21(4):517-22. doi: 10.1016/j.sbi.2011.07.001. Epub 2011 Jul 30.
7
Preparation of protein samples for NMR structure, function, and small-molecule screening studies.
Methods Enzymol. 2011;493:21-60. doi: 10.1016/B978-0-12-381274-2.00002-9.
8
PSI:Biology-materials repository: a biologist's resource for protein expression plasmids.
J Struct Funct Genomics. 2011 Jul;12(2):55-62. doi: 10.1007/s10969-011-9100-8. Epub 2011 Mar 1.
9
XANNpred: neural nets that predict the propensity of a protein to yield diffraction-quality crystals.
Proteins. 2011 Apr;79(4):1027-33. doi: 10.1002/prot.22914. Epub 2011 Jan 18.
10
The New York Consortium on Membrane Protein Structure (NYCOMPS): a high-throughput platform for structural genomics of integral membrane proteins.
J Struct Funct Genomics. 2010 Sep;11(3):191-9. doi: 10.1007/s10969-010-9094-7. Epub 2010 Aug 6.