The Sorcerer II Global Ocean Sampling expedition: expanding the universe of protein families.

Author information

Yooseph Shibu, Sutton Granger, Rusch Douglas B, Halpern Aaron L, Williamson Shannon J, Remington Karin, Eisen Jonathan A, Heidelberg Karla B, Manning Gerard, Li Weizhong, Jaroszewski Lukasz, Cieplak Piotr, Miller Christopher S, Li Huiying, Mashiyama Susan T, Joachimiak Marcin P, van Belle Christopher, Chandonia John-Marc, Soergel David A, Zhai Yufeng, Natarajan Kannan, Lee Shaun, Raphael Benjamin J, Bafna Vineet, Friedman Robert, Brenner Steven E, Godzik Adam, Eisenberg David, Dixon Jack E, Taylor Susan S, Strausberg Robert L, Frazier Marvin, Venter J Craig

Affiliation

J. Craig Venter Institute, Rockville, Maryland, United States of America.

Publication information

PLoS Biol. 2007 Mar;5(3):e16. doi: 10.1371/journal.pbio.0050016.


DOI: 10.1371/journal.pbio.0050016
PMID: 17355171
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC1821046/
Abstract

Metagenomics projects based on shotgun sequencing of populations of micro-organisms yield insight into protein families. We used sequence similarity clustering to explore proteins with a comprehensive dataset consisting of sequences from available databases together with 6.12 million proteins predicted from an assembly of 7.7 million Global Ocean Sampling (GOS) sequences. The GOS dataset covers nearly all known prokaryotic protein families. A total of 3,995 medium- and large-sized clusters consisting of only GOS sequences are identified, out of which 1,700 have no detectable homology to known families. The GOS-only clusters contain a higher than expected proportion of sequences of viral origin, thus reflecting a poor sampling of viral diversity until now. Protein domain distributions in the GOS dataset and current protein databases show distinct biases. Several protein domains that were previously categorized as kingdom specific are shown to have GOS examples in other kingdoms. About 6,000 sequences (ORFans) from the literature that heretofore lacked similarity to known proteins have matches in the GOS data. The GOS dataset is also used to improve remote homology detection. Overall, besides nearly doubling the number of current proteins, the predicted GOS proteins also add a great deal of diversity to known protein families and shed light on their evolution. These observations are illustrated using several protein families, including phosphatases, proteases, ultraviolet-irradiation DNA damage repair enzymes, glutamine synthetase, and RuBisCO. The diversity added by GOS data has implications for choosing targets for experimental structure characterization as part of structural genomics efforts. Our analysis indicates that new families are being discovered at a rate that is linear or almost linear with the addition of new sequences, implying that we are still far from discovering all protein families in nature.

