An analysis of the Sargasso Sea resource and the consequences for database composition.

Authors

Tress Michael L, Cozzetto Domenico, Tramontano Anna, Valencia Alfonso

Affiliation

Protein Design Group, CNB-CSIC, Calle Darwin, Cantoblanco 28049 Madrid, Spain.

Publication

BMC Bioinformatics. 2006 Apr 19;7:213. doi: 10.1186/1471-2105-7-213.

DOI:10.1186/1471-2105-7-213
PMID:16623953
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC1513258/
Abstract

BACKGROUND

The environmental sequencing of the Sargasso Sea has introduced a huge new resource of genomic information. Unlike the protein sequences held in the current searchable databases, the Sargasso Sea sequences originate from a single marine environment and have been sequenced from species that are not easily obtainable by laboratory cultivation. The resource also contains very many fragments of whole protein sequences, a side effect of the shotgun sequencing method. These sequences form a significant addendum to the current searchable databases but also present us with some intrinsic difficulties. While it is important to know whether it is possible to assign function to these sequences with the current methods and whether they will increase our capacity to explore sequence space, it is also interesting to know how current bioinformatics techniques will deal with the new sequences in the resource.

RESULTS

The Sargasso Sea sequences seem to introduce a bias that decreases the potential of current methods to propose structure and function for new proteins. In particular the high proportion of sequence fragments in the resource seems to result in poor quality multiple alignments.

CONCLUSION

These observations suggest that the new sequences should be used with care, especially if the information is to be used in large scale analyses. On a positive note, the results may just spark improvements in computational and experimental methods to take into account the fragments generated by environmental sequencing techniques.
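
One way to act on the caution above is to screen out likely fragments before a multiple alignment is built. The following is a minimal sketch of that idea, not the method used in the paper: the function names `drop_fragments` and `fragment_fraction`, the length-coverage criterion, and the 0.7 threshold are all illustrative assumptions.

```python
from typing import Dict


def drop_fragments(
    hits: Dict[str, str], query_length: int, min_coverage: float = 0.7
) -> Dict[str, str]:
    """Keep sequences at least min_coverage times as long as the query.

    The length-ratio criterion and the default threshold are assumptions
    chosen for illustration; they are not taken from the paper.
    """
    if query_length <= 0:
        raise ValueError("query_length must be positive")
    return {
        name: seq
        for name, seq in hits.items()
        if len(seq) / query_length >= min_coverage
    }


def fragment_fraction(
    hits: Dict[str, str], query_length: int, min_coverage: float = 0.7
) -> float:
    """Return the share of hits that the length filter would discard."""
    if not hits:
        return 0.0
    kept = drop_fragments(hits, query_length, min_coverage)
    return 1.0 - len(kept) / len(hits)


if __name__ == "__main__":
    # Hypothetical hits against a 100-residue query.
    example_hits = {
        "full_length": "M" * 100,
        "near_full": "M" * 80,
        "fragment": "M" * 30,
    }
    print(sorted(drop_fragments(example_hits, query_length=100)))
    print(round(fragment_fraction(example_hits, query_length=100), 3))
```

A length ratio is only a crude proxy for fragment status. A real pipeline would more likely use the aligned coverage reported by the search tool, and the threshold would need tuning for each analysis.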

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e27b/1513258/35d136f7c884/1471-2105-7-213-6.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e27b/1513258/a28a95531760/1471-2105-7-213-1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e27b/1513258/2ca21f3d3519/1471-2105-7-213-2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e27b/1513258/4ae8b456204e/1471-2105-7-213-3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e27b/1513258/bdf4c601c412/1471-2105-7-213-4.jpg