
A standardized format for sequence data exchange.

Authors

George D G, Mewes H W, Kihara H

Affiliation

Protein Identification Resource, Georgetown University Medical Center, Washington, DC 20007.

Publication

Protein Seq Data Anal. 1987;1(1):27-39.

PMID: 3447154

Abstract

At present there is no agreement upon a standard format for the presentation of sequence data; each of the major sequence databases has adopted its own format. As a result, efforts to pool these data and to develop software to manipulate the data have been hampered. A significant amount of software development time must be invested to handle the incompatibilities among these formats before software to solve biologically interesting problems can be implemented. In principle, the development of a standard format by the database distributors would be the best solution. However, because the databases have invested years of effort in the development of procedures specifically tailored to their own format, they are reluctant to change. Insisting that they convert to a new format would place an extreme burden on the already overtaxed resources of these groups. Furthermore, for certain specialized applications it is more efficient to present the data in nonstandard formats. An alternative solution is presented here. Rather than develop a single standard format for all sequence data, a standardized exchange format has been developed. This format was designed to serve as a common interface between the major formats currently in use. Data can be easily converted to and from it without significant loss of information. This alleviates difficulties inherent in dealing with multiple formats while preserving the local formats of the various databases.
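
The "common interface" idea in the abstract can be illustrated with a minimal sketch. It assumes a hypothetical `ExchangeRecord` with three fields and a made-up line-tagged layout (`ID`/`DE`/`SQ`); neither is the exchange format defined in the paper, nor the native format of any database. The point is only that each local format needs one reader and one writer to the shared record, so converting between any two formats goes through that record.

```python
from dataclasses import dataclass


@dataclass
class ExchangeRecord:
    """Hypothetical common record; NOT the format defined in the paper."""
    identifier: str
    description: str
    sequence: str


def read_fasta(text: str) -> ExchangeRecord:
    """Parse a single FASTA entry into the common record."""
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    header = lines[0][1:]  # drop the leading '>'
    identifier, _, description = header.partition(" ")
    return ExchangeRecord(identifier, description, "".join(lines[1:]))


def write_fasta(rec: ExchangeRecord) -> str:
    """Write the common record as a single FASTA entry."""
    header = f">{rec.identifier} {rec.description}".rstrip()
    return f"{header}\n{rec.sequence}\n"


def read_tagged(text: str) -> ExchangeRecord:
    """Parse the made-up ID/DE/SQ layout (illustrative assumption only)."""
    fields = {"ID": "", "DE": "", "SQ": ""}
    for ln in text.strip().splitlines():
        tag, _, value = ln.partition(" ")
        if tag in fields:
            fields[tag] += value.strip()
    return ExchangeRecord(fields["ID"], fields["DE"], fields["SQ"])


def write_tagged(rec: ExchangeRecord) -> str:
    """Write the common record in the made-up ID/DE/SQ layout."""
    return f"ID {rec.identifier}\nDE {rec.description}\nSQ {rec.sequence}\n//\n"


# One (reader, writer) pair per local format; adding a format adds one pair.
CODECS = {
    "fasta": (read_fasta, write_fasta),
    "tagged": (read_tagged, write_tagged),
}


def convert(text: str, source: str, target: str) -> str:
    """Convert between any two registered formats via the common record."""
    reader, _ = CODECS[source]
    _, writer = CODECS[target]
    return writer(reader(text))
```

Under these assumptions, `convert(entry, "fasta", "tagged")` never needs a direct FASTA-to-tagged converter. With N local formats, this design needs 2N reader/writer functions instead of N(N-1) pairwise converters, and each database keeps its own format.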

Similar articles

1
A standardized format for sequence data exchange.
Protein Seq Data Anal. 1987;1(1):27-39.
2
PROMPT: a protein mapping and comparison tool.
BMC Bioinformatics. 2006 Jul 4;7:331. doi: 10.1186/1471-2105-7-331.
3
MSQT for choosing SNP assays from multiple DNA alignments.
Bioinformatics. 2007 Oct 15;23(20):2784-7. doi: 10.1093/bioinformatics/btm428. Epub 2007 Sep 4.
4
Building a BioChemformatics database.
J Chem Inf Model. 2008 Dec;48(12):2404-13. doi: 10.1021/ci800128b.
5
cPath: open source software for collecting, storing, and querying biological pathways.
BMC Bioinformatics. 2006 Nov 13;7:497. doi: 10.1186/1471-2105-7-497.
6
[Computerization and the importance of information in health system, as in health care resources registry].
Acta Med Croatica. 2005;59(3):251-7.
7
Paper2sequences: retrieval of sequences listed in a publication.
Appl Bioinformatics. 2003;2(2):113-6.
8
Spectra, chromatograms, Metadata: mzML-the standard data format for mass spectrometer output.
Methods Mol Biol. 2011;696:179-203. doi: 10.1007/978-1-60761-987-1_11.
9
GLYDE-an expressive XML standard for the representation of glycan structure.
Carbohydr Res. 2005 Dec 30;340(18):2802-7. doi: 10.1016/j.carres.2005.09.019. Epub 2005 Oct 20.
10
Evaluation of new multimedia formats for cancer communications.
J Med Internet Res. 2003 Jul-Sep;5(3):e16. doi: 10.2196/jmir.5.3.e16. Epub 2003 Aug 29.

Cited by

1
The PIR protein sequence database.
Nucleic Acids Res. 1991 Apr 25;19 Suppl(Suppl):2231-36. doi: 10.1093/nar/19.suppl.2231.
2
The PIR-International Protein Sequence Database.
Nucleic Acids Res. 1992 May 11;20 Suppl(Suppl):2023-6. doi: 10.1093/nar/20.suppl.2023.