Owens John
NCI-Frederick, National Institutes of Health, Frederick, MD, USA.
Methods Mol Biol. 2009;525:569-80, xiv. doi: 10.1007/978-1-59745-554-1_32.
Technological advances in the acquisition of DNA and protein sequence information, and the resulting onrush of data, can quickly overwhelm a scientist unprepared for the volume of information that must be evaluated and carefully dissected to discover its significance. Few laboratories have the luxury of dedicated personnel to organize, analyze, or consistently record a mix of arriving sequence data. A methodology based on a modern relational-database manager is presented that serves both as a natural storage vessel for antibody sequence information and as a conduit for organizing and exploring sequence data and accompanying annotation text. The expertise necessary to implement such a plan is comparable to that required to use an electronic word processor or spreadsheet application. Antibody sequence projects maintained as independent databases are selectively unified by the relational-database manager into larger database families that contribute to local analyses, reports, and interactive HTML pages, or that are exported to facilities dedicated to sophisticated sequence analysis techniques. Database files are transferable among current versions of Microsoft, Macintosh, and UNIX operating systems.
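The idea of keeping antibody sequence projects as separate records that are selectively combined for analysis or export can be illustrated with a minimal sketch. The abstract does not name the database manager or its schema, so everything here is an assumption made for illustration: SQLite (through Python's standard `sqlite3` module) stands in for the relational-database manager, the table and column names (`project`, `antibody_sequence`, `chain`, `residues`, `annotation`) are hypothetical, and the residue strings are short placeholders rather than real records.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical two-table schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE project (
            project_id INTEGER PRIMARY KEY,
            name       TEXT NOT NULL UNIQUE
        );
        CREATE TABLE antibody_sequence (
            sequence_id INTEGER PRIMARY KEY,
            project_id  INTEGER NOT NULL REFERENCES project(project_id),
            chain       TEXT NOT NULL CHECK (chain IN ('heavy', 'light')),
            residues    TEXT NOT NULL,
            annotation  TEXT
        );
        """
    )
    # Placeholder rows for illustration only; not real sequence data.
    conn.executemany(
        "INSERT INTO project (project_id, name) VALUES (?, ?)",
        [(1, "demo_project_a"), (2, "demo_project_b")],
    )
    conn.executemany(
        "INSERT INTO antibody_sequence (project_id, chain, residues, annotation) "
        "VALUES (?, ?, ?, ?)",
        [
            (1, "heavy", "EVQLVESGG", "placeholder heavy chain"),
            (1, "light", "DIQMTQSPS", "placeholder light chain"),
            (2, "heavy", "QVQLQQSGA", None),
        ],
    )
    conn.commit()
    return conn


def sequences_for_projects(conn, project_names):
    """Return rows for the selected projects only, joined with project names."""
    names = list(project_names)
    if not names:
        return []
    placeholders = ", ".join("?" for _ in names)
    query = f"""
        SELECT p.name, s.chain, s.residues, s.annotation
        FROM antibody_sequence AS s
        JOIN project AS p ON p.project_id = s.project_id
        WHERE p.name IN ({placeholders})
        ORDER BY p.name, s.sequence_id
    """
    return conn.execute(query, names).fetchall()


def to_fasta(rows):
    """Format joined rows as FASTA text, e.g. for export to another tool."""
    lines = []
    for name, chain, residues, annotation in rows:
        header = f">{name}|{chain}"
        if annotation:
            header += f"|{annotation}"
        lines.append(header)
        lines.append(residues)
    return "\n".join(lines)


if __name__ == "__main__":
    db = build_demo_db()
    print(to_fasta(sequences_for_projects(db, ["demo_project_a"])))
```

Passing several project names to `sequences_for_projects` mirrors the selective unification of independent projects into a larger family. The FASTA output is only one possible export format and is not taken from the source.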