Seringhaus Michael R, Gerstein Mark B
Department of Molecular Biophysics and Biochemistry, Yale University, 266 Whitney Avenue, New Haven, CT 06520, USA.
BMC Bioinformatics. 2007 Jan 19;8:17. doi: 10.1186/1471-2105-8-17.
Scientific articles are tailored to present information in human-readable aliquots. Although the Internet has revolutionized the way our society thinks about information, the traditional text-based framework of the scientific article remains largely unchanged. This format imposes sharp constraints upon the type and quantity of biological information published today. Academic journals alone cannot capture the findings of modern genome-scale inquiry. Like many other disciplines, molecular biology is a science of facts: information inherently suited to database storage. In the past decade, a proliferation of public and private databases has emerged to house genome sequence, protein structure information, functional genomics data and more; these digital repositories are now a vital component of scientific communication. The next challenge is to integrate this vast and ever-growing body of information with academic journals and other media. To truly integrate scientific information we must modernize academic publishing to exploit the power of the Internet. This means more than online access to articles, hyperlinked references and web-based supplemental data; it means making articles fully computer-readable with intelligent markup and Structured Digital Abstracts. Here, we examine the changing roles of scholarly journals and databases. We present our vision of the optimal information architecture for the biosciences, and close with tangible steps to improve our handling of scientific information today while paving the way for an expansive central index in the future.