Murray-Rust Peter, Mitchell John B O, Rzepa Henry S
Unilever Centre for Molecular Science Informatics, Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge, CB2 1EW, UK.
BMC Bioinformatics. 2005 Jul 18;6:180. doi: 10.1186/1471-2105-6-180.
The current methods of publishing chemical information in bioscience articles are analysed. Using three papers as use cases, it is shown that conventional methods relying on human procedures, including cut-and-paste, are time-consuming and introduce errors. The meaning of chemical terms and the identity of compounds are often ambiguous. Valuable experimental data such as spectra and computational results are almost always omitted. We describe an Open XML architecture, at the proof-of-concept stage, which addresses these concerns. Compounds are identified through explicit connection tables or links to persistent Open resources such as PubChem. It is argued that if publishers adopt these tools and protocols, the quality and quantity of chemical information available to bioscientists will increase, and authors, publishers and readers will find the process cost-effective.
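As an illustration of what identifying a compound through an explicit connection table in XML might look like, the following is a minimal sketch. It is not the architecture described in the article. The element and attribute names (molecule, atomArray, atom, bondArray, bond, atomRefs2) are assumed here, modelled loosely on Chemical Markup Language conventions, and the helper build_connection_table is hypothetical.

```python
import xml.etree.ElementTree as ET


def build_connection_table(mol_id, atoms, bonds):
    """Build an XML connection table (hypothetical helper).

    atoms: list of (atom_id, element_symbol)
    bonds: list of (atom_id_1, atom_id_2, bond_order)
    """
    known_ids = {atom_id for atom_id, _ in atoms}

    # Element names are assumptions modelled on CML-style markup.
    molecule = ET.Element("molecule", {"id": mol_id})

    atom_array = ET.SubElement(molecule, "atomArray")
    for atom_id, element in atoms:
        ET.SubElement(atom_array, "atom", {"id": atom_id, "elementType": element})

    bond_array = ET.SubElement(molecule, "bondArray")
    for first, second, order in bonds:
        # Every bond must refer to atoms declared above; this is what makes
        # the table explicit rather than dependent on a compound name.
        if first not in known_ids or second not in known_ids:
            raise ValueError(f"bond refers to unknown atom: {first} {second}")
        ET.SubElement(
            bond_array,
            "bond",
            {"atomRefs2": f"{first} {second}", "order": str(order)},
        )

    return molecule


# Ethanol, heavy atoms only (hydrogens omitted to keep the sketch short).
ethanol = build_connection_table(
    "ethanol",
    atoms=[("a1", "C"), ("a2", "C"), ("a3", "O")],
    bonds=[("a1", "a2", 1), ("a2", "a3", 1)],
)

xml_text = ET.tostring(ethanol, encoding="unicode")
print(xml_text)
```

Because the atoms and bonds are stated explicitly, a reader or a program can recover the structure without interpreting a possibly ambiguous chemical name. A link to a persistent resource such as PubChem could serve the same purpose, but is not shown here.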