Chemistry, Faculty of Natural and Environmental Sciences, University of Southampton, University Road, Highfield, Southampton SO17 1BJ, UK.
Chem Soc Rev. 2013 Aug 21;42(16):6754-76. doi: 10.1039/c3cs60050e.
Recently, a number of organisations have called for open access to scientific information, and especially to the data obtained from publicly funded research; the Royal Society report and the European Commission press release are particularly notable among these calls. It has long been accepted that building research on the foundations laid by other scientists is both effective and efficient. Regrettably, some disciplines, chemistry being one, have been slow to recognise the value of sharing and have thus been reluctant to curate their data and information in preparation for exchanging it. The very significant increases in both the volume and the complexity of the datasets produced have encouraged the expansion of e-Research and stimulated the development of methodologies for managing, organising, and analysing "big data". We review the evolution of cheminformatics, the amalgam of chemistry, computer science, and information technology, and assess the wider e-Science and e-Research perspective. Chemical information does matter, as do matters of communicating data and collaborating with data. For chemistry, unique identifiers, structure representations, and property descriptors are essential to the activities of sharing and exchange. Open science entails the sharing of more than mere facts: for example, the publication of negative outcomes can facilitate better understanding of which synthetic routes to choose, an aspiration of the Dial-a-Molecule Grand Challenge. The protagonists of open notebook science go even further and exchange their thoughts and plans. We consider the concepts of preservation, curation, provenance, discovery, and access in the context of the research lifecycle, and then focus on the role of metadata, particularly the ontologies on which the emerging chemical Semantic Web will depend. Among our conclusions, we present our choice of the "grand challenges" for the preservation and sharing of chemical information.