BMC Bioinformatics. 2014;15 Suppl 1(Suppl 1):S2. doi: 10.1186/1471-2105-15-S1-S2. Epub 2014 Jan 10.
Many efforts exist to design and implement approaches and tools for data capture, integration and analysis in the life sciences. The challenges are not only the heterogeneity, size and distribution of information sources, but also the danger of producing too many solutions for the same problem. Methodological, technological, infrastructural and social aspects appear to be essential for the development of a new generation of best practices and tools. In this paper, we analyse and discuss these aspects from different perspectives by extending some of the ideas that arose during the NETTAB 2012 Workshop, with particular reference to the European context. First, we stress the relevance of using data and software models for the management and analysis of biological data. Second, we present some of the most relevant community achievements of recent years, which should be taken as a starting point for future efforts in this research domain. Third, we analyse some of the main outstanding issues, challenges and trends. In particular, we discuss the challenges related to the tendency to fund and create large-scale international research infrastructures and public-private partnerships in order to address the complex problems of data-intensive science. We then consider the needs and opportunities of Genomic Computing (the integration, search and display of genomic information at a very specific level, e.g. at the level of a single DNA region). In the current data- and network-driven era, social aspects can become crucial bottlenecks. We therefore discuss how these may best be tackled to unleash the technical abilities for effective data integration and validation efforts. In particular, the apparent lack of incentives for already overwhelmed researchers appears to limit the sharing of information and knowledge with other scientists. We also point out that the bioinformatics market is growing at an unprecedented speed, owing to the impact that powerful new in silico analysis promises to have on better diagnosis, prognosis, drug discovery and treatment, towards personalized medicine. Finally, we discuss an open business model for bioinformatics, which appears to be able to reduce undue duplication of efforts and support the increased reuse of valuable data sets, tools and platforms.