Pipan Veronika, Kunej Tanja
Department of Animal Science, Biotechnical Faculty, University of Ljubljana, Slovenia.
Discoveries (Craiova). 2015 May 19;3(2):e44. doi: 10.15190/d.2015.36.
The number of published reports using next-generation sequencing (NGS) technology in cancer research is increasing. These technologies generate large amounts of data that need to be presented appropriately and made available to other researchers for further use. Our goal was to create a comprehensive database of single nucleotide polymorphisms (SNPs) associated with different types of cancer and to integrate them into our bioinformatics tools. We reviewed more than 200 scientific papers and extracted relevant information on mutations detected by NGS technology. The current version of the database contains more than 100,000 mutations in more than 70 types of cancer. However, our review of NGS studies revealed great variation in how NGS data are presented in the scientific literature, with almost no effort to standardize the data format. NGS results are published in a variety of forms, which hinders the gathering of information. Therefore, we suggest a uniform format for presenting NGS data. Such a format will allow faster database development, easier access, and data sharing between laboratories. The database will be a useful tool for many researchers in the field of cancer research and can serve as a basis for a range of studies, such as genome-wide association studies, microRNA target binding, and cancer biomarker development.