Mangul Serghei
Department of Computer Science, University of California Los Angeles, Los Angeles, CA, U.S.A.
Institute for Quantitative and Computational Biosciences, University of California Los Angeles, Los Angeles, CA, U.S.A.
Emerg Top Life Sci. 2019 Aug 16;3(4):335-341. doi: 10.1042/ETLS20180175.
Recent advances in omics technologies have led to the broad applicability of computational techniques across various domains of life science and medical research. These technologies provide an unprecedented opportunity to collect omics data from hundreds of thousands of individuals and to study gene-disease associations without prior assumptions about trait biology. Despite the many advantages of modern omics technologies, interpreting the big data they produce requires advanced computational algorithms. I outline key challenges that biomedical researchers face when interpreting and integrating big omics data. I discuss the reproducibility of big data analysis in the life sciences and review current practices in reproducible research. Finally, I explain the skills that biomedical researchers need to acquire to analyze big omics data independently.