Amity Institute of Biotechnology, Amity University Uttar Pradesh, Noida, India.
Adv Protein Chem Struct Biol. 2021;127:161-216. doi: 10.1016/bs.apcsb.2021.02.003. Epub 2021 Mar 24.
With the tremendous developments in biological and medical technologies, huge amounts of data are being generated, including genomic data, images in medical databases, and protein sequence data. Analyzing these data with different tools sheds light on the particulars of disease and on the body's responses to it, thus aiding our understanding of human health. Among the most useful of these tools are artificial intelligence and deep learning (DL). The artificial neural networks in DL algorithms help extract useful information from datasets and recognize patterns in these complex datasets. As a part of machine learning, DL therefore helps address the various challenges that arise in protein prediction, protein identification, and protein quantification. Proteomics is the study of proteins, including their structures, features, and properties. As a form of data science, proteomics has contributed substantially to progress in the field of genomics technologies. One of the major techniques used in proteomics studies is mass spectrometry (MS). However, MS can analyze large datasets efficiently only with the help of informatics approaches for data analysis and interpretation, chiefly machine learning and deep learning algorithms. In this chapter, we discuss in detail the applications of deep learning and of various machine learning algorithms in proteomics.