Department of Mathematics and Computer Science, University of Calabria, Rende, Italy.
Methods Mol Biol. 2022;2401:217-237. doi: 10.1007/978-1-0716-1839-4_14.
The aim of microarray data analysis is to discover patterns of gene expression and to identify similar genes. Simply comparing a new gene sequence to known DNA sequences often does not reveal the function of the new gene; thus, more sophisticated techniques are needed. Data mining techniques, and clustering in particular, now play an important role in bioinformatics. Analyzing vast amounts of data can be difficult, so a way to cluster similar data is needed. This chapter illustrates the general data mining approach used in microarray data analysis, which combines clustering, alignment, and similarity, and highlights a novel similarity measure capable of capturing hidden correlations between data.