Li Xiang-Ru, Lu Yu, Zhou Jian-Ming, Wang Yong-Jun
School of Mathematical Sciences, South China Normal University, Guangzhou 510631, China.
Guang Pu Xue Yu Guang Pu Fen Xi. 2011 Sep;31(9):2582-5.
With the wide application of high-quality CCDs in celestial spectral imaging and the implementation of many large sky survey programs (e.g., the Sloan Digital Sky Survey (SDSS), the Two-degree-Field Galaxy Redshift Survey (2dF), the Spectroscopic Survey Telescope (SST), the Large Sky Area Multi-Object Fiber Spectroscopic Telescope (LAMOST) program, and the Large Synoptic Survey Telescope (LSST) program), celestial observational data are accumulating at a torrential rate. To use these data effectively and fully, research on automated processing methods for celestial data is imperative. In the present work, we investigate how to recognize galaxies and quasars from their spectra using the nearest neighbor method. Galaxies and quasars are extragalactic objects; they are far from Earth, and their spectra are usually contaminated by various kinds of noise. Distinguishing these two types of spectra is therefore a typical problem in automatic spectral classification. Furthermore, the nearest neighbor method is one of the most typical, classic, and mature algorithms in pattern recognition and data mining, and it is often used as a benchmark when developing novel algorithms. Regarding practical applicability, it is shown that the recognition rate of the nearest neighbor method (NN) is comparable to the best results reported in the literature for more complicated methods. The advantage of NN is that it does not need to be trained, which is useful for incremental learning and parallel computation in the processing of massive spectral data. In conclusion, the results of this work are helpful for studying the classification of galaxy and quasar spectra.
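The nearest neighbor rule described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the distance metric or any preprocessing, so the Euclidean distance and unit-norm scaling used here are assumptions, and the synthetic "galaxy" and "quasar" templates in the hypothetical helper `make_toy_spectra` are toy shapes chosen only to make the example runnable.

```python
import numpy as np

def normalize(spectra):
    """Scale each spectrum (row) to unit Euclidean norm (an assumed preprocessing step)."""
    spectra = np.asarray(spectra, dtype=float)
    norms = np.linalg.norm(spectra, axis=1, keepdims=True)
    return spectra / np.where(norms == 0.0, 1.0, norms)

def nn_classify(reference, labels, queries):
    """Give each query spectrum the label of its nearest reference spectrum."""
    reference = normalize(reference)
    queries = normalize(queries)
    # Squared Euclidean distances via |q|^2 + |r|^2 - 2 q.r, shape (n_queries, n_reference).
    d2 = ((queries ** 2).sum(axis=1)[:, None]
          + (reference ** 2).sum(axis=1)[None, :]
          - 2.0 * queries @ reference.T)
    return np.asarray(labels)[np.argmin(d2, axis=1)]

def make_toy_spectra(n_per_class, rng, n_bins=200, noise=0.1):
    """Hypothetical toy data: a continuum with an absorption dip vs. a broad emission bump."""
    x = np.linspace(0.0, 1.0, n_bins)
    galaxy = 1.0 - 0.5 * np.exp(-((x - 0.3) / 0.02) ** 2)
    quasar = 1.0 + 2.0 * np.exp(-((x - 0.6) / 0.05) ** 2)
    clean = np.vstack([np.tile(galaxy, (n_per_class, 1)),
                       np.tile(quasar, (n_per_class, 1))])
    labels = np.array(["galaxy"] * n_per_class + ["quasar"] * n_per_class)
    return clean + rng.normal(0.0, noise, clean.shape), labels

rng = np.random.default_rng(0)
ref_spectra, ref_labels = make_toy_spectra(50, rng)    # labeled reference set
new_spectra, true_labels = make_toy_spectra(20, rng)   # spectra to classify
predicted = nn_classify(ref_spectra, ref_labels, new_spectra)
print("toy recognition rate:", np.mean(predicted == true_labels))
```

Because there is no training step, newly labeled spectra can be added to the reference set by simple concatenation (for example with `np.vstack`), and queries can be split across workers that each hold a copy or a shard of the reference set. This is the property the abstract points to for incremental learning and parallel computation.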