Johannesma P, Aertsen A D, Cranen B, Van Erning L
Hear Res. 1981 Nov;5(2-3):123-45. doi: 10.1016/0378-5955(81)90042-3.
Simple stationary sounds can be represented either in temporal form, by displaying the waveform as a function of time, or in spectral form, by displaying intensity and phase as functions of frequency. For complex nonstationary sounds, e.g. animal vocalisations and human speech, a combined spectro-temporal representation is more directly associated with auditory perception. The well-known sonogram, or dynamic power spectrum, has a fixed spectro-temporal resolution and neglects the phase relations between different spectral and temporal sound components. In this paper the complex spectro-temporal intensity density (CoSTID) is presented as a coherent spectro-temporal image of a sound, based on the analytic signal representation. The CoSTID allows an arbitrary form of spectro-temporal resolution and preserves the phase relations between different sound components. Since the CoSTID is a complex function of two variables, it leads naturally to the use of colour images for the spectro-temporal representation of sound: the phonochrome. Phonochromes are shown for different technical and natural sounds. Applications of this technique to the study of phonation and audition and to biomedical signal processing are indicated.
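The idea of displaying a complex function of time and frequency as a colour image can be illustrated with a minimal sketch. The abstract does not give the definition of the CoSTID, so the code below substitutes an ordinary complex short-time Fourier coefficient with a Gaussian window, evaluated at positive frequencies only. The function names (`complex_stft`, `to_rgb`), the window width, and the mapping of phase to hue and magnitude to brightness are all assumptions made for illustration, not the authors' method.

```python
import cmath
import colorsys
import math


def complex_stft(signal, sample_rate, freqs_hz, hop, sigma_s):
    """Complex short-time Fourier coefficients with a Gaussian window.

    Stand-in for a complex spectro-temporal density (NOT the CoSTID itself).
    Returns rows[time_index][freq_index] of complex numbers; phase is
    measured relative to the centre of each analysis window.
    """
    half = int(round(3 * sigma_s * sample_rate))  # truncate window at 3 sigma
    rows = []
    for centre in range(half, len(signal) - half, hop):
        row = []
        for f in freqs_hz:
            acc = 0j
            for k in range(-half, half + 1):
                t = k / sample_rate
                window = math.exp(-0.5 * (t / sigma_s) ** 2)
                acc += signal[centre + k] * window * cmath.exp(-2j * math.pi * f * t)
            row.append(acc)
        rows.append(row)
    return rows


def to_rgb(value, max_magnitude):
    """Assumed colour code: phase -> hue, magnitude -> brightness."""
    hue = ((cmath.phase(value) + math.pi) / (2 * math.pi)) % 1.0
    brightness = min(abs(value) / max_magnitude, 1.0) if max_magnitude > 0 else 0.0
    return colorsys.hsv_to_rgb(hue, 1.0, brightness)


# Hypothetical test sound: a 1 kHz tone sampled at 8 kHz for 50 ms.
sample_rate = 8000
tone = [math.cos(2 * math.pi * 1000 * n / sample_rate) for n in range(400)]
analysis_freqs = [500, 1000, 1500, 2000]

density = complex_stft(tone, sample_rate, analysis_freqs, hop=16, sigma_s=0.002)
peak = max(abs(v) for row in density for v in row)
image = [[to_rgb(v, peak) for v in row] for row in density]  # time x frequency x RGB
```

In this sketch a power-only display would keep just `abs(v)`. Retaining the phase as hue is what makes the picture "coherent" in the sense used above: for the tone, the 1 kHz column stays bright while its hue cycles from frame to frame with the advancing phase.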