Majaj Najib J, Pelli Denis G, Kurshan Peri, Palomares Melanie
Psychology and Neural Science, New York University, 6 Washington Place, New York, NY 10003, USA.
Vision Res. 2002 Apr;42(9):1165-84. doi: 10.1016/s0042-6989(02)00045-7.
How we see is today explained by physical optics and retinal transduction, followed by feature detection, in the cortex, by a bank of parallel independent spatial-frequency-selective channels. It is assumed that the observer uses whichever channels are best for the task at hand. Our current results demand a revision of this framework: Observers are not free to choose which channels they use. We used critical-band masking to characterize the channels mediating identification of broadband signals: letters in a wide range of fonts (Sloan, Bookman, Künstler, Yung), alphabets (Roman and Chinese), and sizes (0.1-55 degrees). We also tested sinewave and squarewave gratings. Masking always revealed a single channel, 1.6 ± 0.7 octaves wide, with a center frequency that depends on letter size and alphabet. We define an alphabet's stroke frequency as the average number of lines crossed by a slice through a letter, divided by the letter width. For sharp-edged (i.e. broadband) signals, we find that stroke frequency completely determines channel frequency, independent of alphabet, font, and size. Moreover, even though observers have multiple channels, they always use the same channel for the same signals, even after hundreds of trials, regardless of whether the noise is low-pass, high-pass, or all-pass. This shows that observers identify letters through a single channel that is selected bottom-up, by the signal, not top-down by the observer. We thought shape would be processed similarly at all sizes. Bandlimited signals conform more to this expectation than do broadband signals. Here, we characterize processing by channel frequency. For sinewave gratings, as expected, channel frequency equals sinewave frequency, $f_{\text{channel}} = f$. For bandpass-filtered letters, channel frequency is proportional to center frequency, $f_{\text{channel}} \propto f_{\text{center}}$ (log-log slope 1), when size is varied and the band (c/letter) is fixed, but channel frequency is less than proportional to center frequency, $f_{\text{channel}} \propto f_{\text{center}}^{2/3}$ (log-log slope 2/3), when the band is varied and size is fixed. Finally, our main result: for sharp-edged (i.e. broadband) letters and squarewaves, channel frequency depends solely on stroke frequency, $f_{\text{channel}} / (10\ \text{c/deg}) = \left( f_{\text{stroke}} / (10\ \text{c/deg}) \right)^{2/3}$, with a log-log slope of 2/3. Thus, large letters (and coarse squarewaves) are identified by their edges; small letters (and fine squarewaves) are identified by their gross strokes.
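The stroke-frequency definition and the 2/3 power law can be illustrated with a minimal Python sketch. It assumes the main result reads f_channel / (10 c/deg) = (f_stroke / (10 c/deg))^(2/3); the function names, parameter names, and the sample crossing counts are hypothetical and do not come from the paper.

```python
def stroke_frequency(crossings_per_slice, letter_width_deg):
    """Average number of lines crossed by a slice, divided by letter width.

    crossings_per_slice: hypothetical list of stroke counts, one per slice.
    letter_width_deg: letter width in degrees of visual angle.
    Returns stroke frequency in strokes per degree (treated as c/deg).
    """
    mean_crossings = sum(crossings_per_slice) / len(crossings_per_slice)
    return mean_crossings / letter_width_deg


def channel_frequency(stroke_freq_cpd, ref_cpd=10.0, exponent=2.0 / 3.0):
    """Assumed power law: f_channel / ref = (f_stroke / ref) ** exponent."""
    return ref_cpd * (stroke_freq_cpd / ref_cpd) ** exponent


if __name__ == "__main__":
    # Made-up slice counts for a letter 1 degree wide (illustration only).
    f_stroke = stroke_frequency([1, 2, 2, 3], letter_width_deg=1.0)
    print(f_stroke, channel_frequency(f_stroke))
```

Under this reading the two frequencies coincide at 10 c/deg. Below that point (large letters) the channel frequency lies above the stroke frequency, which is consistent with identification by edges. Above it (small letters) the channel frequency lies below the stroke frequency, which is consistent with identification by gross strokes.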