Bi Wenyan, Jin Peiran, Nienborg Hendrikje, Xiao Bei
Department of Computer Science, American University, Washington, DC, USA.
Department of Physics, Georgetown University, Washington, DC, USA.
J Vis. 2018 May 1;18(5):12. doi: 10.1167/18.5.12.
Humans can visually estimate the mechanical properties of deformable objects (e.g., cloth stiffness). While much of the recent work on material perception has focused on static image cues (e.g., textures and shape), little is known about whether humans can integrate information over time to make a judgment. Here we investigated the effect of spatiotemporal information across multiple frames (multiframe motion) on estimating the bending stiffness of cloth. Using high-fidelity cloth animations, we first examined how the perceived bending stiffness changed as a function of the physical bending stiffness defined in the simulation model. Using maximum-likelihood difference-scaling methods, we found that the perceived stiffness and physical bending stiffness were highly correlated. A second experiment, in which we scrambled the frame sequences, diminished this correlation, suggesting that multiframe motion plays an important role. To provide further evidence for this finding, we extracted dense motion trajectories from the videos across 15 consecutive frames and used the trajectory descriptors to train a machine-learning model on the measured perceptual scales. The model can predict human perceptual scales in new videos with varied winds, optical properties of cloth, and scene setups. When the correct multiframe motion was removed (using either scrambled videos or two-frame optical flow to train the model), the predictions significantly worsened. Our findings demonstrate that multiframe motion information is important for both humans and machines when estimating mechanical properties. In addition, we show that dense motion trajectories are effective features for building a successful automatic cloth-estimation system.
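The pipeline the abstract describes (tracking dense points across consecutive frames, summarizing the trajectories into descriptors, and regressing descriptors onto stiffness scales) can be sketched in miniature. Everything below is an illustrative assumption rather than the authors' implementation: the grid step, the nearest-neighbor flow lookup, the mean/std descriptor, the synthetic "softer cloth moves more" flow fields, and the ridge regressor are all stand-ins.

```python
import numpy as np

def track_dense_trajectories(flows, grid_step=4):
    """Advect a dense grid of points through a stack of per-frame optical-flow
    fields (T x H x W x 2), mimicking multiframe trajectory extraction.
    Uses nearest-neighbor flow lookup; a real tracker would interpolate."""
    T, H, W, _ = flows.shape
    ys, xs = np.mgrid[0:H:grid_step, 0:W:grid_step]
    pts = np.stack([xs.ravel(), ys.ravel()], axis=1).astype(float)  # (N, 2) as (x, y)
    traj = [pts.copy()]
    for t in range(T):
        ix = np.clip(np.round(pts[:, 0]).astype(int), 0, W - 1)
        iy = np.clip(np.round(pts[:, 1]).astype(int), 0, H - 1)
        pts = pts + flows[t, iy, ix]  # move each point by its local flow vector
        traj.append(pts.copy())
    return np.stack(traj)  # (T + 1, N, 2)

def trajectory_descriptor(traj):
    """Summarize trajectories by the mean and std of per-frame displacements
    (a deliberately simple stand-in for real trajectory descriptors)."""
    disp = np.diff(traj, axis=0)  # (T, N, 2) per-frame displacements
    return np.concatenate([disp.mean(axis=(0, 1)), disp.std(axis=(0, 1))])

# Synthetic demo: softer cloth (lower stiffness) moves more, so the flow
# magnitude here scales like 1 / stiffness; 15 flow fields stand in for the
# 15 consecutive frames mentioned in the abstract.
rng = np.random.default_rng(0)
stiffness = np.linspace(0.1, 1.0, 8)
rows = []
for k in stiffness:
    flows = rng.normal(scale=1.0 / k, size=(15, 32, 32, 2))
    traj = track_dense_trajectories(flows)
    rows.append(trajectory_descriptor(traj))

# Ridge regression from descriptors to the scale values, an illustrative
# stand-in for the paper's learned model (which is trained on perceptual
# scales rather than the physical parameter used here).
X = np.column_stack([np.ones(len(stiffness)), np.array(rows)])
y = stiffness
lam = 1e-3
w = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)
pred = X @ w
```

On this toy data, the displacement-spread feature falls monotonically with stiffness, which is what lets even a linear model recover the ordering; scrambling the frame order (as in the paper's control) would leave two-frame statistics intact but destroy any descriptor that depends on coherent multiframe paths.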