Montpellier SupAgro-INRA, UMR MISTEA 729, Bâtiment 29, 2 place Pierre Viala, 34060 Montpellier Cedex 2, France.
Stat Med. 2013 Apr 15;32(8):1376-82. doi: 10.1002/sim.5589. Epub 2012 Aug 30.
In this paper, we study an unsupervised clustering problem. The originality of this problem lies in the data, which consist of the positions of five separate X-ray beams on a circle. Radiation therapists positioned the five X-ray beam 'projectors' around each patient on a predefined circle. However, the positioning is similar within certain groups of patients, and we aim to describe these similarities in order to create pre-adjustment settings that could save time during X-ray positioning. We therefore performed unsupervised clustering of the observed X-ray positions. Because the data for each patient consist of five angle measurements, Euclidean distances are not appropriate. Furthermore, we cannot use the k-means algorithm, which is usually used to minimize the corresponding distortion, because cluster centers cannot be computed. We present here solutions to these problems. First, we define a suitable distance on the circle. Then, we adapt an algorithm based on simulated annealing to minimize the distortion. This algorithm is shown to be theoretically convergent. Finally, we present results on simulated and real data.
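The two ingredients named in the abstract, a distance on the circle and a simulated-annealing search that minimizes distortion without computing cluster means, can be illustrated with a minimal sketch. This is not the authors' algorithm. It rests on several assumptions that the abstract does not state: angles are given in degrees, the five beams of two patients are compared index by index, the patient-level distance is a sum of squared arc lengths, cluster representatives are restricted to observed patients (a medoid-style stand-in for centers), and the temperature follows a geometric cooling schedule. All function and parameter names are hypothetical.

```python
import math
import random


def arc_distance(a, b):
    """Length in degrees of the shortest arc between two angles on the circle."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def patient_distance(p, q):
    """Assumed patient-level distance: sum of squared arc distances, beam by beam."""
    return sum(arc_distance(a, b) ** 2 for a, b in zip(p, q))


def distortion(data, centers):
    """Sum over patients of the distance to the nearest representative.

    `centers` holds indices into `data`, so no mean on the circle is needed.
    """
    return sum(min(patient_distance(x, data[c]) for c in centers) for x in data)


def anneal_clusters(data, k, steps=2000, t0=1000.0, cooling=0.995, seed=0):
    """Search for k representatives with low distortion by simulated annealing."""
    rng = random.Random(seed)
    centers = rng.sample(range(len(data)), k)
    cost = distortion(data, centers)
    best, best_cost = list(centers), cost
    t = t0
    for _ in range(steps):
        t *= cooling  # geometric cooling (an assumption of this sketch)
        new = rng.randrange(len(data))
        if new in centers:
            continue  # proposal would duplicate a representative
        cand = list(centers)
        cand[rng.randrange(k)] = new  # swap one representative
        cand_cost = distortion(data, cand)
        delta = cand_cost - cost
        # Always accept improvements; accept worse moves with probability exp(-delta/t).
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            centers, cost = cand, cand_cost
            if cost < best_cost:
                best, best_cost = list(centers), cost
    return best, best_cost
```

As a quick check of the wrap-around behaviour, `arc_distance(350, 10)` returns `20.0`, whereas the plain difference of the two angles is 340. Calling `anneal_clusters(data, k)` on a list of five-angle tuples returns the indices of the selected representative patients together with their distortion. Convergence guarantees for simulated annealing generally depend on the cooling schedule, so the geometric schedule used here should be read as a practical shortcut rather than as the schedule analyzed in the paper.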