Rao A Ravishankar, Cecchi Guillermo A, Magnasco Marcelo
IBM T.J. Watson Research Center, Yorktown Heights, NY 10598, USA.
BMC Cell Biol. 2007 Jul 10;8 Suppl 1(Suppl 1):S9. doi: 10.1186/1471-2121-8-S1-S9.
Processing images acquired through microscopy is challenging because the datasets are large (several gigabytes) and fast turnaround is required. Significantly increasing the throughput of the image processing stage could have a major impact on microscopy applications.
We present a high performance computing (HPC) solution to this problem. This involves decomposing the spatial 3D image into segments, assigning each segment to its own processor, and matching the layout to the 3D torus architecture of the IBM Blue Gene/L machine. Communication between segments is restricted to the nearest neighbors. When running on a 2 GHz Intel CPU, the task of 3D median filtering on a typical 256 megabyte dataset takes two and a half hours, whereas by using 1024 nodes of Blue Gene, this task can be performed in 18.8 seconds, a 478x speedup.
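The decomposition idea can be illustrated with a minimal single-process sketch in Python. It is not the authors' Blue Gene/L code, which is not shown here. It only simulates the scheme: the volume is cut into a 3D grid of blocks, each block receives a one-voxel halo from its nearest neighbors, and each block is filtered independently. The function names, the 3x3x3 window, the `grid` parameter, and the edge-replication border handling are all assumptions made for illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def median3d_valid(block, radius=1):
    """Median over (2r+1)^3 windows; output shrinks by r on every side."""
    k = 2 * radius + 1
    windows = sliding_window_view(block, (k, k, k))
    return np.median(windows, axis=(-3, -2, -1))


def median3d_serial(volume, radius=1):
    """Reference: filter the whole volume at once (border replicated, an assumption)."""
    padded = np.pad(volume, radius, mode="edge")
    return median3d_valid(padded, radius)


def median3d_decomposed(volume, grid=(2, 2, 2), radius=1):
    """Filter block by block, as if each block lived on its own processor.

    Each block is read together with a halo of `radius` voxels, which stands in
    for the data a processor would receive from its nearest neighbors.
    """
    padded = np.pad(volume, radius, mode="edge")
    out = np.empty(volume.shape, dtype=np.float64)
    # Block boundaries along each axis (hypothetical even split).
    bounds = [np.linspace(0, n, g + 1).astype(int)
              for n, g in zip(volume.shape, grid)]
    for i in range(grid[0]):
        for j in range(grid[1]):
            for k in range(grid[2]):
                x0, x1 = bounds[0][i], bounds[0][i + 1]
                y0, y1 = bounds[1][j], bounds[1][j + 1]
                z0, z1 = bounds[2][k], bounds[2][k + 1]
                # Block plus halo, expressed in padded coordinates.
                sub = padded[x0:x1 + 2 * radius,
                             y0:y1 + 2 * radius,
                             z0:z1 + 2 * radius]
                out[x0:x1, y0:y1, z0:z1] = median3d_valid(sub, radius)
    return out


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    vol = rng.integers(0, 256, size=(8, 6, 10))
    same = np.array_equal(median3d_serial(vol),
                          median3d_decomposed(vol, grid=(2, 3, 2)))
    print("decomposed result matches serial result:", same)
```

Because a median window of radius r only reads voxels within r of its center, a halo of width r is all a block needs from its neighbors, which is why the blocks can be processed independently once the halos are exchanged. In a distributed setting the halo slices would be sent between neighboring processes rather than read from a shared padded array.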
Our parallel solution dramatically improves the performance of image processing, feature extraction and 3D reconstruction tasks. This increased throughput permits biologists to conduct unprecedented large scale experiments with massive datasets.