We set out to design and implement a hardware-based computational electromagnetics (CEM) accelerator that could replace large PC clusters with a single desktop workstation. To accomplish this, we combined our Celerity™ acceleration card with a host PC and a CAD interface. The front-end software sends the simulation parameters, such as the mesh size and the number of timesteps to execute, to the Celerity™ board over the PCI bus. The finite-difference time-domain (FDTD) accelerator then updates the fields, periodically sending results back to the host computer for post-processing and visualization. The accelerator board carries up to 16 GB of DDR SDRAM and 36 Mb of DDR SRAM, along with a Xilinx Virtex-II 8000 FPGA and a PLX 9656 external PCI controller. Our Celerity™ platform easily surpassed the performance of 50-node PC clusters.
Celerity™ Accelerator Card
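The field update that the accelerator performs can be illustrated with a minimal software sketch. The example below is not the Celerity™ design or its host interface; it is a one-dimensional FDTD leapfrog loop in normalized units, assuming a vacuum mesh, a Courant number of 0.5, and a Gaussian source. The names `run_fdtd_1d`, `courant`, and `source_cell` are hypothetical and chosen only for illustration. The two inputs mirror the parameters mentioned above: the mesh size and the number of timesteps.

```python
import math

def run_fdtd_1d(n_cells, n_steps, courant=0.5, source_cell=None):
    """Leapfrog FDTD update on a 1D vacuum mesh (normalized units).

    Illustrative only: all names and defaults here are assumptions,
    not part of the Celerity platform.
    """
    if source_cell is None:
        source_cell = n_cells // 2
    ez = [0.0] * n_cells        # electric field, sampled at integer cells
    hy = [0.0] * (n_cells - 1)  # magnetic field, sampled at half cells

    for step in range(n_steps):
        # Update H from the spatial difference of E.
        for i in range(n_cells - 1):
            hy[i] += courant * (ez[i + 1] - ez[i])
        # Update E from the spatial difference of H.
        # The two end cells are never updated, so they act as
        # perfectly conducting walls (E = 0).
        for i in range(1, n_cells - 1):
            ez[i] += courant * (hy[i] - hy[i - 1])
        # Inject a Gaussian pulse as an additive (soft) source.
        ez[source_cell] += math.exp(-((step - 30) / 10.0) ** 2)
    return ez

# Example: 200-cell mesh, 100 timesteps.
fields = run_fdtd_1d(200, 100)
print(len(fields))
```

Each cell update depends only on its nearest neighbours, which is the property that makes this kind of loop a natural candidate for parallel hardware.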
The speckle imaging algorithm, developed at Lawrence Livermore National Laboratory, is designed to compensate for atmospheric disturbances that arise in long-range imaging applications. To do this, the algorithm combines information from several images taken a short time apart. These can be a series of short-exposure still shots from a conventional camera or, more commonly, a sequence of consecutive video frames. The information is processed in the frequency domain, where magnitude and phase are “averaged” independently and then recombined in real space. As a result, the algorithm produces a single corrected image with quality near the diffraction limit.
Image Enhancement Using the Speckle Algorithm
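The idea of averaging magnitude and phase separately in the frequency domain can be shown with a deliberately simplified sketch. This is not the Lawrence Livermore algorithm: the frames are one-dimensional signals, the transform is a naive DFT, and the phase is estimated by averaging unit phasors across frames, which stands in for the bispectral phase recovery used in real speckle processing. The function names `dft`, `idft`, and `combine_frames` are hypothetical.

```python
import cmath

def dft(x):
    """Naive discrete Fourier transform of a short sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spectrum):
    """Naive inverse discrete Fourier transform."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n)
                for k in range(n)) / n
            for t in range(n)]

def combine_frames(frames):
    """Average Fourier magnitude and phase independently, then recombine.

    Assumption: phase is averaged via unit phasors, a simplification
    that replaces the bispectral phase estimate of real speckle imaging.
    """
    spectra = [dft(frame) for frame in frames]
    n = len(frames[0])
    combined = []
    for k in range(n):
        # Magnitude: plain mean of the per-frame magnitudes.
        mag = sum(abs(s[k]) for s in spectra) / len(spectra)
        # Phase: angle of the summed unit phasors (skip near-zero bins).
        phasor = sum(s[k] / abs(s[k]) for s in spectra if abs(s[k]) > 1e-12)
        phase = cmath.phase(phasor) if abs(phasor) > 1e-12 else 0.0
        combined.append(cmath.rect(mag, phase))
    # Recombine in real space.
    return [value.real for value in idft(combined)]

frame = [0.0, 1.0, 2.0, 1.0, 0.0, 0.0, 0.0, 0.0]
result = combine_frames([frame, frame, frame])
```

With identical input frames the output reproduces the input, which is a useful sanity check before feeding in frames that differ from one another.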
Qualitatively, the complexity of speckle processing can be attributed to two main factors: the high computational load and the large memory requirements. The computational load is a direct consequence of the large number of pixels in each image, which must be transformed into the frequency domain (via an FFT) and then into the bispectral domain. These transformations account for the majority of the computation time. The other major factor in the performance of this algorithm is memory access. Large amounts of memory are required to hold the bispectral buffer, which is used as a “sliding window” for computing the bispectral average. Software implementations usually sacrifice computational performance to reduce the memory requirements, so that the data set fits in PC memory.
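A rough calculation shows why the bispectral buffer dominates memory. The bispectrum is conventionally defined as B(u, v) = I(u) · I(v) · conj(I(u + v)), where I is the image spectrum, so it is indexed by a pair of frequencies. The sketch below estimates the size of a complete bispectrum under stated assumptions: every frequency pair is stored, and each entry is an 8-byte complex value. The function name `full_bispectrum_bytes` and the 256 × 256 tile size are hypothetical, and practical implementations typically store only a subset of the pairs.

```python
def full_bispectrum_bytes(width, height, bytes_per_complex=8):
    """Bytes needed to store every (u, v) pair of a full bispectrum.

    Assumption: no subsetting and no symmetry savings; one complex
    value of `bytes_per_complex` bytes per frequency pair.
    """
    n_frequencies = width * height
    return n_frequencies * n_frequencies * bytes_per_complex

size = full_bispectrum_bytes(256, 256)
print(size / 2**30)  # 32.0 (GiB) for a single 256 x 256 tile
```

The quadratic growth in the number of frequencies is what forces software implementations to trade computation for memory.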
The acceleration of the speckle algorithm is an ongoing process. The ultimate goal of this project is the development of a real-time compensation engine capable of processing high-definition 720p signals at 60 frames per second. To date, we have accelerated the software implementation by 40x.
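To put the real-time target in perspective, the short calculation below works out the pixel rate and per-frame time budget for 720p at 60 frames per second, taking 720p to mean 1280 × 720 pixels. The function names are hypothetical, and the figures describe only the input stream, not the cost of the speckle processing itself.

```python
def pixel_rate(width, height, fps):
    """Pixels per second arriving from the video source."""
    return width * height * fps

def frame_budget_ms(fps):
    """Time available to process one frame, in milliseconds."""
    return 1000.0 / fps

print(pixel_rate(1280, 720, 60))      # 55296000
print(round(frame_budget_ms(60), 2))  # 16.67
```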