Fredrick Mwandia (2000)

An Adaptive Software Library for Fast Fourier Transforms on Real Input Data

Master's thesis, Computer Science, University of Houston, 4800 Calhoun Road, Houston, TX 77204.

In this thesis, we present an efficient methodology for architecture-adaptable generation of code that can be constructed from components with well-defined algebraic rules of composition. The code generator is written in C and generates a library of codelets in C. The code generator is shown to be flexible and extensible, and the entire library can be generated in a matter of seconds. We also present a software library for the Fast Fourier Transform (FFT) on real input data and show the results of its adaptivity to different hardware architectures. The library consists of a number of composable blocks of code called codelets, each computing a part of the transform. The actual FFT algorithm used by the code is determined at run time by selecting the fastest strategy among all possible strategies, given the available codelets, for a given transform size and distribution of data. The library takes into account the symmetries that occur when real input data is used and takes advantage of them to produce more efficient codelets. We have evaluated the performance of the library on the IBM Power3, SGI Origin 2000, and Intel Pentium III systems. The library is shown to be portable, adaptive, and flexible. The optimization of the FFT library is performed on two levels. The low-level optimization involves generation of highly efficient codelets for small transform sizes. These codelets are optimized for the number of arithmetic operations and memory accesses on each platform and are shown to perform well on all architectures. The high-level optimization involves selection of an execution plan for different transform sizes. Large transforms are created by using a computationally efficient combination of codelets from the library. Performance data for both levels is presented.
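The symmetry mentioned in the abstract can be illustrated with a minimal sketch. For a real input sequence of length N, the transform satisfies X[N-k] = conj(X[k]), so only bins 0 through N/2 need to be computed and the rest can be recovered by conjugation. The code below is not taken from the thesis or its library: it uses a naive O(N^2) transform rather than generated codelets, and the function names `real_dft_half` and `expand_half_spectrum` are hypothetical, chosen only for this illustration.

```python
import cmath


def real_dft_half(x):
    """Naive O(N^2) DFT of a real sequence, computing only bins 0..N//2.

    Illustrative only: a real FFT library would use a fast algorithm,
    but the set of bins that must be computed is the same.
    """
    n = len(x)
    return [
        sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n // 2 + 1)
    ]


def expand_half_spectrum(half, n):
    """Rebuild all n bins from the half spectrum via X[n-k] = conj(X[k])."""
    full = list(half)
    for k in range(n // 2 + 1, n):
        # Bin k mirrors bin n-k, which lies in the computed half.
        full.append(half[n - k].conjugate())
    return full
```

Calling `real_dft_half` on a length-8 real sequence returns 5 complex values instead of 8, and `expand_half_spectrum` reconstructs the remaining 3 without any further arithmetic on the input. This roughly halves both the work and the storage, which is the kind of saving a real-input library can exploit.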
