Empirical Auto-tuning Code Generator for FFT and Trigonometric Transforms
2007 (English)Conference paper (Refereed)
We present an automatic, empirically tuned code genenrator for Real/Complex FFT and Trigonometric Transforms. The code generator is part of an adaptive and portable FFT computation framework - UHFFT. Performance portability over varying architectures is achieved by generating highly optimized set of straight line C codelets (micro-kernel) that adapt to the microprocessor architecture. The tuning is performed by generating several variants of same size codelet with different combinations of optimization parameters. The variants are iteratively compiled and evaluated to find the best implementation on a given platform. Apart from minimizing the operation count, the code generator optimizes for access pattern, register blocking, instruction schedule and structure of arrays. We present details of the optimizations conducted at several stages and the performance gain at each of those levels. We conclude the paper with discussion of the overall performance improvements due to this aggressive approach to generating optimized FFT kernels.
Place, publisher, year, edition, pages
Computer and Information Science
IdentifiersURN: urn:nbn:se:kth:diva-63149OAI: oai:DiVA.org:kth-63149DiVA: diva2:481701
5th Workshop on Optimizations for DSP and Embedded Systems, 2007 International Symposium on Code Generation and Optimization. San Jose, CA, USA. March 11, 2007
QC 201205242012-01-222012-01-222012-05-24Bibliographically approved