High performance 3D-FFT implementation
2013 (English). In: Circuits and Systems (ISCAS), 2013 IEEE International Symposium on. IEEE, 2013, p. 2227-2230. Conference paper (Refereed)
3D-FFT is a data- and compute-intensive kernel encountered in many applications. We report a high-performance design and implementation of 3D-FFT on a CGRA that supports partial reconfiguration. The hardware/software multi-clock design uses dynamic reconfiguration to reduce the required communication bandwidth and achieves a sustained throughput of 40 GOPS at a word size of 48 bits. Performance metrics, including overheads and speed relative to software, are presented in the paper for implementations of up to 256-point 3D-FFT.
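For readers unfamiliar with the kernel, the following is a minimal software sketch of what a 3D-FFT computes, using the standard decomposition into passes of 1D FFTs along each axis. It is not the paper's CGRA design: the function names `fft1d` and `fft3d`, the cubic nested-list layout, and the radix-2 recursion are assumptions made purely for illustration.

```python
import cmath


def fft1d(x):
    """Recursive radix-2 decimation-in-time FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft1d(x[0::2])
    odd = fft1d(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]  # twiddle factor times odd term
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def fft3d(a):
    """3D FFT of a cubic nested list a[x][y][z] (illustrative layout, not the paper's)."""
    n = len(a)
    # Pass 1: 1D FFTs along z (builds new lists, so the input is left untouched).
    a = [[fft1d(a[x][y]) for y in range(n)] for x in range(n)]
    # Pass 2: 1D FFTs along y.
    for x in range(n):
        for z in range(n):
            col = fft1d([a[x][y][z] for y in range(n)])
            for y in range(n):
                a[x][y][z] = col[y]
    # Pass 3: 1D FFTs along x.
    for y in range(n):
        for z in range(n):
            col = fft1d([a[x][y][z] for x in range(n)])
            for x in range(n):
                a[x][y][z] = col[x]
    return a
```

For an N x N x N input, this decomposition performs 3N^2 one-dimensional transforms of length N, and the second and third passes read strided data, which is one reason the kernel is demanding on memory and communication bandwidth.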
Place, publisher, year, edition, pages
IEEE, 2013. p. 2227-2230
IEEE International Symposium on Circuits and Systems, ISSN 0271-4310
Clock design, Communication bandwidth, Dynamic re-configuration, High-performance design, Partial reconfiguration, Performance metrics
Other Electrical Engineering, Electronic Engineering, Information Engineering
Identifiers
URN: urn:nbn:se:kth:diva-132297
DOI: 10.1109/ISCAS.2013.6572319
ScopusID: 2-s2.0-84883372638
ISBN: 978-146735760-9
OAI: oai:DiVA.org:kth-132297
DiVA: diva2:659485
2013 IEEE International Symposium on Circuits and Systems, ISCAS 2013; Beijing; China; 19 May 2013 through 23 May 2013
QC 20131105; 2013-10-25; 2013-10-25; 2013-11-05; Bibliographically approved