Performance and accuracy benchmarking for FFT (Fast Fourier Transform), RFFT(real valued fft) and FHT (Fast Hartley Transform) algorithms.

Source code for many of the FFT's and FHT's described below.

Under construction

This page is currently under construction as I'm trying to bring the information more up to date. If you don't want to struggle through this incomplete page, you can see my 1993 postingthat summarized the performance and accuracy of common FFTs at that time.

As shown on Matteo Frigo and Steven Johnson's excellent FFTW page, today, for complex-valued fft's there are a number of algorithms that beat this one (except for the kinda special case of N=2^11 on a power mac;-)).
Their page has a much more comprehensive survey of various general-purpose-processors and FFT algorithms designed for them; so I'll just touch on them here and try to focus on stuff they haven't put on the net yet.

For a general purpose systems(PC's and workstations) and general applications I would recommend FFTW .
For embedded systems with severe memory(no room for trig tables) and/or processing(no trig functions) constraints this code may be better.
Obviously for DSPs, use the (almost certainly assembly language) FFT recommended by the DSP manufacturer. For example, Butterfly Dsp has a benchmark of DSP FFTs; and Texas Instruments Has some interesting Parallel DSP FFTS, and a cool VLIW architecture that can do a 1024 point FFT in 66usec ! Motorola's 56000 family can do the inner loop of a radix2 fft with 6 instructions; and Zorans needs only 4. Not even FFTW will beat those.
Fast Hartley Transforms are another area without too much press out on the net; and since my fft was based on one, I figured I might write something up about them here when I find time.
And Trig value generation is an often overlooked area of FFT programs, which can have drastic impacts on both the speed and accuracy of the results.

Performance summary

[under construction - units and processor info (some early 90's sparc) will be included when I look them up.]

-----------------
            Description of FFT algorithms compared.

numrec
    A straightforward radix-2 FFT
    Source: "Numerical Recipies in C";

duhamel
    "A Duhamel-Hollman split-radix dif fft"
     by: Dave Edelblute, edelblut@cod.nosc.mil, 05 Jan 1993 [...]
     Ref: Electronics Letters, Jan. 5, 1984"
    Source: periodic posting on comp.benchmarks

fourea
    "adapted from subroutine FOUREA listed in Programs for Digital
     Signal Processing [...] edited by Digital Signal Processing
     Committee [...] IEEE Acoustics Speech and Signal Processing
     Committee [...] Chapter 1 Section 1.1 Page 1.1-4,5"
    Source: comp.dsp 02/04/93; by pe93009@black.ox.ac.uk (Ted Wong)

singleton
    "Singleton's mixed radix algorithm"
    Source: From edu/math/msdos/modelling on wuarchive.wustl.edu
            Translated to C by: Javier Soley, FJSOLEY@UCRVM2.BITNET

sorensen
    my C translation of the code Henrik Sorensen posted here.
    Source: "Real-valued fast Fourier transform algorithms", IEEE
            Tran. ASSP, Vol. ASSP-35, No. 6, pp. 849-864, June 1987
            (My C translation included in the following posting)

mayer-*
    All my routines are wrappers around a fast Hartley transform.
    Source: the following posting.

---------------------------------------------------------------------------

Trig algorithms

A few people asked me via email about the trig algorithm I used: I'll try to look up the reference to the trig algorithm I was using; but I think I lost it somewhere in storage. It works by manipulating a log2(n) sized table of trig values. Each newly generated value is for an angle which lies exactly halfway between two angles whose trig values are known. I think the identity is something like

    sin( (w+v)/2 ) = (sin(w) + sin(v)) * sec(v-w) / 2

although I might be wrong about the argument inside the sec() function. You can start with a table of cos(pi),cos(pi/2),cos(pi/4), cos(pi/8)...., and everytime you generate a new value you can overwrite one which you no longer need (to avoid excessive memory usage). This takes less floating point math than any of the other trig generating algorithms I know of; but requires a bunch of integer math to determine which value can be overwritten. It is probably only faster on systems which emulate floating point math; but it is considerably more stable than other common methods.

I'll try to find and mention the reference in a future posting.

--------------------------------------------------------------------------- Accuracy and Noise of FFT algorithms.

An earlier discussion I had on comp.benchmarks people were discussing possible tradeoffs between speed and accuracy. The amount of noise generated by various FFT algorithms varies by a factor of about 1e10! However, as shown in the table below, this is not all that closely correlated to the speed of the algorithm.

In reality, noise added by any of these FFTs probably only matters if you are doing zillions of forward and reverse transforms; but some will argue that a good FFT should have very low noise.

One significant component of errors in FFTs is the trig algorithm used. Avoiding recursively successive calculating trig functions can reduce accumulated errors by about a factor of 5. Unfortunatelly C library functions are too slow for this purpose. See my code in the next posting for a number of compromise trig algorithms.

 algo        trig      N       time      ssq-errors
fourea-real ;c_lib    65536 :  3.48      2.19e-12
fourea      ;c_lib    65536 :  3.45      2.64e-11
duhamel     ;c_lib    65536 :  2.53      4.63e-21
numrec      ;numrec   65536 :  2.35      9.13e-14
mayer-rad4  ;c_lib    65536 :  2.08      1.79e-21   *(all library functions)
sing        ;sing     65536 :  1.68      5.22e-19
mayer-rad4  ;Buneman2 65536 :  1.52      2.46e-21   *(half-angle algorithm)
mayer-rad4  ;Simple   65536 :  1.51      8.16e-21   *(recursively generated)
nrec_real   ;numrec   65536 :  1.04      5.39e-19           
sing_real   ;sing     65536 :  0.987     ????????
sorensen-r  ;n-rec    65536 :  0.855     1.83e-20
mayer-real  ;Buneman2 65536 :  0.765     1.78e-21
mayer-fht   ;Buneman2 65536 :  0.707     2.46e-21

Note that the wierd trig algorithm reduces errors by almost a factor of 4 with negligable performance cost.

----------------------------------------------------------------------------
Ron Mayer
rmayer@digitalmarket.com

Source code for FFT's and FHT's described above.
Various alternatives for FHT and FFT algorithms, including those benchmarked on FFTW's page.

This page hosted byGeoCities Get your ownFree Home Page