VF_FFT  VD_FFT  VE_FFT 
VFb_FFT  VDb_FFT  VEb_FFT 
VF_FFTtoC  VD_FFTtoC  VE_FFTtoC 
VFb_FFTtoC  VDb_FFTtoC  VEb_FFTtoC 
VCF_FFT  VCD_FFT  VCE_FFT 
VCFb_FFT  VCDb_FFT  VCEb_FFT 

Function  Fast Fourier transform 

Syntax C/C++  #include <VFstd.h>
void VF_FFT( fVector Y, fVector X, ui size, int dir );
void VCF_FFT( cfVector Y, cfVector X, ui size, int dir );
void VF_FFTtoC( cfVector Y, fVector X, ui size );
void VFb_FFT( fVector Y, fVector X, ui size, int dir; fVector Buf );
void VCFb_FFT( cfVector Y, cfVector X, ui size, int dir; cfVector Buf );
void VFb_FFTtoC( cfVector Y, fVector X, ui size; cfVector Buf ); 
C++ VecObj  #include <OptiVec.h>
void vector<T>::FFT( const vector<T>& X, int dir=1 );
void vector<complex<T>>::FFT( const vector<complex<T>>& X, int dir=1 );
void vector<complex<T>>::FFTtoC( const vector<T>& X );
void vector<T>::b_FFT( const vector<T>& X, int dir, vector<T>& Buf );
void vector<complex<T>>::b_FFT( const vector<complex<T>>& X, int dir, vector<complex<T>>& Buf );
void vector<complex<T>>::b_FFTtoC( const vector<T>& X, vector<complex<T>>& Buf ); 
Pascal/Delphi  uses VFstd;
procedure VF_FFT( Y, X:fVector; size:UIntSize; dir:Integer );
procedure VCF_FFT( Y, X:cfVector; size:UIntSize; dir:Integer );
procedure VF_FFTtoC( Y:cfVector; X:fVector; size:UIntSize );
procedure VFb_FFT( Y, X:fVector; size:UIntSize; dir:Integer; Buf: fVector );
procedure VCFb_FFT( Y, X:cfVector; size:UIntSize; dir:Integer; Buf: cfVector );
procedure VFb_FFTtoC( Y:cfVector; X:fVector; size:UIntSize; Buf: cfVector ); 

CUDA function C/C++  #include <cudaVFstd.h>
int cudaVF_FFT( fVector d_Y, fVector d_X, ui size, int dir );
int cudaVCF_FFT( cfVector d_Y, cfVector d_X, ui size, int dir );
int cudaVF_FFTtoC( cfVector d_Y, fVector d_X, ui size );
void VFcu_FFT( fVector h_Y, fVector h_X, ui size, int dir );
void VCFcu_FFT( cfVector h_Y, cfVector h_X, ui size, int dir );
void VFcu_FFTtoC( cfVector h_Y, fVector h_X, ui size );

CUDA function Pascal/Delphi  uses VFstd, VCFstd;
function cudaVF_FFT( d_Y, d_X:fVector; size:UIntSize; dir:Integer ): IntBool;
function cudaVCF_FFT( d_Y, d_X:cfVector; size:UIntSize; dir:Integer ): IntBool;
function cudaVF_FFTtoC( d_Y:cfVector; d_X:fVector; size:UIntSize ): IntBool;
procedure VFcu_FFT( h_Y, h_X:fVector; size:UIntSize; dir:Integer );
procedure VCFcu_FFT( h_Y, h_X:cfVector; size:UIntSize; dir:Integer );
procedure VFcu_FFTtoC( h_Y:cfVector; h_X:fVector; size:UIntSize );


Description  The Fourier transform of X is calculated and stored in Y. A Fast Fourier Transform algorithm is used that requires size to be a power of 2. The forward transform is obtained by setting dir = 1, the inverse (or backward) transform by setting dir = −1. By convention, the inverse transform involves scaling the result by the factor 1.0/size (so as to ensure that the resul of one forward and one backward transform yields – within roundoff error – the original vector). Since it is sometimes desirable to skip this implicit scaling, VF_FFT offers the possibility to do so: specify dir = 2 in this case.
Complex version: Both X and the output Y are complex vectors.
Realtocomplex version: The input vector X is real. The output vector is complex. As this function can only perform a forward transform, no argument "dir" is needed.
Purely real version: For the forward transform, X is a real vector. The output Y is also defined as fVector, although it consists of complex numbers. These are packed in a special way such as to fit into the same amount of memory as the original real vector X. The order of storage in Y is indicated in the following table (N=size, U is the uncompressed Fourier Transform of X):
Y_{0}  Y_{1}  Y_{2}
 Y_{3}  .....
 Y_{N2}  Y_{N−1} 
U_{0}.Re  U_{N/2}.Re  U_{1}.Re
 U_{1}.Im  .....
 U_{N/2−1}.Re  U_{N/2−1}.Im 
The reason for this packing is the following. If the size real data points of X represent a function of the time, X = g(t), then the forward transform yields a function U = G(f) in the frequency domain. In principle, U consists of size+1 complex data points: size/2 points for positive frequencies, another size/2 points for negative frequencies, and one point at frequency zero.
For the Fourier Transform of a real vector, the symmetry relation G(f) = G(f)^{*} holds (the asterisc denoting the complex conjugate). This means that the points at negative frequencies need not be stored; all information is already contained in the positive frequency half. Moreover, the zeroth and the size2'th element of the transform are both purely real. Therefore, only these two real and size/2−1 complex data points have to be stored  which exactly fit into the same amount of memory as the original size real data points of X. This allows X to be overwritten by its transform, if desired.
For the real version of the inverse transform, X has to be a complex vector packed in the way just described, and a realvalued vector Y is obtained.
For the Fourier Transform of several vectors of equal size, we recommend to copy the individual vectors into the rows (C/C++) or the columns (Pascal/Delphi) of a matrix and to call MF_Rows_FFT or MF_Cols_FFT, respectively. Depending on the size of the individual vectors, this way is more efficient than the individual processing already from about 4 vectors on.
VFb_FFT, VFb_FFTtoC and VCFb_FFT take a buffer vector Buf as an additional argument. They are slightly more efficient than the unbuffered versions. Buf must have (at least) the same size as X and Y. Additionally, Buf must be 128bit (P8) or 256bit (P9) aligned. This usually means you can only take a vector allocated by the VF_vector family as Buf. 

Error handling  If size is not a power of 2, an error message "Size must be an integer power of 2" is generated and the program aborted. 


