Description | This function initializes the multi-processor libraries. It takes the number of processor cores as its argument and creates the appropriate number of additional worker threads, so that subsequent OptiVec functions can distribute their workload over the available processor cores. nProcCores can presently take values from 2 up to 128. Should Intel, AMD, or others announce systems with even more processor cores, future versions of OptiVec will make provisions for as many cores as become available.
On computers with many processor cores (from about 8 or 16 upwards), you may wish to limit the number of threads that each called OptiVec function may claim for itself. This is especially true if an application generates several worker threads and each (or several) of them calls OptiVec functions. In the interest of a well-balanced use of processor capacity, call V_limitThreadsPerFunc to set the number of threads available to each call of an OptiVec function. Be rather generous, however: for example, with 16 cores and 4 application threads, give the OptiVec functions not just 16/4 = 4, but rather 8 threads each. The OS will take care of scheduling all these threads optimally.
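The "be generous" advice above can be turned into a small calculation. The following is only a sketch: suggestThreadsPerFunc is a hypothetical helper, not part of OptiVec, and the factor of 2 is an assumption that merely reproduces the 16-core / 4-thread example given in the text.

```cpp
#include <algorithm>

// Hypothetical helper (not an OptiVec function): suggests a per-function
// thread limit by doubling the even share of cores per application thread,
// capped at the total number of cores. The factor 2 is an assumption taken
// from the 16-core / 4-thread example; it is not a documented rule.
unsigned suggestThreadsPerFunc(unsigned nProcCores, unsigned nAppThreads)
{
    if (nAppThreads == 0)
        nAppThreads = 1;  // guard against division by zero

    // Even share of cores per application thread, at least 1.
    unsigned evenShare = std::max(1u, nProcCores / nAppThreads);

    // Be generous: twice the even share, but never more than all cores.
    return std::min(nProcCores, 2u * evenShare);
}
```

The resulting value would then be handed to V_limitThreadsPerFunc; consult the entry for that function for its exact declaration.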
V_initMT must be called once (and only once) before any of the OptiVec routines present in the multi-processor libraries can be executed. On the other hand, calling V_initMT when you are using a general-purpose (non-threaded) OptiVec library does no harm: V_initMT is also present in these libraries as an empty function, and calling it simply has no effect. This makes it easy to switch back and forth between the various versions of the OptiVec libraries for testing purposes.
If you use V_setFPAccuracy to modify the floating-point accuracy of the FPU, be sure that the call to V_initMT comes after the call to V_setFPAccuracy. The reason is that V_initMT creates the additional threads with the FPU settings found at the moment it is invoked.
nProcCores need not be your actual number of processor cores. For testing purposes, you can enter any of the legal values. Of course, optimum performance will be attained only with the correct value. For applications distributed to others, where you have no control over the configuration of the systems they will ultimately run on, it is recommended to determine nProcCores either through user input or, more elegantly, through a system detection routine. Consult the processor documentation by Intel and AMD for details.
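One possible system detection routine, sketched here in standard C++ rather than with vendor-specific CPUID code, relies on std::thread::hardware_concurrency. The helper names clampCoreCount and detectProcCores are hypothetical. Note that hardware_concurrency reports logical processors (which may include hyper-threads) and may return 0 if the count cannot be determined. The sketch also assumes that every integer from 2 to 128 is a legal value of nProcCores.

```cpp
#include <algorithm>
#include <thread>

// Hypothetical helper: forces a raw count into the range 2..128 stated
// for nProcCores. Assumes every integer in that range is accepted.
unsigned clampCoreCount(unsigned detected)
{
    return std::min(128u, std::max(2u, detected));
}

// Hypothetical helper: asks the C++ runtime for the number of logical
// processors. A result of 0 ("unknown") is mapped to the minimum of 2.
unsigned detectProcCores()
{
    return clampCoreCount(std::thread::hardware_concurrency());
}
```

With these helpers, the initialization call would read V_initMT( detectProcCores() ). On systems with hyper-threading, the number of physical cores may be the better choice; determining it requires OS-specific or CPUID-based code not shown here.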
Although it is not absolutely necessary, it is recommended that you delete the extra threads by calling V_closeMT at the end of your programme.
V_initMT stores the number of processor cores in the public variable V_nProcCores which is defined as (extern "C") unsigned in C/C++, and as UInt in Delphi. Reading this variable, one can determine if V_initMT has already been called.
|