D. Baruth, Dr.rer.nat

January 2011

Guaranteed Extended Precision

Registers of the x86ALU(Arithmetic Logic Unit) and the x87FPU(Floating Point Unit) of standard PCs allow calculations withExtended Precision, i.e up to 64 precise bits (ε ≈ 2^{-64}). However, some operations of even moderate complexity may result in severely degraded precision; e.g. by cancellations occurring inalternating series,linear optimizationproblems, etc.^{[1]}Hence, when a certain degree of precision is requested and has to be guaranteed, it may be necessary to operate with larger entities than the x86/x87 native registers. Special algorithms and data structures must be introduced and execution time of calculations may be prolonged by several orders of magnitude.To attempt maximum efficiency, it is necessary to know the error characteristics of particular algorithms for every given calculation. Using (only) the CPU's native registers and functions, we have developed a special computer graphics program that enables empirical studies of the behaviour of different analytical methods and algorithms for transcendental functions

^{[2]}. Hence, knowing the functional behaviour and related errors, we can enhance precision selectively where needed.To be able to guarantee extended precision, we introduce special data structures - so called

Virtual Floating Point Registers (VFPRs)- that allow efficient operations with up to 256 precise bits (ε ≈ 2^{-256}):

Virtual Floating Point Register Structure

VFPR STRUC Mantissa DD 8 DUP (?) ; 256-bit mantissa Exponent DD ? ; 32-bit exponent Sign DB ? ; 8-bit sign (0 or -1) Status DB ? ; 8-bit status flag NumElements DB ? ; number of actual dword elements (≤ 8) NumFree DB ? ; number of empty dword elements (≥ 0) HighElement DW ? ; word pointer to mantissa's high element LowElement DW ? ; word pointer to mantissa's low element Address DW ? ; VFPR data offset DW ? ; VFPR data segment VFPR ENDS

Example: Bessel Function Of The First KindFor demonstration purposes, we have chosen to map the analysis and evaluation of the

Bessel function of the first kind, for real arguments, which is one of the solutions to the differential equation^{[3]}and is represented by the ascending alternating series

z²w" + zw' + (z² - ν²)w = 0, z = x+iy = x^{[4]}In our eXtended Precision library "XP87.LIB", all the

VFPRs(and other data) are contained in one 64K segment ("VR_Data"). To maximize execution speed and ensure optimal usage of the CPU's cache memory, also the entire code is gathered in one 64K segment ("VR_Code"). The functions are accessible through far calls; either directly by an x86 assembler or through an interface to a high level x86 compiler, like FORTRAN, C, etc.To call the Bessel function

Jfrom within a generic assembly program (see "My_Program" below), first the function's order (index) and argument are loaded on the FPU stack. Then, the external (far) function _FBSL1 is called, returning the result on the FPU stack; the real part in ST(0) and the imaginary part in ST(1). In case of an error, the carry flag is set. Finally the result is popped into memory variables_{ν}(x)ReJandImJ. In the following example it is assumed that all variables are real, defined in a DATA segment, pointed to by theDSregister.

Generic Assembly Code For Function Call

My_Program proc . . . fld Order_v ; ST(0) ← ν fld Arg_x ; ST(0) ← x, ST(1) = ν call _FBSL1 ; ST(0) = Re[J _{ν}(x)], ST(1) = Im[J_{ν}(x)]fstp ReJ ; ReJ ← ST(0), ST(0) ← ST(1) fstp ImJ ; ImJ ← ST(0) . . . My_Program endp To serve as timing benchmarks, the table below lists the evaluation time - in microseconds (10

^{-6}sec) - for different positive argumentsxand ordersν; using an AMD Athlon 64, 4000+, 2.6 GHz with a classic MS-DOS operating system.

Bessel Function 1st Kind - Evaluation Times [μs]

x / ν 0 1 5 10 50 100 250 500 1000 0.1 60 65 70 70 63 53 58 94 162 1 81 92 95 92 76 55 59 94 165 5 134 140 144 141 113 76 67 102 173 10 177 190 180 177 138 91 70 105 175 50 404 419 409 396 298 177 101 128 193 100 650 663 651 629 488 283 142 152 212 250 100 100 102 106 161 262 865 617 376 500 95 101 102 115 166 217 848 1663 1137 1000 79 98 99 85 121 163 518 2054 2572

References

1 High-Precision Computation and Mathematical Physics, David H. Bailey and Jonathan M. Borwein, July 6, 2009; http://crd.lbl.gov/~dhbailey/dhbpapers/dhb-jmb-acat08.pdf 2 Graphical Error Analysis, D. Baruth, 2006; http://iging.com/Transcendental/J_100_95.htm 3 Handbook of Mathematical Functions, Abramowitz & Stegun, Dover Publications, fifth printing, p 358, 9.1.1 4 HMF, Abramowitz & Stegun, p 360, 9.1.10