IGING.COM
X86/7 Low Level Software Development

D. Baruth, Dr.rer.nat

January 2011



Guaranteed Extended Precision


Registers of the x86 ALU (Arithmetic Logic Unit) and the x87 FPU (Floating Point Unit) of standard PCs allow calculations with Extended Precision, i.e up to 64 precise bits (ε ≈ 2-64).  However, some operations of even moderate complexity may result in severely degraded precision; e.g. by cancellations occurring in alternating series, linear optimization problems, etc.[1]  Hence, when a certain degree of precision is requested and has to be guaranteed, it may be necessary to operate with larger entities than the x86/x87 native registers.  Special algorithms and data structures must be introduced and execution time of calculations may be prolonged by several orders of magnitude.

To attempt maximum efficiency, it is necessary to know the error characteristics of particular algorithms for every given calculation.  Using (only) the CPU's native registers and functions, we have developed a special computer graphics program that enables empirical studies of the behaviour of different analytical methods and algorithms for transcendental functions[2].   Hence, knowing the functional behaviour and related errors, we can enhance precision selectively where needed.

To be able to guarantee extended precision, we introduce special data structures - so called Virtual Floating Point Registers (VFPRs) - that allow efficient operations with up to 256 precise bits (ε ≈ 2-256):

Virtual Floating Point Register Structure

VFPR STRUC
Mantissa DD 8  DUP (?) ; 256-bit mantissa
Exponent DD ? ; 32-bit exponent
Sign DB ? ; 8-bit sign (0 or -1)
Status DB ? ; 8-bit status flag
NumElements DB ? ; number of actual dword elements (≤ 8)
NumFree DB ? ; number of empty dword elements (≥ 0)
HighElement DW ? ; word pointer to mantissa's high element
LowElement DW ? ; word pointer to mantissa's low element
Address DW ? ; VFPR data offset
DW ? ; VFPR data segment
VFPR ENDS

Example: Bessel Function Of The First Kind

For demonstration purposes, we have chosen to map the analysis and evaluation of the Bessel function of the first kind, for real arguments, which is one of the solutions to the differential equation[3]

             zw" + zw' + (z - ν)w = 0,    z = x+iy = x

and is represented by the ascending alternating series[4]
Bessel function of the
	  first kind

In our eXtended Precision library "XP87.LIB", all the VFPRs (and other data) are contained in one 64K segment ("VR_Data").  To maximize execution speed and ensure optimal usage of the CPU's cache memory, also the entire code is gathered in one 64K segment ("VR_Code").  The functions are accessible through far calls; either directly by an x86 assembler or through an interface to a high level x86 compiler, like FORTRAN, C, etc.

To call the Bessel function Jν(x) from within a generic assembly program (see "My_Program" below), first the function's order (index) and argument are loaded on the FPU stack.  Then, the external (far) function _FBSL1 is called, returning the result on the FPU stack; the real part in ST(0) and the imaginary part in ST(1).  In case of an error, the carry flag is set.  Finally the result is popped into memory variables ReJ and ImJ.  In the following example it is assumed that all variables are real, defined in a DATA segment, pointed to by the DS register.

Generic Assembly Code For Function Call

My_Program proc
. . .
fld Order_v ; ST(0) ← ν
fld Arg_x ; ST(0) ← x, ST(1) = ν
call _FBSL1 ; ST(0) = Re[Jν(x)], ST(1) = Im[Jν(x)]
fstp ReJ ; ReJ ← ST(0), ST(0) ← ST(1)
fstp ImJ ; ImJ ← ST(0)
. . .
My_Program endp

To serve as timing benchmarks, the table below lists the evaluation time - in microseconds (10-6 sec) - for different positive arguments x and orders ν; using an AMD Athlon 64, 4000+, 2.6 GHz with a classic MS-DOS operating system.

Bessel Function 1st Kind - Evaluation Times [μs]

x / ν
0
1
5
10
50
100
250
500
1000
0.1
60
65
70
70
63
53
58
94
162
1
81
92
95
92
76
55
59
94
165
5
134
140
144
141
113
76
67
102
173
10
177
190
180
177
138
91
70
105
175
50
404
419
409
396
298
177
101
128
193
100
650
663
651
629
488
283
142
152
212
250
100
100
102
106
161
262
865
617
376
500
95
101
102
115
166
217
848
1663
1137
1000
79
98
99
85
121
163
518
2054
2572


References

1     High-Precision Computation and Mathematical Physics, David H. Bailey and Jonathan M. Borwein, July 6, 2009; http://crd.lbl.gov/~dhbailey/dhbpapers/dhb-jmb-acat08.pdf
2 Graphical Error Analysis, D. Baruth, 2006; http://iging.com/Transcendental/J_100_95.htm
3 Handbook of Mathematical Functions, Abramowitz & Stegun, Dover Publications, fifth printing, p 358, 9.1.1
4 HMF, Abramowitz & Stegun, p 360, 9.1.10


  © Copyright D.Baruth, IGING.COM 2011.  All rights reserved.