UTK - Math 371 - Alexiades
Errors - Finite Precision - Roundoff
errors/approximations are everywhere in problem formulation, data, algorithm, programming, arithmetic, ...
if X is an approximation to x:
absolute error = |X − x| ,
relative error = |X − x|/|x|
( = | 1 − X/x | )
approximation correct to k significant (decimal) digits
|error| ≤ ½ 10−k
Only finitely many floating point numbers
exist on any computer ( and NaN and Inf ).
They are NOT evenly spaced (same # between consecutive powers of 2)
so spread out further and further apart.
Any given real number x must be approximated by a nearest
machine number, fl(x), resulting in roundoff error.
There is a smallest, s, and a largest, L, positive machine number;
|x| < s underflows to 0, |x| > L overflows to Inf
1. There is an inherent error in representing real numbers,
2. Loss of significant digits in arithmetic may occur:
• unbalanced addition:
when adding a very small to a very large real number
• subtractive cancellation:
when subtracting nearly equal reals
• unbalanced multiplication:
when nultiplying by a very small number
(or dividing by a very large number).
3. Machine arithmetic is not necessarily commutative,
i.e. the order of operations may affect the result!
4. Number of decimal digits that can be trusted:
up to 7 in single precision, up to 15 in double precision.
5. Should NEVER ask for equality of computed real numbers,
use if( abs(a − b) ≤ TOL ) , for some TOLerance
( with TOL no smaller than 1.e-15, for double precision )
Machine epsilon εmach
εmach is the smallest positive number such that 1+εmach > 1
εmach = 2−p , p=precision = # of bits in mantissa.
In double precision, p=53, so εmach = 2−53 ≈ 10−15
For any real x, fl(x) = x(1 + ε),
so ε = (fl(x)−x)/x
= relative error in representing x by fl(x),
with ε ≤ εmach
Thus, the absolute error in representing x by fl(x)
is roundoff = fl(x)−x = εx ,
it is proportional to x ,
so the larger the number the larger the roundoff error!
In Matlab: eps(x) gives the machine number nearest to x
for that computer.
eps(1) or just eps gives εmach itself.
For example: eps(1) = 2.2204e-16 , eps(100) = 1.4211e-14 ,
eps(1000) = 1.1369e-13 , eps(1.e6) = 1.1642e-10 , ...
losing more and more accuracy...
For example: 1+eps == 1 gives false,
1+eps/2 == 1 gives true
So, any (positive) number less than eps is effectively zero
Some strategies for effective computing:
minimize number of operations
use double precision computations
guard against loss of digits,
(subtraction of nearly equal numbers, etc)
Try to re-write the expression somehow...
enter very big and very small numbers in scientific notation:
e.g. 3.14e6 , 3.14e-6
print very big and very small numbers with '%e' formating.
use integer exponents in powers ( * for low powers: x*x, x*x*x )
do not round input, enter all digits known to be correct
evaluate polynomials in nested form
(saves operations and may reduce roundoff):
e.g. ax2 + bx + c = c + x*(b+a*x) ,
e.g. ax3 + bx2 + cx + d = d + x*(c+x*(b+a*x)) , etc
The quadratic formula is subject to loss of significant digits
when 0 < 4ac << b2 (much less than)
A safe way to find roots of a quadratic:
q = −( b + sign(b)√(b2−4ac) ) / 2 ,
then x1 = q/a , x2 = c / q