UTK - Math 371 - Alexiades

            Errors - Finite Precision - Roundoff

  • errors/approximations are everywhere:   in problem formulation, data, algorithm, programming, arithmetic, ...
  • if X is an approximation to x:   absolute error = |X − x| ,   relative error = |X − x|/|x|   ( = | 1 − X/x | )
  • approximation correct to k significant (decimal) digits   means:   relative error |X − x|/|x| ≤ ½·10^−k
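As a quick illustration (in Python here, though these notes use Matlab), a hypothetical helper `significant_digits` counts how many digits of an approximation X can be trusted by this criterion:

```python
import math

def significant_digits(X, x):
    """Largest k with |X - x|/|x| <= 0.5 * 10**(-k)  (illustrative helper)."""
    rel = abs(X - x) / abs(x)          # relative error
    if rel == 0:
        return float('inf')            # exact: all digits correct
    # 0.5 * 10**(-k) >= rel  <=>  k <= -log10(2*rel)
    return math.floor(-math.log10(2.0 * rel))

print(significant_digits(3.14, math.pi))   # 2: the relative error 5.07e-4 just misses the k=3 bound
```

Note that 3.14 counts as only 2 correct significant digits here: its relative error is 5.07×10^−4, slightly above the ½·10^−3 threshold.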

    Machine numbers
  • Only finitely many floating point numbers exist on any computer   ( and NaN and Inf ).
  • They are NOT evenly spaced: there are the same number of machine numbers between consecutive powers of 2, so they spread out further and further apart as magnitude grows.
  • Any given real number x must be approximated by a nearest machine number, fl(x), resulting in roundoff error.
  • There is a smallest, s, and a largest, L, positive machine number;   |x| < s   underflows to 0,   |x| > L   overflows to Inf
    Important consequences:
    1. There is an inherent error in representing real numbers, called round-off.
    2. Loss of significant digits in arithmetic may occur:
      • unbalanced addition: when adding a very small to a very large real number
      • subtractive cancellation: when subtracting nearly equal reals
      • unbalanced multiplication: when multiplying by a very small number (or dividing by a very large number).
    3. Machine arithmetic is not associative, i.e. the order of operations may affect the result!
    4. Number of decimal digits that can be trusted:   up to 7 in single precision, up to 15 in double precision.
    5. Should NEVER ask for equality of computed real numbers, use if( abs(a − b) ≤ TOL ) , for some TOLerance
      ( with TOL no smaller than 1.e-15, for double precision )
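Each of these pitfalls is easy to reproduce; the snippet below uses Python for illustration (the notes' own examples are in Matlab, but any IEEE double-precision arithmetic behaves the same way):

```python
# 2a. unbalanced addition: adding 1 to 1e16 changes nothing,
#     because the spacing of machine numbers near 1e16 is 2
print(1.0e16 + 1.0 == 1.0e16)           # True: the 1.0 is lost

# 2b. subtractive cancellation: (1 + 1e-15) - 1 recovers 1e-15
#     with only about one correct digit (~10% relative error)
print((1.0 + 1.0e-15) - 1.0)            # not 1e-15

# 3. order of operations matters: machine addition is not associative
print((0.1 + 0.2) + 0.3 == 0.1 + (0.2 + 0.3))   # False

# 5. never test computed reals for equality; compare with a tolerance
x = 0.1 + 0.2
print(x == 0.3)                          # False!
TOL = 1.0e-12
print(abs(x - 0.3) <= TOL)               # True: equal within tolerance
```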

    Machine epsilon εmach
  • εmach is the smallest positive machine number such that the computed value of 1 + εmach exceeds 1
  • εmach = 2^(1−p) , p = precision = # of bits in mantissa.   In double precision, p = 53, so   εmach = 2^−52 ≈ 2.2×10^−16
  • For any real x (in range),   fl(x) = x(1 + ε), so   ε = (fl(x) − x)/x = relative error in representing x by fl(x), with |ε| ≤ εmach
  • Thus, the absolute error in representing x by fl(x) is   roundoff = fl(x) − x = εx , which is proportional to |x| ,
      so the larger the number, the larger the roundoff error!
  • In Matlab: eps(x) gives the spacing from |x| to the next larger machine number on that computer. eps(1), or just eps, gives εmach itself.
  • For example: eps(1) = 2.2204e-16 , eps(100) = 1.4211e-14 , eps(1000) = 1.1369e-13 , eps(1.e6) = 1.1642e-10 , ...
                losing more and more accuracy...
  • For example: 1+eps == 1 gives false,   1+eps/2 == 1 gives true
                So, any positive number smaller than eps/2 is effectively zero when added to 1!
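The same experiments can be run outside Matlab; in Python, for instance, `sys.float_info.epsilon` corresponds to Matlab's eps, and `math.ulp(x)` (Python 3.9+) plays the role of eps(x):

```python
import sys
import math

eps = sys.float_info.epsilon            # 2**-52, same value as Matlab's eps
print(eps)                              # 2.220446049250313e-16

# spacing of machine numbers grows with magnitude, like Matlab's eps(x)
for x in [1.0, 100.0, 1000.0, 1.0e6]:
    print(x, math.ulp(x))

# 1 + eps is the next machine number after 1, but 1 + eps/2 rounds back to 1
print(1.0 + eps == 1.0)                 # False
print(1.0 + eps/2 == 1.0)               # True
```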

    Some strategies for effective computing:
  • minimize number of operations
  • use double precision computations
  • guard against loss of digits (subtraction of nearly equal numbers, etc):
      try to re-write the expression to avoid it
  • enter very big and very small numbers in scientific notation: e.g. 3.14e6 , 3.14e-6
  • print very big and very small numbers with '%e' formatting.
  • use integer exponents in powers ( or plain multiplication for low powers: x*x, x*x*x )
  • do not round input, enter all digits known to be correct
  • evaluate polynomials in nested form (saves operations and may reduce roundoff):
    e.g. a x^2 + b x + c = c + x*(b + a*x) ,   a x^3 + b x^2 + c x + d = d + x*(c + x*(b + a*x)) , etc
  • The quadratic formula is subject to loss of significant digits when   0 < 4ac << b^2   (4ac much smaller than b^2)
      A safe way to find roots of a quadratic:   q = −( b + sign(b)·√(b^2 − 4ac) ) / 2 ,   then x1 = q/a ,   x2 = c / q
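Both ideas, nested (Horner) evaluation and the cancellation-safe quadratic formula, can be sketched in code; the Python below is for illustration (the function names are mine, not from the notes):

```python
import math

def horner(coeffs, x):
    """Evaluate a polynomial in nested form (Horner's rule).
    coeffs = [a, b, c, ...] for a*x^n + b*x^(n-1) + ... : n multiplies, n adds."""
    result = 0.0
    for c in coeffs:
        result = result * x + c
    return result

def quadratic_roots(a, b, c):
    """Roots of a*x^2 + b*x + c = 0, avoiding subtractive cancellation
    when 0 < 4ac << b^2.  Assumes a != 0 and real roots."""
    q = -(b + math.copysign(math.sqrt(b*b - 4.0*a*c), b)) / 2.0
    return q / a, c / q

print(horner([1, -3, 2], 2.0))             # (x-1)(x-2) at x=2 -> 0.0
print(quadratic_roots(1.0, -1.0e8, 1.0))   # roots near 1e8 and 1e-8
```

With a = 1, b = −1e8, c = 1, the naive formula would compute the small root as the difference of two nearly equal numbers and lose most of its digits; the q-based form recovers it to full precision.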