FORTRAN Performance Tuning co-guide

Freely distributable with acknowledgment

Chapter 1
Algorithm Tuning vs Code Tuning

Proper choice of algorithm certainly deserves to come ahead of source code performance tuning. In many cases, however, an algorithm is well tested and accepted in a variety of applications, even though it may be far from optimum in many of them. One does not expect NASTRAN to adopt Preconditioned Conjugate Gradients, or CFD analysts to agree on the relative merits of pressure-coupled relaxation versus implicit or explicit time marching. On the other hand, it makes no sense to use relaxation to solve a system of equations with a full banded coefficient matrix.

Code tuning may improve reliability as well as speed. One of the first priorities should be to correct unreliable practices such as the use of uninitialized storage. Simplifying floating-point expressions may help control roundoff error. Don't blame an algorithm for a sloppy implementation.
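The two reliability points above can be illustrated with a minimal sketch. The module and procedure names (roundoff_demo, diff_squares_direct, diff_squares_factored, sum_of) and the test values are hypothetical and chosen only for illustration. The factored form of a difference of squares subtracts before multiplying, which usually loses less to rounding when a and b are close. The result of the direct form depends on the compiler and its options (for example, whether fused multiply-add is used), so no particular output is claimed for it.

```fortran
module roundoff_demo
  implicit none
contains

  ! Direct form: each square is rounded before the subtraction, so the
  ! rounding error of a*a can be comparable to a small difference.
  pure function diff_squares_direct(a, b) result(d)
    real, intent(in) :: a, b
    real :: d
    d = a*a - b*b
  end function diff_squares_direct

  ! Simplified form: a - b and a + b are formed first, then multiplied.
  pure function diff_squares_factored(a, b) result(d)
    real, intent(in) :: a, b
    real :: d
    d = (a - b)*(a + b)
  end function diff_squares_factored

  ! The accumulator is set by an executable statement on every call.
  ! Writing "real :: s = 0.0" in the declaration instead would imply
  ! SAVE, and s would be zeroed only once.
  pure function sum_of(x) result(s)
    real, intent(in) :: x(:)
    real :: s
    integer :: i
    s = 0.0
    do i = 1, size(x)
      s = s + x(i)
    end do
  end function sum_of

end module roundoff_demo

program demo
  use roundoff_demo
  implicit none
  real :: a, b
  ! Illustrative values only: a*a is not exactly representable
  ! in IEEE single precision, while a - b and a + b are.
  a = 10001.0
  b = 10000.0
  print *, 'direct   :', diff_squares_direct(a, b)
  print *, 'factored :', diff_squares_factored(a, b)
  print *, 'sum      :', sum_of((/ 1.0, 2.0, 3.0 /))
end program demo
```

Most compilers also offer options that warn about, or trap, uninitialized variables; they complement explicit initialization rather than replace it.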

It may be well worthwhile to rationalize the format of the source code ("pretty print" it) in order to make constructs more easily recognized, and to eliminate as far as possible obsolescent syntax that inhibits optimization or makes modifications error-prone. Such reformatting can be done automatically, to a reasonable extent, by tools such as struct and ratfor, or by commercial products such as those from Polyhedron, Cobalt Blue, or Pacific Sierra.
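As a small illustration of the kind of restructuring meant here, the sketch below replaces an arithmetic IF (obsolescent since Fortran 90 and deleted in Fortran 2018) inside a labeled DO loop with a structured equivalent. The loop and the names restructure_demo and clamp_negative are hypothetical, not taken from any of the tools named above, and the old form appears only as comments.

```fortran
module restructure_demo
  implicit none
contains

  ! Return a copy of x with negative elements replaced by zero.
  !
  ! Obsolescent fixed-form original, operating in place on X:
  !       DO 10 I = 1, N
  !       IF (X(I)) 20, 10, 10
  !    20 X(I) = 0.0
  !    10 CONTINUE
  !
  ! The arithmetic IF branches to label 20 when X(I) is negative and
  ! to label 10 when it is zero or positive.
  pure function clamp_negative(x) result(y)
    real, intent(in) :: x(:)
    real :: y(size(x))
    integer :: i
    y = x
    do i = 1, size(y)
      if (y(i) < 0.0) y(i) = 0.0
    end do
  end function clamp_negative

end module restructure_demo

program demo
  use restructure_demo
  implicit none
  print *, clamp_negative((/ 1.5, -2.0, 0.0, -0.5, 3.0 /))
end program demo
```

With the labels and branches gone, the intent of the loop is visible at a glance, and the array intrinsic max(x, 0.0) becomes an obvious further simplification.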

An example of Pacific Sierra's results is shown in the Quetzal benchmark f90 source code. Their goal is to take advantage of optimizations performed by Cray-style f90 compilers. Human readability is improved over the original, but it has not been given the emphasis that the other tools give it. The more difficult translations to f90 syntax are not carried out, so this author believes that those may be done just as effectively by hand, once the basic restructuring is done.