Copyright (C) 1998 Timothy C. Prince
Freely distributable with acknowledgment
branch history and prediction schemes: Uht, Sindagi, Somanathan "Branch Effect Reduction Techniques" IEEE Computer May 1997 pp 71-81.
Vander Wiel, Lilja "When Caches Aren't Enough: Data Prefetching Techniques" IEEE Computer July 1997 pp 23-30.
celefunt: Cody's accuracy test suite for FORTRAN complex math functions, netlib/toms714. Quite useful in its standard form, although it was not written for extended precision (such as Intel's).
directives: "Visual KAP for OpenMP User's Manual" www.kai.com/vkomp
divide/sqrt hardware techniques:
Soderquist, Leeser "Division and Square Root ..." IEEE Micro July/Aug 1997 pp 56-66.
egcs: directories under ftp.cygnus.com and many mirror sites
elefunt: Accuracy test suite for FORTRAN math functions. Has some portability problems (it runs, but the results are not correct). Translated to C by Plauger and further modified by Prince. Copyright by Plauger, possibly available with permission.
Einarsson, Shokin "Fortran 90 for the Fortran 77 Programmer"
Computational Science Education Project "Fortran 90 and Computational Science".
f90 tutorial: Metcalf http://wwwcn.cern.ch/asdoc/WWW/f90/
Patrick Corde, Herve Delouis "Cours Fortran 90" idris.fr
f95 compilers and netlib software: many listed on www.fortran.com/fortran
look for modernized versions of netlib software elsewhere
f95: FORTRAN 95 Handbook, Adams, Brainerd et al, MIT Press 1997, ISBN 0-262-51096-0.
fused MAC effects etc:
http://http.cs.berkeley.edu/~wkahan/ieee754status/ieee754.ps Note that Kahan's quadratic code for fused MAC is not satisfactorily programmable in standard FORTRAN, but can be done reasonably in C.
g77: gnu or egcs mirror sites; CD versions tend to be out of date.
Kumar "The HP PA-8000 RISC CPU" IEEE Micro Mar/Apr 1997 pp 27-32.
IEEE P754/854: Cody, IEEE Micro Aug. 1984 pp 84-100.
Intel Pentium Pro: Papworth "Tuning the Pentium Pro.." IEEE Micro April 1996 pp 8-15; Bhandarkar and Ding "Performance Characterization of the Pentium Pro", distributed on the Internet.
latency and instruction level parallelism, Newton and Goldschmidt schemes: Soderquist, Leeser "Division and Square Root..." IEEE Micro July/Aug 1997 pp 56-66.
Alan Miller's site for modernized netlib: http://www.ozemail.com.au/~milleraj
MIPS/SGI R10000: Yeager "The MIPS R10000.." IEEE Micro April 1996 pp 28-40.
pipelining: Smith, Weiss "PowerPC 601 and Alpha 21064..." IEEE Computer, June 1994 pp 46-58.