Timings of retbnch-mpi benchmark
Diffusion of second messengers in retinal rod cells
(nonlinear parabolic system)
Explicit scheme,
parallelized with MPI via domain decomposition
200, 400, 800 discs, (6+2 x 4+2) x Ndiscs grid.
Runs with Np = 2, 5, 10, 20, 40 worker processors (+ master).
Each run executes 12,049,753 time-steps.

efficiency = [(timing on 2 procs)*2] / [(timing on Np procs)*Np]   on the same machine
                    = speedup on this machine per processor.
colt/machine = (timing on colt) / (timing on machine), both with Np procs
                    = speedup over colt with same number of processors.
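
The two ratios defined above can be checked against the tables with a few
lines of Python. This is only a sketch: the function names are illustrative
and are not part of retbnch-mpi. It uses the darter 200-disc timings
(3080 s on 2 workers, 1374 s on 5 workers) and the colt 5-worker timing
(9437 s) from the summary tables.

```python
def efficiency(t2, t_np, n_workers):
    """Per-processor speedup relative to the 2-worker run on the same machine."""
    return (t2 * 2) / (t_np * n_workers)


def colt_ratio(t_colt, t_machine):
    """Speedup over colt; both timings must use the same number of workers."""
    return t_colt / t_machine


# darter, 200 discs, 5 workers
print(round(efficiency(3080, 1374, 5), 3))  # 0.897
print(round(colt_ratio(9437, 1374), 2))     # 6.87
```

Values above 1 in the efficiency column (for example several 800-disc rows)
mean the Np-worker run was more than Np/2 times faster than the 2-worker run.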

Summary
 200 discs  ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
                         mem                 wall  effici  colt/  run    interconnect
machine      model    GHz GB  cmp  Np CPUsec hh:mm   ency machine   date
-------  ------------ --- -- ----- -- ------ -----  ----- ------ -------
---- 200 discs on 2 workers ---- 
ACF    I Intel Xeon64 2.6  4 ifort  2   2581  0:43  1      7.80  30sep17 Infiniband
darter   Cray XC30    2.6  2  Cray  2   3080  0:51  1      6.54! 12sep15 Cray Aries router
zeus2    opteron2376  2.6  2 ifort  2   5589  1:33  1      3.60   8feb10
newton I Intel Xeon64 3.2  4 ifort  2   6629  1:50  1      3.04   3apr07 Infiniband
newton   Intel Xeon64 3.2  4 ifort  2  10133  2:49  1      1.99   2may06
zeus     opteron 252  2.6  2  g77   2  10907  3:01  1      1.85  18dec07
oic      Intel Xeon64 3.4  4 ifort  2  10925  3:02  1      1.84   1may06
zeus     opteron 252  2.6  2  pgi   2  11176  3:06  1      1.80  14apr06
grig   M Intel Xeon   3.2  4  g77   2  11706  3:15  1      1.72   3dec05
erau   I Intel Xeon64 3.2  4  g77   2  12553  3:29  1      1.60  26mar07 Myrinet
newton I Intel Xeon64 3.2  4  g77   2  12611  3:30  1      1.60   2apr07 Infiniband
hawk     opteron 242  1.6  2  pf90  2  14333  3:59  1      1.41  28jan05
oic      Intel Xeon   3.4  4  g77   2  15150  4:12  1      1.33   1may06
frodo  M opteron 240  1.4  2  g77   2  15712  4:22  1      1.28   3dec05
cheetah  IBM Power4   1.3  4  xlf   2  16243  4:31  1      1.24  25jul02
colt     AlphaSCev67  0.7  2  f95   2  20139  5:36  1       1     5jul02
hawk     opteron 242  1.6  2 ifort  2  24509  6:48  1      0.82  25jan05
hawk     opteron 242  1.6  2  g77   2  26508  7:22  1      0.76  28jan05
math     Intel Xeon   3.1  4  ifc   2  27741  7:42  1      0.73  12jan05
frodo    opteron 240  1.4  2  g77   2  34245  9:30  1      0.59  10jan05
eagle    IBM SP3      0.4  4  xlf   2  36857 10:14  1      0.55  12jul02
knox     Sun cluster  0.9  1  f77   2  51792 14:23  1      0.39  11jan05

---- 200 discs on 5 workers ---- 
darter   Cray XC30    2.6  2  Cray  5   1374  0:23  0.897  6.87! 12sep15 Cray Aries router
ACF    I Intel Xeon64 2.6  4 ifort  5   1538  0:26  0.671  6.14  30sep17 Infiniband 
newton I Intel Xeon64 3.2  4 ifort  5   3120  0:52  0.850  3.02   2apr07 Infiniband
zeus2    opteron2376  2.6  2 ifort  5   4903  1:22  0.456  1.92   8feb10
grig   M Intel Xeon   3.2  4  g77   5   5183  1:26  0.903  1.82   3dec05
newton I Intel Xeon64 3.2  4  g77   5   5600  1:33  0.901  1.69   2apr07 Infiniband
erau   I Intel Xeon64 3.2  4  g77   5   5812  1:37  0.864  1.62  26jun07 Myrinet
frodo  M opteron 240  1.4  2  g77   5   6686  1:51  0.940  1.41   3dec05
zeus     opteron 252  2.6  2  pgi   5   6912  1:55  0.65   1.36  19apr06
zeus     opteron 252  2.6  2  g77   5   6927  1:55  0.65   1.36  14apr06
cheetah  IBM Power4   1.3  4  xlf   5   7014  1:57  0.923  1.35  25jul02
hawk     opteron 242  1.6  2  pf90  5   7372  2:02  0.778  1.28  28jan05
newton   Intel Xeon64 3.2  4 ifort  5   8178  2:16  0.496  1.15   2may06
oic      Intel Xeon   3.4  4 ifort  5   8506  2:22  0.514  1.11   1may06
colt     AlphaSCev67  0.7  2  f95   5   9437  2:37  0.854   1     7jul02
oic      Intel Xeon   3.4  4  g77   5  10042  2:47  0.603  0.94   1may06
hawk     opteron 242  1.6  2 ifort  5  10884  3:01  0.901  0.87  28jan05
math     Intel Xeon   3.1  4  ifc   5  12512  3:28  0.886  0.75  10jan05
hawk     opteron 242  1.6  2  g77   5  12542  3:29  0.845  0.75  28jan05
eagle    IBM SP3      0.4  4  xlf   5  18057  5:00  0.816  0.52  12jul02
frodo    opteron 240  1.4  2  g77   5  19188  5:20  0.714  0.49  11jan05
knox     Sun cluster  0.9  1  f77   5  34461  9:34  0.601  0.27   9jan05

---- 200 discs on 10 workers ---- 
ACF    I Intel Xeon64 2.6  4 ifort 10    613  0:10  0.842 11.10  30sep17 Infiniband
darter   Cray XC30    2.6  2  Cray 10   1661  0:28  0.371  4.10  12sep15 Cray Aries router
newton I Intel Xeon64 3.2  4 ifort 10   2050  0:34  0.647  3.32   2apr07 Infiniband
grig   M Intel Xeon   3.2  4  g77  10   3098  0:52  0.756  2.20   3dec05
newton I Intel Xeon64 3.2  4  g77  10   3396  0:57  0.743  2.00   2jun07 Infiniband
frodo  M opteron 240  1.4  2  g77  10   3727  1:02  0.843  1.83   3dec05
erau   I Intel Xeon64 3.2  4  g77  10   3866  1:04  0.649  1.76  26jun07 Myrinet
cheetah  IBM Power4   1.3  4  xlf  10   4016  1:07  0.809  1.69  25jul02
hawk     opteron 242  1.6  2  pf90 10   4457  1:14  0.643  1.53  25jan05
hawk     opteron 242  1.6  2 ifort 10   5583  1:33  0.878  1.22  25jan05
hawk     opteron 242  1.6  2  g77  10   6096  1:42  0.870  1.12  29jan05
zeus     opteron 252  2.6  2  pgi  10   6273  1:45  0.356  1.08  18apr06
colt     AlphaSCev67  0.7  2  f95  10   6806  1:53  0.592   1     4jul02
zeus     opteron 252  2.6  2  g77  10   6910  1:55  0.339  0.98  14apr06
newton   Intel Xeon64 3.2  4 ifort 10   7458  2:04  0.272  0.91   2may06
zeus2    opteron2376  2.6  2 ifort 10   8556  2:23  0.131  0.79   8feb10
oic      Intel Xeon   3.4  4 ifort 10   9387  2:36  0.233  0.73   1may06
oic      Intel Xeon   3.4  4  g77  10   9894  2:45  0.306  0.69   1may06
eagle    IBM SP3      0.4  4  xlf  10  12975  3:36  0.568  0.52  12jul02
frodo    opteron 240  1.4  2  g77  10  14729  4:05  0.465  0.46   9jan05
math     Intel Xeon   3.1  4  ifc  10  18384  5:06  0.302  0.37  11jan05
knox     Sun cluster  0.9  1  f77  10  31137  8:39  0.333  0.22   9jan05

---- 200 discs on 20 workers ---- 
ACF    I Intel Xeon64 2.6  4 ifort 21    541  0:09  0.454  ---    5oct17 Infiniband
darter   Cray XC30    2.6  2  Cray 20    550  0:09  0.560  ---   12sep15 Cray Aries router
grig   M Intel Xeon   3.2  4  g77  20   2109  0:35  0.555  ---    3dec05
frodo  M opteron 240  1.4  2  g77  20   2342  0:39  0.671  ---    3dec05
newton I Intel Xeon64 3.2  4  g77  20   2776  0:46  0.454  ---    1jun07 Infiniband
hawk     opteron 242  1.6  2  pf90 20   2849  0:47  0.503  ---   25jan05
erau   I Intel Xeon64 3.2  4  g77  20   2945  0:49  0.426  ---   26jun07 Myrinet
hawk     opteron 242  1.6  2 ifort 20   3943  1:06  0.622  ---   25jan05
hawk     opteron 242  1.6  2  g77  20   4188  1:10  0.663  ---   29jan05
zeus     opteron 252  2.6  2  g77  20   6141  1:42  0.182  ---   24jul07
zeus     opteron 252  2.6  2  pgi  20   6293  1:45  0.182  ---   25jul07
newton   Intel Xeon64 3.2  4 ifort 20   8062  2:14  0.126  ---    2may06
zeus2    opteron2376  2.6  2 ifort 20   9841  2:44  0.057  ----   8feb10
oic      Intel Xeon   3.4  4 ifort 20  10955  3:02  0.1    ---    1may06
oic      Intel Xeon   3.4  4  g77  20  11489  3:11  0.132  ---    1may06 
frodo    opteron 240  1.4  2  g77  20  13529  3:45  0.506  ---    9jan05
knox     Sun cluster  0.9  1  f77  20  39130 10:52  0.132  ---   12jan05

---- 200 discs on 40 workers ---- 
darter   Cray XC30    2.6  2  Cray 40    486  0:08  0.317  ---   12sep15 Cray Aries router
grig   M Intel Xeon   3.2  4  g77  40   1646  0:27  0.356  ---    3dec05 
hawk     opteron 242  1.6  2  pf90 40   2396  0:40  0.299  ---   25jan05
erau   I Intel Xeon64 3.2  4  g77  40   2767  0:46  0.227  ---   26jun07 Myrinet
hawk     opteron 242  1.6  2  g77  40   2859  0:47  0.464  ---   25jan05
hawk     opteron 242  1.6  2 ifort 40   2865  0:48  0.428  ---   25jan05
newton I Intel Xeon64 3.2  4  g77  40   3670  1:01  0.172  ---    1jun07 Infiniband
oic      Intel Xeon   3.4  4  g77  40  12622  3:30  0.06   ---    1may06
oic      Intel Xeon   3.4  4 ifort 40  12815  3:34  0.043  ---    1may06
frodo    opteron 240  1.4  2  g77  40  13844  3:51  0.124  ---   11jan05
zeus*    opteron 252  2.6  2  g77  40  23374  6:30  0.048  ---   16apr06
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
  I: with Infiniband, Platform LSF_HPC, Ganglia scheduler
  M: with Myrinet MX, openPBS, Maui scheduler 
  *: zeus had only 9 dual compute nodes, so only 18 CPUs, at the time of this run 


 400 discs +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
                         mem                 wall  effici  colt/  run    interconnect
machine      model    GHz GB  cmp  Np CPUsec hh:mm   ency machine   date
-------  ------------ --- -- ----- -- ------ -----  ----- ------ -------
---- 400 discs on 2 workers ---- 
ACF    I Intel Xeon64 2.6  4 ifort  2   5932  1:38  1      6.61  30sep17 Infiniband
zeus2    opteron2376  2.6  2 ifort  2  10863  3:01  1      3.61   8feb10
newton I Intel Xeon64 3.2  4 ifort  2  15000  4:10  1      2.61   3apr07 Infiniband
newton   Intel Xeon64 3.2  4 ifort  2  15806  4:23  1      2.48   7may06
oic      Intel Xeon   3.4  4 ifort  2  19438  5:24  1      2.02   1may06
zeus     opteron 252  2.6  2  g77   2  21455  5:58  1      1.83  14apr06
zeus     opteron 252  2.6  2  pgi   2  21908  6:05  1      1.79  20apr06
grig   M Intel Xeon   3.2  4  g77   2  23555  6:33  1      1.66   3dec05
erau   I Intel Xeon64 3.2  4  g77   2  24782  6:53  1      1.58  26jun07 Myrinet
newton I Intel Xeon64 3.2  4  g77   2  26406  7:20  1      1.48   3apr07 Infiniband
oic      Intel Xeon   3.4  4  g77   2  27352  7:36  1      1.43   1may06
hawk     opteron 242  1.6  2  pf90  2  29173  8:06  1      1.34  25jan05
cheetah  IBM Power4   1.3  4  xlf   2  31834  8:51  1      1.23  26jul02
frodo  M opteron 240  1.4  2  g77   2  31915  8:52  1      1.23   3dec05
colt     AlphaSCev67  0.7  2  f95   2  39212 10:54  1       1     6jul02
hawk     opteron 242  1.6  2 ifort  2  49507 13:45  1      0.79  25jan05
hawk     opteron 242  1.6  2  g77   2  54027 15:00  1      0.73  25jan05
frodo    opteron 240  1.4  2  g77   2  65269 18:08  1      0.60  10jan05

---- 400 discs on 5 workers ---- 
ACF    I Intel Xeon64 2.6  4 ifort  5   2464  0:41  0.963  7.40  30sep17 Infiniband
darter   Cray XC30    2.6  2  Cray  5   2662  0:44  1      6.85  12sep15 Cray Aries router
darter   Cray XC30    2.6  2 gfort  5   3598  1:00  1      5.07  12sep15 Cray Aries router
newton I Intel Xeon64 3.2  4 ifort  5   5865  1:38  1.023  3.11   3apr07 Infiniband
zeus2    opteron2376  2.6  2 ifort  5   6841  1:54  0.635  2.67   8feb10
grig   M Intel Xeon   3.2  4  g77   5   9727  2:42  0.969  1.88   3dec05
newton   Intel Xeon64 3.2  4 ifort  5  10123  2:49  0.625  1.80   7may06
zeus     opteron 252  2.6  2  g77   5  10477  2:55  0.819  1.74  14apr06
zeus     opteron 252  2.6  2  pgi   5  10502  2:55  0.819  1.74  19apr06
erau   I Intel Xeon64 3.2  4  g77   5  10705  2:58  0.93   1.70  26jun07 Myrinet
newton I Intel Xeon64 3.2  4  g77   5  10799  3:00  0.978  1.69   3apr07 Infiniband
hawk     opteron 242  1.6  2  pf90  5  12311  3:25  0.948  1.48  25jan05
oic      Intel Xeon   3.4  4 ifort  5  12332  3:26  0.63   1.48   1may06
frodo  M opteron 240  1.4  2  g77   5  13069  3:37  0.977  1.40   3dec05
cheetah  IBM Power4   1.3  4  xlf   5  13233  3:41  0.962  1.38  26jul02
oic      Intel Xeon   3.4  4  g77   5  14466  4:01  0.756  1.26   1may06
colt     AlphaSCev67  0.7  2  f95   5  18239  5:04  0.860   1     4jul02
hawk     opteron 242  1.6  2 ifort  5  20070  5:34  0.987  0.91  25jan05
hawk     opteron 242  1.6  2  g77   5  21828  6:04  0.990  0.84  25jan05
frodo    opteron 240  1.4  2  g77   5  31452  8:44  0.830  0.58  11jan05
eagle    IBM SP3      0.4  4  xlf   5  35830  9:57   ---   0.51   7jul02

---- 400 discs on 10 workers ---- 
ACF    I Intel Xeon64 2.6  4 ifort 10   1326  0:22  0.895  8.10  30sep17 Infiniband
darter   Cray XC30    2.6  2  Cray 10   1467  0:24  0.907  7.31  12sep15 Cray Aries router
darter   Cray XC30    2.6  2 gfort 10   1902  0:32  0.946  5.64  12sep15 Cray Aries router
newton I Intel Xeon64 3.2  4 ifort 10   4185  1:10  0.717  2.56   3apr07 Infiniband
grig   M Intel Xeon   3.2  4  g77  10   5438  1:30  0.866  1.97   3dec05
erau   I Intel Xeon64 3.2  4  g77  10   6241  1:44  0.794  1.72  26jun07 Myrinet
hawk     opteron 242  1.6  2  pf90 10   6409  1:47  0.910  1.67  25jan05
newton I Intel Xeon64 3.2  4  g77  10   6719  1:52  0.786  1.60   3apr07 Infiniband
frodo  M opteron 240  1.4  2  g77  10   6940  1:55  0.920  1.55   3dec05
cheetah  IBM Power4   1.3  4  xlf  10   7161  1:59  0.889  1.50  25jul02
zeus     opteron 252  2.6  2  pgi  10   8135  2:15  0.539  1.32  20apr06
zeus     opteron 252  2.6  2  g77  10   8259  2:17  0.520  1.30  15apr06
newton   Intel Xeon64 3.2  4 ifort 10   9285  2:35  0.340  1.16   6may06
zeus2    opteron2376  2.6  2 ifort 10   9649  2:41  0.225  1.11   8feb10
colt     AlphaSCev67  0.7  2  f95  10  10729  2:59  0.731   1     4jul02
oic      Intel Xeon   3.4  4 ifort 10  10864  3:01  0.358  0.99   1may06
hawk     opteron 242  1.6  2 ifort 10  11037  3:04  0.897  0.97  25jan05
hawk     opteron 242  1.6  2  g77  10  11326  3:09  0.954  0.95  25jan05
oic      Intel Xeon   3.4  4  g77  10  14810  4:07  0.369  0.72   1may06
frodo    opteron 240  1.4  2  g77  10  20978  5:49  0.622  0.51  12jan05

---- 400 discs on 20 workers ---- 
ACF    I Intel Xeon64 2.6  4 ifort 20    682  0:11  0.870 10.16   4oct17 Infiniband
darter   Cray XC30    2.6  2  Cray 20   1008  0:17  0.660  6.875 12sep15 Cray Aries router
darter   Cray XC30    2.6  2 gfort 20   1158  0:19  0.777  5.98  12sep15 Cray Aries router
grig   M Intel Xeon   3.2  4  g77  20   3303  0:55  0.713  2.10   3dec05
newton I Intel Xeon64 3.2  4  g77  20   3460  0:58  0.763  2.00   1jun07 Infiniband
frodo  M opteron 240  1.4  2  g77  20   3897  1:04  0.819  1.78   3dec05
cheetah  IBM Power4   1.3  4  xlf  20   4131  1:09  0.771  1.68  25jul02
erau   I Intel Xeon64 3.2  4  g77  20   4267  1:11  0.581  1.62  26jun07 Myrinet
hawk     opteron 242  1.6  2  pf90 20   4289  1:11  0.680  1.62  25jan05
hawk     opteron 242  1.6  2 ifort 20   6647  1:51  0.745  1.04  25jan05
hawk     opteron 242  1.6  2  g77  20   6799  1:53  0.795  1.02  25jan05
colt     AlphaSCev67  0.7  2  f95  20   6930  1:55  0.566   1     6jul02
zeus     opteron 252  2.6  2  g77  20   7022  1:57  0.306  0.98  25jul07
zeus     opteron 252  2.6  2  pgi  20   7280  2:01  0.301  0.95  25jul07
newton   Intel Xeon64 3.2  4 ifort 20   9155  2:33  0.173  0.76   7may06
zeus2    opteron2376  2.6  2 ifort 20  10593  2:57  0.103  0.65   8feb10
oic      Intel Xeon   3.4  4 ifort 20  12327  3:25  0.158  0.56   1may06
oic      Intel Xeon   3.4  4  g77  20  13564  3:46  0.202  0.51   1may06 
frodo    opteron 240  1.4  2  g77  20  16564  4:36  0.394  0.39   9jan05

---- 400 discs on 40 workers ---- 
darter   Cray XC30    2.6  2 gfort 40    730  0:12  0.616  7.12  12sep15 Cray Aries router
darter   Cray XC30    2.6  2  Cray 40   1008  0:17  0.660  6.875 12sep15 Cray Aries router
grig   M Intel Xeon   3.2  4  g77  40   2336  0:39  0.504  2.23   3dec05
frodo  M opteron 240  1.4  2  g77  40   2576  0:43  0.619  2.02   3dec05
hawk     opteron 242  1.6  2  pf90 40   3149  0:52  0.463  1.65  25jan05
newton I Intel Xeon64 3.2  4  g77  40   3494  0:58  0.378  1.49   1jun07 Infiniband
hawk     opteron 242  1.6  2 ifort 40   4080  1:08  0.607  1.27  25jan05
hawk     opteron 242  1.6  2  g77  40   4145  1:09  0.652  1.25  25jan05
erau   I Intel Xeon64 3.2  4  g77  40   4183  1:10  0.296  1.24  26jun07 Myrinet
colt     AlphaSCev67  0.7  2  f95  40   5201  1:26  0.377   1     8jul02
oic      Intel Xeon   3.4  4 ifort 40  13420  3:44  0.072  0.39   1may06
oic      Intel Xeon   3.4  4  g77  40  13922  3:52  0.098  0.37   1may06
frodo    opteron 240  1.4  2  g77  40  15257  4:14  0.214  0.34   9jan05
zeus*    opteron 252  2.6  2  g77  40  22683  6:18  0.047  0.23   5sep06
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
  I: with Infiniband, Platform LSF_HPC, Ganglia scheduler
  M: with Myrinet MX, openPBS, Maui scheduler 
  *: zeus had only 9 dual compute nodes, so only 18 CPUs, at the time of this run 


 800 discs +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
                         mem                 wall  effici  colt/  run    interconnect
machine      model    GHz GB  cmp  Np CPUsec hh:mm   ency machine   date
-------  ------------ --- -- ----- -- ------ -----  ----- ------ -------
---- 800 discs on 2 workers ----
ACF    I Intel Xeon64 2.6  4 ifort  2  11685  3:15  1      ----   5oct17 Infiniband
zeus2    opteron2376  2.6  2 ifort  2  21418  5:57  1      ----   8feb10
newton I Intel Xeon64 3.2  4 ifort  2  43599 12:07  1      ----   3apr07 Infiniband
zeus     opteron 252  2.6  2  g77   2  46879 13:01  1      ----  25mar07
zeus     opteron 252  2.6  2  pgi   2  52047 14:27  1      ----  26jul07
grig   M Intel Xeon   3.2  4  g77   2  48008 13:20  1      ----  25mar07
erau   I Intel Xeon64 3.2  4  g77   2  51813 14:24  1      ----  26jun07 Myrinet
newton I Intel Xeon64 3.2  4  g77   2  61421 17:04  1      ----   3apr07 Infiniband

---- 800 discs on 5 workers ----
ACF    I Intel Xeon64 2.6  4 ifort  5   6035  1:40  0.774  ----   5oct17 Infiniband
darter   Cray XC30    2.6  2  Cray  5   5081  1:25  1      ----  12sep15 Cray Aries router
zeus2    opteron2376  2.6  2 ifort  5  11035  3:04  0.776  ----   8feb10
newton I Intel Xeon64 3.2  4 ifort  5  12856  3:34  1.356  ----   3apr07 Infiniband
zeus     opteron 252  2.6  2  g77   5  16650  4:37  1.262  ----  26mar07
zeus     opteron 252  2.6  2  pgi   5  18570  5:10  1.121  ----  26jul07
newton I Intel Xeon64 3.2  4  g77   5  20169  5:36  1.218  ----   3apr07 Infiniband
erau   I Intel Xeon64 3.2  4  g77   5  20771  5:46  0.998  ----  26jun07 Myrinet
grig   M Intel Xeon   3.2  4  g77   5  35287  9:48  0.544  ----  25mar07

---- 800 discs on 10 workers ----
ACF    I Intel Xeon64 2.6  4 ifort 10   2513  0:42  0.930  ----   5oct17 Infiniband
darter   Cray XC30    2.6  2  Cray 10   2704  0:45  0.940  ----  12sep15 Cray Aries router
newton I Intel Xeon64 3.2  4 ifort 10   5714  1:35  1.526  ----   3apr07 Infiniband
newton I Intel Xeon64 3.2  4  g77  10  10670  2:58  1.151  ----   3apr07 Infiniband
zeus     opteron 252  2.6  2  g77  10  10744  2:59  0.872  ----  25mar07
zeus     opteron 252  2.6  2  pgi  10  11813  3:17  0.881  ----  26jul07
zeus2    opteron2376  2.6  2 ifort 10  11836  3:17  0.362  ----   8feb10
erau   I Intel Xeon64 3.2  4  g77  10  11066  3:04  0.936  ----  26jun07 Myrinet
grig   M Intel Xeon   3.2  4  g77  10  29348  8:09  0.327  ----  25mar07

---- 800 discs on 20 workers ----
ACF    I Intel Xeon64 2.6  4 ifort 20   1213  0:20  0.963  ----   5oct17 Infiniband
darter   Cray XC30    2.6  2  Cray 20   1560  0:27  0.814  ----  12sep15 Cray Aries router
newton I Intel Xeon64 3.2  4  g77  20   4735  1:19  1.297  ----   2jun07 Infiniband
erau   I Intel Xeon64 3.2  4  g77  20   6808  1:53  0.761  ----  26jun07 Myrinet
zeus     opteron 252  2.6  2  g77  20   8576  2:23  0.547  ----  25jul07
zeus     opteron 252  2.6  2  pgi  20   9158  2:33  0.568  ----  25jul07
grig   M Intel Xeon   3.2  4  g77  20   9136  2:32  0.525  ----  26mar07
zeus2    opteron2376  2.6  2 ifort 20  12192  3:23  0.176  ----   8feb10

---- 800 discs on 40 workers ----
darter   Cray XC30    2.6  2  Cray 40    978  0:16  0.649  ----  12sep15 Cray Aries router
newton I Intel Xeon64 3.2  4  g77  40   4025  1:07  0.763  ----   1jun07 Infiniband
erau   I Intel Xeon64 3.2  4  g77  40   4111  1:09  0.630  ----  26jun07 Myrinet

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
  I: with Infiniband, Platform LSF_HPC, Ganglia scheduler
  M: with Myrinet MX, openPBS, Maui scheduler 
  *: zeus had only 9 dual compute nodes, so only 18 CPUs, at the time of this run 
efficiency = [(timing on 2 procs)*2] / [(timing on Np procs)*Np]   on the same machine
                    = speedup on this machine per processor.
colt/machine = (timing on colt) / (timing on machine), both with Np procs
                    = speedup over colt with same number of processors.

If you would be willing to run the benchmark on another machine,
please email me at alexiades(at)utk.edu

Details

colt.ccs.ornl.gov     64 nodes (256 cpus)    decommissioned in 2004
falcon.ccs.ornl.gov  256 nodes (1024 cpus)   decommissioned in 2004
Compaq AlphaServer SC, 4 SMP CPUs per node, 2GB RAM
CPU: ES40 processor: 21264a (ev67), 667 MHz, 64KB I-cache, 64KB D-cache, 8MB L2 cache
uname -a: OSF1 colt0 V5.1 732 alpha
f90 Compaq Fortran Compiler X5.4A-1684-46B5P
f90 -fast -O5 -tune ev67 -lfmpi -lmpi -lelan

eagle.ccs.ornl.gov       decommissioned in 2004
IBM SP 184 4-way Winterhawk II nodes, 4 SMP CPUs per node, 2GB RAM
CPU: Power3-II processor, 375 MHz, 8MB L2 cache
uname -a: AIX eagle164s 3 4 000105944C00
XL Fortran Compiler 7.1.1.2
mpixlf -O4 -qnoipa

cheetah.ccs.ornl.gov       decommissioned in 2006?
IBM pSeries System (p690)
27 "Regatta" nodes, each with 32 processors on 16 chips
CPU: 1.3 GHz Power4 processor, 64 KB L1 cache, 32 KB D-cache, 1.5 MB L2 cache
estimated computational power 4.5 TeraFLOP/s
uname -a: AIX cheetah0033 1 5 00207D8A4C00
xlf version 7.1.1.3
mpixlf_r -g -O4 -qnoipa
run from GPFS area
------- LoadLeveler script called by make: -----------
#@ job_type = parallel
#@ network.MPI = csss,shared,US
#@ blocking = unlimited
#@ total_tasks = 11
setenv MP_SHARED_MEMORY yes
poe mpiret.x < dat > out-N400.ch11
with mpxlf (no _r): Ndisc=200:   5   6811  1:53  25jul02 no_r
with mpxlf (no _r): Ndisc=200:  10   3831  1:03  24jul02 no_r
with mpxlf (no _r): Ndisc=400:  10   7840  2:11  25jul02 no_r

knox.rgrid.utk.edu       decommissioned in 2005
32 nodes: Sun UltraSparc 900MHz 1MB
uname -a: SunOS knox1 5.9 Generic_112233-11 sun4u sparc SUNW,Sun-Fire-280R
f95 -V: Forte Developer 7 Fortran 95 7.0 2002/03/09
f77 -O4 or f95 -fast -O4
mpich: mpif77 -O4
Ndisc Np  CPUs    hr:min     date
----- --  -----  --------  -------
 200   2  51792  14:23:11  11jan05  used 81%cpu
 200   5  34461   9:34:21  10jan05  used 90%cpu
 200  10  31137   8:38:56   9jan05
 200  20  39130  10:52:09  12jan05  used 23%cpu

fatou.math.utk.edu   cpu=3
fubini.math.utk.edu  cpu=3
dual Intel Xeon 3GHz
 200   3  27741  7:42:20  on fatou1, fubini2   10jan05  93%cpu
 200   5  12512  3:28:31  on fatou3, fubini3   10jan05
 200  11  18384  5:06:24  on fa3, fu,tu3,ag2   11jan05

grig.sinrg.cs.utk.edu   64 node Xeon linux cluster   jan05
dual Intel Xeon 3.2GHz 1024KB cache 4GB mem
with Myrinet MX, openPBS, Maui scheduler
uname -a: Linux grig-head 2.6.10 #8 SMP 86_64
gcc -v: gcc version 3.4.3
mpif77 -O3 -Wno-globals
same PBSscript as for frodo
      Ndisc Np nodes  CPUs  hr:mm:ss   date
      ----- -- -----  ----- --------  -------
grig   200   2    2   11706  3:15:06   3dec05
grig   200   5    3    5183  1:26:23   3dec05
grig   200  10    6    3098  0:51:38   3dec05
grig   200  20   11    2109  0:35:09   3dec05
grig   200  40   21    1646  0:27:25   3dec05
      Ndisc Np nodes  CPUs  hr:mm:ss   date
      ----- -- -----  ----- --------  -------
grig   400   2    2   23555  6:32:34   4dec05
grig   400   5    3    9727  2:42:07   3dec05
grig   400  10    6    5438  1:30:38   3dec05
grig   400  20   11    3303  0:55:03   3dec05
grig   400  40   21    2336  0:38:56   3dec05
      Ndisc Np nodes  CPUs  hr:mm:ss   date
      ----- -- -----  ----- --------  -------
grig   800   2    2   48008 13:20:10  25mar07
grig   800   5    3   35287  9:48:06  26mar07
grig   800  10    6   29348  8:09:08  25mar07
grig   800  20   11    9136  2:32:22  26mar07
grig   800  40   21   too slow, exceeded walltime 79205

frodo.sinrg.cs.utk.edu   60 node Opteron linux cluster   NEW version dec05
dual AMD Opteron 240 1.4GHz 1024KB cache 2GB mem
with Myrinet MX, openPBS, Maui scheduler
uname -a: Linux frodo-head 2.6.13.4 #1 SMP 86_64
gcc -v: gcc version 3.4.3
mpif77 -O3 -Wno-globals
same PBSscript as below
       Ndisc Np nodes  CPUs  hr:mm:ss   date
       ----- -- -----  ----- --------  -------
frodo   200   3    2   15712  4:21:52   3dec05
frodo   200   5    3    6686  1:51:25   3dec05
frodo   200  10    6    3727  1:02:07   3dec05
frodo   200  20   11    2342  0:39:02   3dec05
frodo   200  40   21
       Ndisc Np nodes  CPUs  hr:mm:ss   date
       ----- -- -----  ----- --------  -------
frodo   400   3    2   31915  8:51:55   4dec05
frodo   400   5    3   13069  3:37:48   2dec05
frodo   400  10    6    6940  1:55:40   3dec05
frodo   400  20   11    3897  1:04:56   3dec05
frodo   400  40   21    2576  0:42:55   2dec05

frodo.sinrg.cs.utk.edu   64 node Opteron linux cluster   OLD jan05
dual AMD Opteron 240 1.4GHz 1024KB cache 2GB mem
with Gigabit, openPBS
uname -a: Linux head 2.4.19-NUMA #1 SMP x86_64
gcc -v: gcc version 3.2.2 (SuSE Linux)
mpif77 -O3 -Wno-globals
------- sample PBS script called by make: -----------
#PBS -l nodes=6:ppn=2
#PBS -N retbnch11
#PBS -l walltime=10:00:00
#PBS -m ae
#PBS -j oe
#PBS -k oe
cd /data/vasili/retbnch-mpi/N200fr11
mpirun -np 11 -machinefile $PBS_NODEFILE mpiret.x < dat > OUT
(std output turned off:dtout>tmax, else dies at tout)
Ndisc Np nodes  CPUs  hr:min:s cpu%   date     job
----- -- -----  ----- -------- ----  -------
                                               on node4,5
 200   2    2   34245  9:30:45  90%  10jan05   119.010u 269.410s 9:30:51.44 1.1%
                                               on node3,4,5
 200   5    3   19188  5:19:48  70%  11jan05   173.560u 551.260s 5:19:49.86 3.7%
 200  10    6   14729  4:05:28        9jan05   225.070u 977.120s 4:05:31.66 8.1%
 200  10    3   24723  6:52:03  much slower than on 6nodes
                                               332.670u 1144.940s 6:52:06.05 5.9%
 200  20   11   13529  3:45:29        9jan05   313.620u 1880.94s 3:45:34.11 16.2%
 200  40   21   13844  3:50:44       11jan05   426.250u 3272.60s 3:51:06.23 26.6%
Ndisc Np nodes  CPUs  hr:min:s cpu%   date     job
----- -- -----  ----- -------- ----  -------
                                               on node1,2
 400   2    2   65269 18:07:49  96%  10jan05   126.380u 466.640s 18:07:50.37 0.9%
                                               on node6,7,8
 400   5    3   31452  8:44:11  81%  11jan05   179.550u 538.920s 8:44:13.77 2.2%
                                               on 33-38
 400  10    6   20978  5:49:37  67%  12jan05   235.040u 974.600s 5:49:46.56 5.7%
 400  20   11   16564  4:36:03        9jan05   306.230u 1884.040s 4:36:08.92 13.2%
 400  40   21   15257  4:14:17        9jan05   433.450u 3413.740s 4:14:26.08 25.2%

hawk.csm.ornl.gov   50 node linux cluster     decommissioned in 2006?
dual AMD Opteron 242 1.6GHz 1024KB cache 2GB mem
uname -a: Linux hawk1 2.6.7-4.23qsnet #5 SMP x86_64 GNU/Linux
gcc -v: gcc version 3.3.3 (SuSE Linux)
mpif77 -O3 -fPIC -fno-automatic -finit-local-zero -Wno-globals
ifort -v: Version 8.1
ifort -O3 -fpic -save -w95 -FI
pathscale EKO Version 1.4 gcc version 3.3.1 (PathScale 1.4 driver)
pathf90 -Ofast -fpic -static-data -msse2 -Wno-globals -fno-second-underscore
compiled with -L/usr/lib/mpi/mpi_gnu/lib -lfrtbegin -lg2c -lmpifarg -lmpi -lelan -lfrtbegin -lgcc -lc
execute: prun -np 11 mpiret.x < dat > OUT
(std output turned off:dtout>tmax, else dies at tout)
            Ndisc Np nodes  CPUs  hh:mm:ss   date    hawk nodes
            ----- -- -----  ----- --------  -------
ifort        100  10    6    3173  0:52:52  25jan05
ifort        200   2    3   24509  6:48:29  25jan05  [30-32]
ifort        200   5    6   10884  3:01:23  25jan05  [7-12]
ifort        200  10   11    5583  1:33:02  25jan05
ifort        200  20   11  ?18066 ? 5:01:05 25jan05  <-- should rerun
ifort        200  ,,   12    3943  1:05:43  29jan05  [12,29,36-44]
ifort        200  40   21    2865  0:47:44  30jan05  [2-15,36-42]
g77(mpif77)  200   2    2   26508  7:21:48  27jan05  [15-16]
g77(mpif77)  200   5    3   12542  3:29:02  29jan05  [2-4]
g77(mpif77)  200  10   11    6096  1:41:35  29jan05  [13-23]
g77(mpif77)  200  20   11    4188  1:09:47  29jan05  [15,29,36-44]
g77(mpif77)  200  40   21    2859  0:47:39  25jan05
pathf90      200   2    2   14333  3:58:52  27jan05  [30-32]
pathf90      200   5    3    7372  2:02:52  27jan05  [2-4]
pathf90      200  ,,    6    6351  1:45:50  26jan05  [30-35]
pathf90      200  10    6    4457  1:14:17  27jan05  [2-6,13]
pathf90      200  20   12    2849  0:47:29  27jan05  [33-44]
pathf90      200  40   26    2396  0:39:56  27jan05  [2-14,29,33-44]
pgi: pgf90   200  20   11                    3dec05  [1-5,7-12]
ifort        400   2    2   49507 13:45:06  29jan05  [15-16]
ifort        400   5    6   20070  5:34:29  28jan05  [30-35]
ifort        400  10    9   11037  3:03:56  28jan05  [5-6,22-28]
ifort        400  20   11    6647  1:50:47  28jan05  [2-12]
ifort        400  40   27    4080  1:07:59  28jan05  [2-14,16-29]
g77(mpif77)  400   2    2   54027 15:00:27  25jan05  [33-35]
g77(mpif77)  400   5    6   21866  6:04:26  27jan05  [16-21]
g77(mpif77)  400  ,,    6   21828  6:03:48  29jan05  [16-21]
g77(mpif77)  400  10    6   12588  3:29:48  25jan05  [2-6,28]
g77(mpif77)  400  ,,   11   11326  3:08:45  27jan05  [16-26]
g77(mpif77)  400  20   11    6799  1:53:19  25jan05  [27,29,36-44]
g77(mpif77)  400  ,,   11    6760  1:52:39  29jan05  [15,29,36-44]
g77(mpif77)  400  ,,   12    6759  1:52:39  27jan05  [33-44]
g77(mpif77)  400  40   21    4145  1:09:05  25jan05
pathf90      400   2    2   29173  8:06:13  26jan05  [15-16]
pathf90      400   5    6   12311  3:25:10  29jan05  [7-12]
pathf90      400  10   11    6409  1:46:49  26jan05  [16-26]
pathf90      400  20   11    4289  1:11:29  26jan05  [16,29,36-44]
pathf90      400  40   27    3149  0:52:28  26jan05

zeus.math.utk.edu   9+headnode Opteron 252 linux cluster
dual AMD Opteron 252 2.6GHz 1024KB cache 2GB mem
uname -a: Linux 2.6.12-1.1381_FC3smp x86_64 GNU/Linux
gigabit interconnect, PBS(torque) scheduler, MPICH
g77 -v: gcc version 3.4.4 20050721 (Red Hat 3.4.4-2)
compiled with: /opt/mpich/p4-gnu/bin/mpif77 -O3 -fPIC -fno-automatic -finit-local-zero -Wno-globals
executed with: /opt/mpich/p4-gnu/bin/mpirun -nolocal -np 11 -machinefile $PBS_NODEFILE mpiret.x < dat > OUT
              Ndisc Np  nodes  CPUs  hh:mm:ss   date    zeus nodes
              ----- --  -----  ----- --------  -------
g77(mpif77)    200   2     2   10907  3:01:46  18dec07
g77(mpif77)    200   2     2   11197  3:06:37  14apr06  7,8
g77(mpif77)    200   5     3    6927  1:55:27  15apr06  5-7
g77(mpif77)    200  10     6    6910  1:55:09  14apr06  1-6
g77(mpif77)    200  20*    9   11699  3:14:58  15apr06  1-9
g77(mpif77)    200  20    11    6141  1:42:20  24jul07  1-11
g77(mpif77)    200  40*    9   23374  6:29:33  16apr06  1-9
g77(mpif77)    400   2     2   21455  5:57:34  14apr06  1,9
g77(mpif77)    400   5     3   10477  2:54:36  15apr06  3-5
g77(mpif77)    400  10     6    8259  2:17:39  14apr06  1-6
g77(mpif77)    400  20*    9   12723  3:32:03  15apr06  1-9
g77(mpif77)    400  20    11    7022  1:57:02  25jul07  1-11
g77(mpif77)    400  40*    9   22683  6:18:02   5sep06  1-9
g77(mpif77)    800   2     2   46879 13:01:18  25mar07  2,8
g77(mpif77)    800   5     3   16650  4:37:29  25mar07
g77(mpif77)    800  10     6   10744  2:59:04  25mar07
g77(mpif77)    800  20    11    8576  2:22:55  25jul07  1-11
pgf90(mpif90)  200   2     2   12132  3:22:12  17apr06  run by Ben
pgf90(mpif90)  200   2     2   11176  3:06:15  18apr06  <  7,3
pgf90(mpif90)  200   5     3    6950  1:55:49  17apr06
pgf90(mpif90)  200   5     3    6912  1:55:12  19apr06  <
pgf90(mpif90)  200  10     6    6273  1:44:33  18apr06  <
pgf90(mpif90)  200  10     6    6347  1:45:47   8may06
pgf90(mpif90)  200  20*    9   extremely slow, killed it   8may06
pgf90(mpif90)  200  20    11    6293  1:44:52  25jul07  slower than g77
pgf90(mpif90)  400   2     2   21908  6:05:08  20apr06
pgf90(mpif90)  400   5     3   10502  2:55:01  19apr06
pgf90(mpif90)  400  10     6    8135  2:15:34  20apr06
pgf90(mpif90)  400  20*    9   extremely slow, killed it   9may06
pgf90(mpif90)  400  20    11    7280  2:01:20  25jul07  slower than g77
pgf90(mpif90)  800   2     2   52047 14:27:27  26jul07  slower than g77
pgf90(mpif90)  800   5     3   18570  5:09:30  26jul07  slower than g77
pgf90(mpif90)  800  10     6   11813  3:16:52  26jul07  slower than g77
pgf90(mpif90)  800  20    11    9158  2:32:37  25jul07  slower than g77
pgf90(mpif90)  800  26    13    8867  2:27:47  26jul07  slower than g77
  * zeus had only 9 nodes, so only 18 cpus

oic.ornl.gov   325 node Xeon linux cluster
dual Intel Xeon 3.4GHz 2048KB cache 4GB mem
uname -a: Linux b06l02 2.6.9-22.0.2.ELsmp #1 SMP x86_64 GNU/Linux
? gigabit interconnect, PBS(torque) scheduler, Maui mgr, MPICH
ifort -V: Intel(R) Fortran Compiler for Intel(R) EM64T-based applications, Version 9.0 Build 20051201
compiled with: /opt/mpich-ch_p4-icc-1.2.7/bin/mpif90 -O3 -fpic -save -w95 -FI
executed with(in PBSscript): mpiexec -n 11 ./mpiret.x < dat > OUT
              Ndisc Np nodes  CPUs  hh:mm:ss   date
              ----- -- -----  ----- --------  -------
ifort(mpif90)  200   3    2   10925  3:02:05   1may06
ifort(mpif90)  200   5    3    8506  2:21:45   1may06
ifort(mpif90)  200  10    6    9387  2:36:26   1may06
ifort(mpif90)  200  20   11   10955  3:02:35   1may06
ifort(mpif90)  200  40   21   12815  3:33:35   1may06
ifort(mpif90)  400   3    2   19438  5:23:57   1may06
ifort(mpif90)  400   5    3   12332  3:25:31   1may06
ifort(mpif90)  400  10    6   10864  3:01:04   1may06
ifort(mpif90)  400  20   11   12327  3:25:26   1may06
ifort(mpif90)  400  40   21   13420  3:43:40   1may06
g77 -v: gcc version 3.4.4 20050721 (Red Hat 3.4.4-2)
compiled with: /opt/mpich-ch_p4-gcc-1.2.7/bin/mpif77 -O3 -finit-local-zero -Wno-globals
executed with: /opt/mpich/p4-gnu/bin/mpirun -nolocal -np 11 -machinefile $PBS_NODEFILE mpiret.x < dat > OUT
              Ndisc Np nodes  CPUs  hh:mm:ss   date
              ----- -- -----  ----- --------  -------
g77 (mpif77)   200   3    2   15150  4:12:30   1may06
g77 (mpif77)   200   5    3   10042  2:47:22   1may06
g77 (mpif77)   200  10    6    9894  2:44:54   1may06
g77 (mpif77)   200  20   11   11489  3:11:28   1may06
g77 (mpif77)   200  40   21   12622  3:30:21   1may06
g77 (mpif77)   400   3    2   27352  7:35:51   1may06
g77 (mpif77)   400   5    2   14466  4:01:06   1may06
g77 (mpif77)   400  10    6   14810  4:06:50   1may06
g77 (mpif77)   400  20   11   13564  3:46:04   1may06
g77 (mpif77)   400  40   21   13922  3:52:01   1may06

newton.usg.utk.edu   (head of 36-node linux cluster)
  32 compute nodes: dual Xeon_64 3.2GHz
  uname -a: Linux 2.6.9-11.ELsmp #1 SMP x86_64 x86_64 GNU/Linux

  ifort in /opt/intel/fce/9.0/bin/ifort: Intel(R) Fortran Compiler for Intel(R) EM64T-based v 9.0 Build 20050809
  compiled with /opt/mpich/intel/bin/mpif90 -fast -save -w95 -FI
  ran with /opt/mpich/intel/bin/mpirun -np ...

may06 runs
                Ndisc  Np  nodes   CPUs  hh:mm:ss     date
                -----  --  -----  -----  --------  -------
ifort(mpif90)     200   2      2  10133   2:48:52   2may06  faster than oic
ifort(mpif90)     200   5      3   8178   2:16:18   2may06
ifort(mpif90)     200  10      6   7458   2:04:18   6may06  poor on high Np
ifort(mpif90)     200  20     11   8062   2:14:22   7may06
ifort(mpif90)     400   2      2  15806   4:23:26   7may06
ifort(mpif90)     400   6      3  10123   2:48:42   7may06
ifort(mpif90)     400  10      6   9285   2:34:45   6may06
ifort(mpif90)     400  20     11   9155   2:32:34   7may06

  Platform LSF 6.2 on infiniband
  compiled with /opt/intel/fce/9.0/bin/ifort
  ran with /usr/local/topspin/mpi/mpich/bin/mpirun.lsf

on infiniband: apr07 runs
                Ndisc  Np  nodes   CPUs  hh:mm:ss     date  queue
                -----  --  -----  -----  --------  -------  --------
ifort(mpif90)     200   2      2   6629   1:50:29   3apr07  Back08
ifort(mpif90)     200   5      3   3120   0:52:00   3apr07  QuadQuad
ifort(mpif90)     200  10      6   2050   0:34:10   3apr07  Back08
ifort(mpif90)     400   2      2  15000   4:10:00   3apr07  QuadQuad
ifort(mpif90)     400   5      3   5865   1:37:45   3apr07  QuadQuad
ifort(mpif90)     400  10      6   4185   1:09:45   3apr07  QuadQuad
ifort(mpif90)     800   2      2  43599  12:06:39   3apr07  QuadQuad
ifort(mpif90)     800   5      3  12856   3:34:16   3apr07  QuadQuad
ifort(mpif90)     800  10      6   5714   1:35:14   3apr07  Back08

  g77 -v: gcc version 3.4.6 20060404 (Red Hat 3.4.6-3)
  g77 -O3 -finit-local-zero -Wno-globals   2apr07
  compiled with: /usr/local/topspin/mpi/mpich/bin/mpif77
  ran with /usr/local/topspin/mpi/mpich/bin/mpirun.lsf

on infiniband: apr,jun07 runs
                Ndisc  Np  nodes   CPUs  hh:mm:ss     date  queue
                -----  --  -----  -----  --------  -------  --------
g77(mpif77)       200   2      2  12611   3:30:10   2apr07  QuadQuad
g77(mpif77)       200   5      3   5600   1:33:20   2apr07  Back04
g77(mpif77)       200   5      3   5809   1:36:48   2apr07  QuadQuad
g77(mpif77)       200  10      6   4294   1:11:34   2apr07  QuadQuad
g77(mpif77)       200  10      6   3396   0:56:35   2jun07  16CoreQueue
g77(mpif77)       200  20     11   2776   0:46:15   1jun07  64CoreQueue
g77(mpif77)       200  40     21   3670   1:01:10   1jun07  64CoreQueue
g77(mpif77)       400   2      2  26406   7:20:06   3apr07  QuadQuad
g77(mpif77)       400   5      3  10799   2:59:59   3apr07  QuadQuad
g77(mpif77)       400  10      6   6719   1:51:59   3apr07  QuadQuad
g77(mpif77)       400  20     11   3460   0:57:40   1jun07  64CoreQueue
g77(mpif77)       400  40     21   3494   0:58:13   1jun07  64CoreQueue
g77(mpif77)       800   2      2  61421  17:03:41   3apr07  QuadQuad
g77(mpif77)       800   5      3  20169   5:36:09   3apr07  Back04
g77(mpif77)       800  10      6  10670   2:57:49   3apr07  Back08
g77(mpif77)       800  20     11   4735   1:18:54   2jun07  64CoreQueue
g77(mpif77)       800  40     21   4025   1:07:04   1jun07  64CoreQueue

zeus.db.erau.edu   (head of 128-node linux cluster)
  128 compute nodes: dual Xeon 3.2GHz 1024 KB cache 4GB mem
  with Myrinet, Lava scheduler, GNU Linux
  uname -a: Linux 2.6.9-11.ELsmp #1 SMP x86_64 x86_64 x86_64 GNU/Linux

  g77 -v: gcc version 3.4.3 20050227 (Red Hat 3.4.3-22.1)
  g77 -O3 -finit-local-zero -Wno-globals   jun07

                Ndisc  Np  nodes   CPUs  hh:mm:ss     date
                -----  --  -----  -----  --------  -------
gnu(mpif77)       200   1      1  24260   6:44:19  27jun07
gnu(mpif77)       200   2      2  12553   3:29:13  26mar07
gnu(mpif77)       200   5      3   5812   1:36:51  27jun07
gnu(mpif77)       200  10      6   3866   1:04:25  27jun07
gnu(mpif77)       200  20     11   2945   0:49:04  27jun07
gnu(mpif77)       200  40     21   2767   0:46:07  27jun07
gnu(mpif77)       400   2      2  24782   6:53:02  27jun07
gnu(mpif77)       400   5      3  10705   2:58:25  27jun07
gnu(mpif77)       400  10      6   6241   1:44:00  27jun07
gnu(mpif77)       400  20     11   4267   1:11:07  27jun07
gnu(mpif77)       400  40     21   4183   1:09:42  27jun07
gnu(mpif77)       800   2      2  51813  14:23:32  28jun07
gnu(mpif77)       800   5      3  20771   5:46:10  28jun07
gnu(mpif77)       800  10      6  11066   3:04:25  27jun07
gnu(mpif77)       800  20     11   6808   1:53:27  27jun07
gnu(mpif77)       800  40     21   4111   1:08:30  27jun07

tiger.ornl.gov   (head of 72-node Cray XD1 linux cluster)
  70 compute nodes: dual Opteron 248, 8GB memory
  Cray RapidArray Interconnect (Hypertransport).
  LSS synchronizes nodes with global clock and co-schedules processes
  to avoid latency in global communication.
  Linux ch328-n6 2.6.5_H_01_04 #39 SMP x86_64 x86_64 GNU/Linux

  pgf95 -V: pgf95 7.0-2 64-bit target on x86-64 Linux
  pgf95 -fast -O3 -fastsse   dec07

dec07 runs
                Ndisc  Np  nodes   CPUs  hh:mm:ss     date
                -----  --  -----  -----  --------  -------
pgf90(mpif90)     200   2      2  11541   3:12:21  18dec07
pgf90(mpif90)     200   5      3   4708   1:18:28  18dec07
pgf90(mpif90)     200  10      6   2813   0:46:53  18dec07
pgf90(mpif90)     200  20     11   1551   0:25:50  18dec07
pgf90(mpif90)     200  40     21   1429   0:23:49  18dec07
pgf90(mpif90)     400   2      2  23344   6:29:03  18dec07
pgf90(mpif90)     400   5      3   9257   2:34:16  18dec07
pgf90(mpif90)     400  10      6   4608   1:16:47  18dec07
pgf90(mpif90)     400  20     11   3031   0:50:30  18dec07
pgf90(mpif90)     400  40     21   1410   0:23:30  18dec07
pgf90(mpif90)     800   2      2  over limit       18dec07
pgf90(mpif90)     800   5      3  18512   5:08:32  18dec07
pgf90(mpif90)     800  10      6   9258   2:34:17  18dec07
pgf90(mpif90)     800  20     11   5009   1:23:28  18dec07
pgf90(mpif90)     800  40     21   2482   0:41:22  18dec07

  g77 -v: gcc version 3.3.3 (SuSE Linux)
  g77 -O3 -Wno-globals -funroll-loops   dec07

dec07 runs
                Ndisc  Np  nodes   CPUs  hh:mm:ss     date
                -----  --  -----  -----  --------  -------
g77(mpif77)       200   2      2  19809   5:30:08  18dec07
g77(mpif77)       200   5      3   7868   2:11:07  19dec07
g77(mpif77)       200  10      6   4041   1:07:20  19dec07
g77(mpif77)       200  20     11   2196   0:36:35  19dec07
g77(mpif77)       200  40     21   1304   0:21:44  19dec07
g77(mpif77)       400   2      2  39108  10:51:48  19dec07
g77(mpif77)       400   5      3  15550   4:19:10  19dec07
g77(mpif77)       400  10      6   7951   2:12:31  19dec07
g77(mpif77)       400  20     11   4192   1:09:52  19dec07
g77(mpif77)       400  40     21   2251   0:37:31  19dec07
g77(mpif77)       800   2      2  wallclock>12
g77(mpif77)       800   5      3  31145   8:39:05  20dec07
g77(mpif77)       800  10      6  mpiexec error
g77(mpif77)       800  20     11   8014   2:13:33  19dec07
g77(mpif77)       800  40     21  mpiexec error

zeus.math.edu   (head of 52-cpu Linux cluster) upgraded aug2009
  head+2 nodes of dual Quad-Core AMD Opteron 2376 2.3GHz 2GB/node
  plus 15 dual Opteron 252 nodes 2GB/node
  uname -a: head.bw01.math.utk.edu 2.6.18-128.2.1.el5 #1 SMP x86_64
  ifort -V: Version 11.1 Build 20090630 ID: l_cprof_p_11.1.046
  compiled with: ifort -fast -O3

feb10 runs
                Ndisc  Np  nodes   CPUs  hh:mm:ss     date
                -----  --  -----  -----  --------  -------
ifort(mpif90)     200   2    hn3   5589   1:33:08   7feb10
ifort(mpif90)     200   5      1   4903   1:21:43   7feb10
ifort(mpif90)     200  10      2   8556   2:22:36   7feb10
ifort(mpif90)     200  20      3   9841   2:44:01   7feb10

feb10 runs
                Ndisc  Np  nodes   CPUs  hh:mm:ss     date
                -----  --  -----  -----  --------  -------
ifort(mpif90)     400   2    hn3  10863   3:01:02   7feb10
ifort(mpif90)     400   5      1   6841   1:54:01   7feb10
ifort(mpif90)     400  10      2   9649   2:40:49   7feb10
ifort(mpif90)     400  20      3  10593   2:56:33   8feb10

feb10 runs
                Ndisc  Np  nodes   CPUs  hh:mm:ss     date
                -----  --  -----  -----  --------  -------
ifort(mpif90)     800   2    hn3  21418   5:56:57   7feb10
ifort(mpif90)     800   5      1  11035   3:03:55   7feb10
ifort(mpif90)     800  10      2  11836   3:17:15   7feb10
ifort(mpif90)     800  20      3  12192   3:23:12   8feb10

midtown.uthsc.edu   (head of 56-cpu Linux cluster) installed nov2009
  7 nodes of dual Quad-Core AMD Opteron 2376 2.3GHz 2GB/node
  uname -a: midtown.bw01.uthsc.edu 2.6.18-164.9.1.el5 #1 SMP x86_64
  ifort -V: Version 11.1 Build 20091130 ID: l_cprof_p_11.1.064
  compiled with: ifort -fast -O3

feb10 runs
                Ndisc  Np  nodes   CPUs  hh:mm:ss     date
                -----  --  -----  -----  --------  -------
ifort(mpif90)     200

darter.nics.tennessee.edu   Cray XC30 (Cascade) supercomputer
  724 compute nodes, each with 16 cores, 32 GB of memory.
  Cores: 2.6 GHz 64bit Intel XEON E5-2600
  Peak performance of 240.9 TF
  Cray Aries router (8GB/sec bandwidth)
  torque/4.2.9, moab/7.2.9 scheduler, PBS

runs with module PrgEnv-cray/5.2.40: crayftn mpich
          Ndisc  nodes  Np   CPUs  hh:mm:ss     date  efficiency
          -----  -----  --  -----  --------  -------  ----------
crayftn     200      1   2   3080   0:51:20   7sep15  1
crayftn     200      1   5   1374   0:22:54   7sep15  0.897
crayftn     200      1  10   1661   0:27:41   7sep15  0.371
crayftn     200      1  20    550   0:09:10   7sep15  0.560
crayftn     200      3  40    486   0:08:05   7sep15  0.317
crayftn     400      1   5   2662   0:44:22  12sep15  1
crayftn     400      1  10   1467   0:24:27  12sep15  0.907
crayftn     400      2  20   1008   0:16:48  12sep15  0.660
crayftn     400      3  40   1008   0:16:48  12sep15  0.330  no speedup?
crayftn     400      1  10   4577   1:16:17   9sep15  strange
crayftn     400      2  20   4577   1:16:17   9sep15  strange
crayftn     800      1   5   5081   1:24:40   8sep15  1
crayftn     800      1  10   2704   0:45:04   7sep15  0.940
crayftn     800      2  20   1560   0:26:37   7sep15  0.814
crayftn     800      3  40    978   0:16:18   8sep15  0.649

runs with module PrgEnv-intel/5.2.40: ifort mpich
          Ndisc  nodes  Np   CPUs  hh:mm:ss     date  efficiency
          -----  -----  --  -----  --------  -------  ----------
ifort       200      1   2  failed            7sep15
ifort       800      1  10  failed           12sep15  at nsteps=6024000 499.9
ifort       800      2  20  failed            7sep15
ifort       800      3  40  failed            7sep15

runs with module PrgEnv-gnu/5.2.40: gfortran mpich
          Ndisc  nodes  Np   CPUs  hh:mm:ss     date  efficiency
          -----  -----  --  -----  --------  -------  ----------
gfortran    200      1  10   1928   0:32:08   7sep15
gfortran    400      1   5   3598   0:59:57  12sep15  1
gfortran    400      1  10   1902   0:31:41   9sep15  0.946
gfortran    400      2  20   1158   0:19:18  10sep15  0.777
gfortran    400      3  40    730   0:12:09  12sep15  0.616

acf.nics.tennessee.edu   (several clusters: beacon, rho, sigma, ...)
  Each node has 16 cores, 64 GB of memory.
  Cores: 3.3 GHz 64bit Intel XEON E5-2670
  Infiniband router (? GB/sec bandwidth)
  torque/?.?.?, moab/9.1.1 scheduler, PBS

runs with module intel-compilers/2017.2.174, PE-intel
  compiled with: mpiifort $(code_f) -c -fPIE -fast
                 mpiifort -pie $(code_o) -o mpiret.x

          Ndisc  nodes  Np   CPUs  hh:mm:ss     date  efficiency
          -----  -----  --  -----  --------  -------  ----------
mpiifort    200      1   2   2581   0:43:00  30sep17
mpiifort    200      1   5   1538   0:25:38  30sep17
mpiifort    200      1  10    613   0:10:13  30sep17
mpiifort    200      1  20    541   0:09:01   5oct17
mpiifort    400      1   2   5932   1:38:52  30sep17
mpiifort    400      1   5   2464   0:41:03  30sep17
mpiifort    400      1  10   1326   0:22:05  30sep17
mpiifort    400      1  20    682   0:11:21   4oct17
mpiifort    800      1   3  11685   3:14:45   5oct17
mpiifort    800      1   5   6035   1:40:35   5oct17
mpiifort    800      1  10   2513   0:41:53   5oct17
mpiifort    800      1  21   1213   0:20:12   5oct17
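
The efficiency column can be recomputed from the CPUs column. The following is a minimal Python sketch, assuming the definition given at the top of this page, efficiency = (timing on the baseline run)*(baseline Np) / ((timing on Np procs)*Np), with the smallest-Np run on the same machine taken as the baseline. The function name `efficiency` is hypothetical and not part of the benchmark code. It is checked here against the darter crayftn 200-disc timings.

```python
def efficiency(base_np, base_sec, np_procs, sec):
    """Parallel efficiency of a run relative to a baseline run on the same machine.

    base_np, base_sec : processor count and CPU seconds of the baseline run
    np_procs, sec     : processor count and CPU seconds of the run being rated
    """
    return (base_sec * base_np) / (sec * np_procs)


# (Np, CPU seconds) for darter, crayftn, 200 discs, copied from the table.
darter_200 = [(2, 3080), (5, 1374), (10, 1661), (20, 550), (40, 486)]

base_np, base_sec = darter_200[0]
for np_procs, sec in darter_200:
    eff = efficiency(base_np, base_sec, np_procs, sec)
    print(f"Np={np_procs:2d}  CPUs={sec:5d}  efficiency={eff:.3f}")
# prints efficiencies 1.000, 0.897, 0.371, 0.560, 0.317
```

The same call can be applied to any table whose efficiency column is blank, for example `efficiency(2, 2581, 5, 1538)` for the ACF 200-disc run on 5 processors.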

How to find out specs
  • OS, hostname, etc: uname -a
  • CPU, cache:
      linux  : more /proc/cpuinfo
      alpha  : psrinfo -v
      solaris: /opt/SUNWspro/bin/fpversion
      irix64 : hinv | grep -e MHZ -e cache
      aix    : sysinfo | grep cache
  • memory:
      linux  : more /proc/meminfo
      alpha  : ulimit -a | grep memory
      solaris: /usr/sbin/prtconf | grep -i memory
      irix64 : hinv | grep memory
      aix    : sysinfo | grep memory
  • compiler:
      linux  : f77 -v , pgf90 -V , gcc -v
      alpha  : f95 -version ; cc -V; cxx -V
      solaris: f95 -V ; /opt/SUNWspro/bin/cc -V
      irix64 : f90 -version
      aix    : sysinfo | grep xlf
             : lslpp -i | grep xlf
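
On Linux, the uname, /proc/cpuinfo and /proc/meminfo lookups listed above can be collected in one step. The following is a minimal Python sketch, not part of the benchmark; the helper names `parse_proc` and `linux_specs` are hypothetical, and the field names "model name", "cpu MHz" and "cache size" are the ones used on x86 Linux (other architectures may label them differently, in which case the entry comes back as None).

```python
import platform
from pathlib import Path


def parse_proc(text):
    """Parse 'key : value' lines, as found in /proc/cpuinfo and /proc/meminfo.

    Only the first occurrence of each key is kept, so for /proc/cpuinfo
    the result describes the first processor listed.
    """
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        key = key.strip()
        if sep and key and key not in info:
            info[key] = value.strip()
    return info


def linux_specs():
    """Collect the specs that uname -a, /proc/cpuinfo and /proc/meminfo report."""
    u = platform.uname()
    cpu = parse_proc(Path("/proc/cpuinfo").read_text())
    mem = parse_proc(Path("/proc/meminfo").read_text())
    return {
        "uname": f"{u.system} {u.node} {u.release} {u.version} {u.machine}",
        "cpu model": cpu.get("model name"),   # x86 field name (assumption)
        "cpu MHz": cpu.get("cpu MHz"),        # x86 field name (assumption)
        "cache": cpu.get("cache size"),       # x86 field name (assumption)
        "memory": mem.get("MemTotal"),
    }


if __name__ == "__main__":
    # Only meaningful on Linux, where the /proc files exist.
    if Path("/proc/cpuinfo").exists() and Path("/proc/meminfo").exists():
        for name, value in linux_specs().items():
            print(f"{name:10s}: {value}")
```

The compiler version is not covered here; it still has to be queried with the compiler's own flag (f77 -v, pgf90 -V, gcc -v).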

  • Other benchmarking pages:

  • FsPx Benchmark: alloy solidification
  • MeltFlow Benchmark: Tin melting with flow
  • BenchWeb at netlib
  • MDBNCH: A molecular dynamics benchmark

    ©2002-2010   V. Alexiades         alexiades(at)utk.edu               Last Updated:   6 Oct 2017