Math578 - Alexiades
                    ACF info
  • ACF is UTK's Advanced Computing Facility.
  • Our class ACF Project is "ACF-UTK0151".
  • If you already have an ACF account: log in to portal.nics.utk.edu and choose the option to be added to Project ACF-UTK0151 .
  • If you do not have an ACF account: go to https://portal.acf.utk.edu/accounts/request and request a "new account", for Project ACF-UTK0151 .

                  Running MPI code on ACF
    Clusters and HPC systems, like ACF, provide environments for running BATCH jobs.
    Resources are loaded by "module", which drastically simplifies the Makefile.
    Running code involves several steps you need to be aware of:
    Login, transfer files, put them in Lustre, compile, submit a batch job (via 'qsub' using a PBSscript), and wait for it to run...
    ACF consists of several clusters (Beacon, Rho, Sigma, ...), each with several compute nodes (Beacon has 43),
    plus several login nodes. Each node has 16 "cores" in 2 "sockets".

  • File systems: Home directories are mounted on login (service) nodes via NFS, but NOT mounted on the compute nodes.
    You MUST run jobs from the Lustre file system, which provides "scratch" space, mounted on all compute nodes.
      The envar $SCRATCHDIR points to your scratch space.
      Create a link to it in your home dir:   ln -s $SCRATCHDIR Scratch   so you can do:  cd Scratch
    Note: Lustre files are NOT backed up and are deleted after 30 days, so copy important files to your $HOME often (see the sketch below).
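      A minimal sketch of the scratch routine (the names CODE, OUT and *.dat below are just placeholders for your own dirs/files):
    ##------ scratch space: one-time setup and typical clean-up ------
    ln -s $SCRATCHDIR ~/Scratch            # one-time: link your scratch space into your home dir
    cd ~/Scratch ; mkdir -p CODE ; cd CODE # run jobs from here, Lustre is visible on all compute nodes
    cp -ip OUT *.dat $HOME/                # afterwards: copy important results back to $HOME, since Lustre gets purged
    ##----------------------------------------------------------------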
  • Compilers:   There are 2 suites:   Intel: mpiicc, mpiicpc, mpiifort , and Gnu: gcc, g++, gfortran.
      They are loaded by module (see the sketch below). The Intel compilers are usually faster than the Gnu ones.
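      For example, to check and switch between the two suites (a sketch; PE-intel / PE-gnu are the names used below, your defaults may differ):
      module list                    # see what is loaded already (the Intel suite is the default)
      module avail                   # see everything available for loading
      module swap PE-intel PE-gnu    # switch to the Gnu suite
      module swap PE-gnu PE-intel    # ...and back to Intel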
  • Scheduler and PBS: Jobs are submitted (to Torque manager and Moab scheduler) via a "PBSscript".
      Copy the following to a (plain text) file named PBSscript
      For each run, you will need to set: nodes=?:ppn=? ,  walltime ,  jobname ,  −n ?? ,  code.x
    ############ PBSscript for ACF ##########
    #PBS -S /bin/sh
    #PBS -A ACF-UTK0151             #( this is our account number )
    #PBS -l nodes=1:ppn=11		#( requests 11 cores of 1 node )
    #PBS -N name_for_your_job       #( short single_string e.g. J256on11 )
    #PBS -l walltime=00:30:00	#( hh:mm:ss )
    #PBS -j oe
    #PBS -k oe
    cd $PBS_O_WORKDIR		#(points to dir job is submitted from)
    ####------ ACF mpich ------:
    mpirun -n 11 ./code.x  	# < ./dat > ./OUT  (to redirect I/O) 
    ############ end of PBSscript ##########
    
    Submit with: qsub PBSscript
      This will schedule job "J256on11" to be run on 11 cores of one node (when resources become available...).
    Important: The batch system will allocate an entire node exclusively to you, even if you only use 1 core!
               The more nodes (and cores) you request, the longer it will take for your job to start running...
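      For example, to run the same code on 2 full nodes (2 x 16 = 32 cores), only these lines of the PBSscript change
      (the jobname here is just an example; set walltime as appropriate for the run):
    #PBS -l nodes=2:ppn=16          #( requests all 16 cores on each of 2 nodes )
    #PBS -N J512on32                #( pick a short descriptive jobname )
    mpirun -n 32 ./code.x           #( -n should match the total cores requested: nodes x ppn )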

    Job monitoring commands:  qstat -a  (or the 'qu' script, see below),  showq -r ,  checkjob ,  qdel , ...
                Steps for compiling and running code (see Running Jobs )
  • Login to ACF (with your smartphone at hand for Duo...):   ssh -X NetID@duo.acf.tennessee.edu
    [ To get another terminal without going thru Duo:   on ACF type:  nohup xterm -bg black -fg cyan -fn 8x13bold -ls &
       other colors: aquamarine , khaki , peachpuff , seagreen , ... and can reverse -bg with -fg.
       Can create an alias "xt" in your ~/.bashrc : alias xt="nohup xterm ..... " , then:   . ~/.bashrc , then: xt
       Or download this fancier xtloc script into a file "xtloc", and make it executable: chmod u+x xtloc ]
  • On another window on your PC, zip (the dir with) your code into a CODE.zip and scp it to ACF:
        scp -p CODE.zip NetID@acf-login2.nics.utk.edu:CODE.zip
  • It will go to your $HOME . Copy it to your $SCRATCHDIR:  cp -ip CODE.zip Scratch
  • cd Scratch ;   unzip CODE.zip
  • cd CODE .  Make sure you have copied "PBSscript" into CODE/ .
  • Check what's loaded: module list   several are loaded by default (including Intel compilers).
      (To use Gnu compilers:   module swap PE-intel  PE-gnu)
  • Compile your code. Basically   mpiifort code.f90 -o code.x ...   or mpiicpc code.cpp -o code.x ...
      Better way: In the Makefile insert: COMP = mpiifort or COMP = mpiicpc , and then: make compile   (see the Makefile sketch below)
    Note: To compile with '-fast' optimization, put these lines in the Makefile, and do: make mpifast
    ##............ on ACF, -fast needs 2 steps:
    mpifast:
            mpiifort $(code_f) -c -fPIE -fast
            mpiifort -pie $(code_o)  -o $(code).x 
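      For reference, a minimal Makefile sketch using the same variable names (the source name code.f90, the -O2 flag, and the targets
      shown are just examples; adapt to your own code, and remember that recipe lines must start with a TAB):
    ##............ minimal Makefile sketch ............
    COMP   = mpiifort               # or mpiicpc for C++ (or the Gnu equivalents)
    code   = code                   # base name: source is $(code).f90, executable is $(code).x
    code_f = $(code).f90
    code_o = $(code).o
    compile:
            $(COMP) $(code_f) -O2 -o $(code).x
    pbs:
            qsub PBSscript
    ##.................................................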
  • Edit your PBSscript to customize it for this specific run. Then
  • submit the job:   qsub PBSscript   (or:  make pbs )
  • Check status:   qstat -a | grep $USER
      Better yet, use this   qu script   (put in a file "qu" and make it executable:   chmod u+x qu ).
      The first item displayed is JobID, needed for 'checkjob', 'qdel', ...
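      In case the link is not handy, a minimal "qu" could be as simple as the following (the actual qu script may format things differently):
    #!/bin/sh
    ## qu -- list only your own jobs; the first field shown is the JobID
    qstat -a | grep $USER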
  • If a job is running and you want to kill it:   qdel JobID
                  Compile and run your SERIAL code on ACF
  • On your PC, put your Lab3 code (and relevant files) into a dir "SERIAL".
  • Clean it up! Comment out any diagnostics, remove any and all interactive features.
      Only a data file should be read in (simplest way: code.x < dat ). Only the OUTPUT routine (and main, at the end of the run)
      should print out, and only essentials.
      [ For C++ programmers, strongly recommend printing via printf(...) and not via "<<", it's much cleaner... ]
  • Copy the 'PBSscript' and 'qu' scripts into SERIAL .
  • zip -oy SERIAL.zip SERIAL/*  
  • transfer to ACF: scp SERIAL.zip NetID@acf-login2.nics.utk.edu:SERIAL.zip
  • Login on ACF.
  • Copy SERIAL.zip to your Scratch dir, and: unzip SERIAL.zip

    To compile and run with Intel compiler:
  • mv SERIAL SERIAL-intel  (rename it)
  • cd SERIAL-intel ;   module list   (you should see PE-intel)
  • Compile it; name the executable 'serial-intel.x' (see the example after this list).
  • Edit PBSscript and set: nodes=1:ppn=1 ,  jobname: Jintel ,   -n 1
  • qsub PBSscript
  • ./qu   to see if it's running and the JobID, something like:
      63157   username   Jintel   --   R   00:00:13
  • Hopefully it will run and give you what you expect! Record the timing.
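      For example (for the "Compile it" step above), if the serial source is a single file named serial.f90 (the name is just a placeholder), compilation could look like:
    mpiifort serial.f90 -O2 -o serial-intel.x      # the MPI wrapper compiles serial code too
       ## or, for C++:   mpiicpc serial.cpp -O2 -o serial-intel.x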

    To compile and run with Gnu compiler:   (no good reason for it, Intel compilers are faster, but anyway...)
  • cd .. (to parent dir) ;   unzip SERIAL.zip ;   mv SERIAL SERIAL-gnu ; cd SERIAL-gnu
  • module swap PE-intel PE-gnu
  • module list   should show PE-gnu
  • Repeat the above, replacing "intel" with "gnu"
          Good luck!

  • Then do the above with your parallelized par1D code !
          Good luck!
                ...if I forgot anything let me know...       last updated on 11mar21
                    Hybrid Computing: MPI + OpenMP
    For the adventurous, the next level up would be   Hybrid Programming with OpenMP and MPI
  • OpenMP tutorial , rather detailed, from LLNL
    Do not confuse "Open MPI" (an implementation of MPI) with "OpenMP" (the shared-memory programming standard).