Calibrate and WSClean Performance on Garrawarla NVMe Drives

The tables below summarize timing tests run on two compute nodes of Garrawarla (a Hewlett-Packard GPU cluster at the Pawsey Supercomputing Centre) to process and image different MWA observations. The 960GB NVMe drives available on each Garrawarla node are currently underused, even though they offer much faster file read and write times. These tests were run to see whether the drives give a significant speed-up for common imaging commands used in processing MWA observations.

MWA observations used in timing tests:

  • EoR MWA Observation - 2 minute observation with 2s time resolution in the EoR-0 field, ~4GB measurement set
  • IPS MWA Observation - 10 minute observation with 10s time resolution, ~40GB measurement set
  • GLEAM-X MWA Observation - ~15GB measurement set

In general, the difference in timing is small for the smaller observations but becomes more noticeable as the observations get larger. The speed-up for the largest tested observation is only 15-20 minutes (when nothing has to be transferred back), but when processing a few hundred or even a few thousand observations those minutes add up, and it is worth considering whether to use the NVMe drives. The run time on the NVMe drives also depends strongly on how much data (e.g. images, output files) has to be copied back to /astro (or any other storage system): the less that has to be transferred back, the more useful the NVMe drives become.
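The workflow described above (process on the NVMe drive, then copy back only what must be kept) can be sketched as a small shell helper. This is a minimal sketch, not the script used for these tests: `NVME_ROOT`, `RESULTS_DIR` and the function name `stage_and_run` are hypothetical, and the choice to keep only `*.fits` files is an assumption made for illustration.

```shell
#!/bin/sh
# Sketch: run a processing command in a scratch directory on the NVMe drive
# and copy only the final images back to long-term storage.
# NVME_ROOT and RESULTS_DIR are hypothetical; set them for your system.
set -eu

NVME_ROOT="${NVME_ROOT:-${TMPDIR:-/tmp}}"    # assumed NVMe mount point
RESULTS_DIR="${RESULTS_DIR:-$PWD/results}"   # assumed destination, e.g. on /astro

stage_and_run() {
    # usage: stage_and_run <obsid> <command> [args...]
    obsid="$1"; shift
    workdir="$(mktemp -d "${NVME_ROOT}/${obsid}_XXXXXX")"

    # Run the processing command inside the scratch directory.
    ( cd "$workdir" && "$@" )

    # Copy back only the FITS images (assumption); the less that is copied,
    # the more of the NVMe speed-up is retained.
    mkdir -p "${RESULTS_DIR}/${obsid}"
    for f in "$workdir"/*.fits; do
        if [ -e "$f" ]; then
            cp "$f" "${RESULTS_DIR}/${obsid}/"
        fi
    done

    # The NVMe drive is scratch space only, so clean up afterwards.
    rm -rf "$workdir"
}

# Example (not executed here):
#   stage_and_run "$obsid" sh -c 'wsclean ... "$0".ms' "$obsid"
```

Intermediate files such as calibration solutions would need to be added to the copy step if they are required later.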


Timing for Different MWA Observations


                        | /astro *       | NVMe with Data Transfer ** | NVMe without Data Transfer **
EoR MWA Observation     | 24m 14s        | 23m 04s                    | 22m 49s
IPS MWA Observation     | 3h 55m 16s *** | 3h 46m 41s                 | 3h 38m 12s
GLEAM-X MWA Observation | 17m 13s        | 20m 33s                    | 13m 29s

* All of these timing tests predate the ASVO delivery system update, so the measurement sets were still downloaded from Acacia.
** Data Transfer: for the NVMe tests, the measurement set was downloaded from Acacia straight to the temporary NVMe directory. That download is not counted as data transfer, because the same download was needed on /astro (before the ASVO /astro delivery system was implemented). The data transfer that is included in the timing is the copy of the final images, and any other output files, from the NVMe back to /astro, as the NVMe drives are not to be used for long-term storage.
*** Processing of IPS observations has sped up considerably: around 2020, running the same commands on the IPS observations took ~10 hours on Magnus (another compute system at Pawsey).
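To see how the per-observation saving adds up, the arithmetic can be worked through from the IPS row of the table. The sketch below converts the timings to seconds and scales the difference; the helper name `to_seconds` and the batch size of 1000 observations are illustrative assumptions, and the estimate assumes every observation behaves like the tested one and is processed one after another.

```shell
#!/bin/sh
# Convert a duration such as "3h 55m 16s" or "24m 14s" to seconds.
to_seconds() {
    total=0
    for part in $1; do
        n="${part%?}"      # strip the unit letter
        n="${n#0}"         # drop one leading zero so "04" is not read as octal
        n="${n:-0}"
        case "$part" in
            *h) total=$(( total + n * 3600 )) ;;
            *m) total=$(( total + n * 60 )) ;;
            *s) total=$(( total + n )) ;;
        esac
    done
    echo "$total"
}

astro=$(to_seconds "3h 55m 16s")   # IPS observation on /astro
nvme=$(to_seconds "3h 38m 12s")    # IPS observation on NVMe, no data transfer
saved=$(( astro - nvme ))

echo "Saved per observation: ${saved}s"
# prints: Saved per observation: 1024s

n_obs=1000                          # illustrative batch size (assumption)
echo "Saved over ${n_obs} observations: $(( saved * n_obs / 3600 )) hours"
# prints: Saved over 1000 observations: 284 hours
```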


Timing Breakdown for EoR MWA Observation


                            | /astro  | NVMe
Snapshot Calibration        | 2m 31s  | 2m 28s
Standard Full Calibration   | 1m 12s  | 1m 09s
Apply Solutions             | 0m 09s  | 0m 09s
Standard Deep Image WSClean | 4m 23s  | 3m 42s
Snapshot Images WSClean     | 14m 43s | 14m 01s
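A per-step breakdown like the one above can be collected by wrapping each command in a small timing helper. The sketch below is one possible way to do this and is not necessarily how the timings in the table were obtained; the function name `time_step` is hypothetical, and `sleep 1` stands in for a real processing command.

```shell
#!/bin/sh
# Run a command and report its wall-clock time in the "Xm YYs" style
# used in the table.
time_step() {
    # usage: time_step <label> <command> [args...]
    label="$1"; shift
    start=$(date +%s)
    "$@"
    end=$(date +%s)
    elapsed=$(( end - start ))
    printf '%s\t%dm %02ds\n' "$label" $(( elapsed / 60 )) $(( elapsed % 60 ))
}

# Stand-in command; in practice this would be calibrate, applysolutions
# or wsclean with its arguments.
time_step "Example step" sleep 1
```

The resolution is one second, which is adequate for steps that take minutes.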

Commands run:

  • Snapshot Calibration - calibration of each individual time-step in an observation

    calibrate -minuv 130 -maxuv 1300 -j 38 -t 1 -m {obsid}_model.txt {obsid}.ms {obsid}_multi.bin
  • Standard Calibration - calibration using the whole observation

    calibrate -minuv 130 -maxuv 1300 -j 38 -m {obsid}_model.txt {obsid}.ms {obsid}.bin
  • Apply Solutions - apply full standard calibration to measurement set

    applysolutions -nocopy {obsid}.ms {obsid}.bin
  • Standard Deep Image WSClean - create a single image using the whole observation

    wsclean -j 38 -name {obsid} -pol xx,yy -size 2400 2400 -join-polarizations -minuv-l 50 -weight briggs 1.0 -niter 12000 -auto-mask 3 -auto-threshold 2 -nmiter 5 -scale 1amin -log-time -mgain 0.8 {obsid}.ms
  • Snapshot Images WSClean - create an image for each time-step of the whole observation

    wsclean -j 38 -name {obsid} -subtract-model -pol xx,yy -size 2400 2400 -join-polarizations -minuv-l 50 -weight briggs 1.0 -nwlayers 18 -niter 12000 -auto-mask 3 -auto-threshold 2 -scale 1amin -log-time -no-reorder -no-update-model-required -interval 2 53 -intervals-out 51 {obsid}.ms
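In the snapshot imaging command, `-interval 2 53 -intervals-out 51` selects 51 time-steps and asks for 51 output images, i.e. one image per time-step. The sketch below shows that arithmetic for an arbitrary range; the helper name `snapshot_interval_args` is hypothetical, and it assumes that the end index given to `-interval` is exclusive, which is how the numbers in the command above line up.

```shell
#!/bin/sh
# Build wsclean interval arguments that give one image per time-step.
# Assumes the end index of -interval is exclusive.
snapshot_interval_args() {
    # usage: snapshot_interval_args <start_index> <end_index>
    first="$1"
    last="$2"
    if [ "$last" -le "$first" ]; then
        echo "end index must be greater than start index" >&2
        return 1
    fi
    echo "-interval ${first} ${last} -intervals-out $(( last - first ))"
}

snapshot_interval_args 2 53
# prints: -interval 2 53 -intervals-out 51
```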