# Optimizing Python Scripts
Python on HPC can sometimes be surprisingly slow. Here are some tips for speeding it up.
## Compilation Options

Consider just-in-time compilation of numerical hot spots, for example with Numba: https://numba.pydata.org/
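As a minimal sketch of the idea (the function and array below are made up for illustration), Numba's `@njit` decorator compiles a plain Python loop to machine code the first time it is called:

```python
import numpy as np
from numba import njit

@njit
def loop_sum(arr):
    # Numba compiles this loop to machine code on first call, so it
    # runs at roughly compiled speed instead of interpreter speed.
    total = 0.0
    for x in arr:
        total += x
    return total

x = np.arange(1_000_000, dtype=np.float64)
loop_sum(x)         # first call triggers compilation
print(loop_sum(x))  # subsequent calls are fast
```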
## Fast NumPy Operations
Avoid code that steps through an array one element at a time in a Python loop:
```python
incoherent_sum = 0
for bl_idx in baselines:
    # slow: each iteration goes back through the Python interpreter.
    # note the [..., 1] for the imaginary part.
    incoherent_sum += vis_hdu.data.data[bl_idx, ..., 0] + 1j * vis_hdu.data.data[bl_idx, ..., 1]
```
Instead, try to reshape the problem into array indexing that can be handed to NumPy all at once, so the work happens in fast compiled code rather than in the Python interpreter:
```python
# build the complex visibilities in one vectorized step...
data = vis_hdu.data.data[..., 0] + 1j * vis_hdu.data.data[..., 1]
# ...then select all baselines of interest at once with fancy indexing.
incoherent_sum = np.sum(data[baselines])
```
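For a self-contained illustration of the same pattern, here is a runnable sketch with synthetic data (the shapes and the `raw` array are made up for the example, not the real uvfits layout):

```python
import numpy as np

# stand-in for vis_hdu.data.data: (n_rows, n_chans, 2), where the last
# axis holds (real, imag). Shapes are illustrative only.
rng = np.random.default_rng(0)
raw = rng.standard_normal((1000, 768, 2))
baselines = np.array([3, 17, 42, 128])

# one vectorized complex conversion, then fancy-index all the rows of
# interest at once instead of looping over them in Python.
data = raw[..., 0] + 1j * raw[..., 1]
incoherent_sum = np.sum(data[baselines])
print(incoherent_sum)
```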
## Local Storage
Reading a FITS file straight from scratch with astropy can be quite slow. A small modification to your Slurm script can speed this up: copy the file to node-local storage first, then read the local copy with astropy. There are two local storage options:

- `/dev/shm` (available space depends on the memory capacity of the node)
- `/nvmetmp` (not available on Setonix; up to 1 TB on Garrawarla)
### Shared Memory
`/dev/shm` is a directory mounted directly in memory, so you need to request enough memory to hold the copied file as well as any copies of it your analysis loads.
```bash
#!/bin/bash
#SBATCH --mem=100G
...
# work in a throwaway directory inside the in-memory filesystem
mkdir -p /dev/shm/deleteme && cd /dev/shm/deleteme
# copy from scratch to local
cp /scratch/foo/bar.uvfits .
# run analysis on the local file
python baz.py ./bar.uvfits
# copy results back to scratch
cp baz.out /scratch/foo
# clean up: files left in /dev/shm keep occupying the node's memory
cd / && rm -rf /dev/shm/deleteme
```
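If you are unsure how much room `/dev/shm` offers on a given node, a quick check from inside the job can help (note this reports the size of the tmpfs mount, which is separate from your job's `--mem` limit):

```bash
# show the size and free space of the in-memory filesystem on this node
df -h /dev/shm
```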
### NVMe
`/nvmetmp` space is only provided when requesting `--tmp` in Slurm.
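As a sketch, the same copy-in/copy-out pattern as the `/dev/shm` script above works here; `--tmp` is a standard Slurm directive requesting node-local temporary disk, and the `100G` size is illustrative:

```bash
#!/bin/bash
#SBATCH --tmp=100G
...
# work in a throwaway directory on the node-local NVMe
mkdir -p /nvmetmp/deleteme && cd /nvmetmp/deleteme
# copy from scratch to local NVMe, run the analysis, copy results back
cp /scratch/foo/bar.uvfits .
python baz.py ./bar.uvfits
cp baz.out /scratch/foo
```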