The beamforming pipeline described here is applicable to fine-channelised (10 kHz) MWA data produced up to September 2021 (hereafter called legacy data), when the legacy Voltage Capture System was decommissioned. The upgraded system, MWAX (documented elsewhere on this wiki), produces coarse-channelised (1.28 MHz) high-time-resolution (HTR) data (hereafter called MWAX data; see the format description here). Apart from the obvious requirement to be able to process the new data format, a number of other factors have motivated the development of VCSBeam, which operates sufficiently differently from the legacy beamformer that it has now branched off into its own repository, housed at this GitHub page.
Processing Pipeline Overview
Processing steps
- Download (not part of VCSBeam): This step is currently identical to the downloading step described on the legacy page. For legacy data, the existing pipeline performs the "recombine step" automatically, so that the data made available to the user are the "Recombined voltages", or ".dat" files. As of this writing (25 Sep 2021), it is not yet clear whether the current downloading instructions will correctly download MWAX data (i.e. data recorded after September 2021), or whether that functionality still needs to be implemented. However, the intention is that the downloaded data will be the "Coarse channel voltages", or ".sub" files.
- Offline PFB: This step mimics the FPGA-based polyphase filter bank that was implemented in the legacy system but is not supplied by the MWAX system. It is currently (as of September 2021) the only way to process MWAX data, although there are several outstanding issues, which are described below.
- Offline correlator: Used for producing correlated visibilities in the form of GPUBox files, e.g. for in-beam calibration or for making fast images.
- Beamformer: This produces the theoretical maximum sensitivity towards a desired look-direction. It incorporates the multi-pixel functionality described in Swainston et al. (in prep). The outputs of the beamformer are either full-Stokes PSRFITS files at 100 μs resolution or dual-polarisation (XY) VDIF files at 0.78 μs resolution.
Download
Refer to the legacy page for instructions.
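After the download completes, a quick file count can catch incomplete transfers. The following is a minimal sketch (not part of VCSBeam), assuming legacy recombined data with one .dat file per second per coarse channel; the download directory and observation parameters are placeholders to be adapted to your own setup.

# Count the downloaded voltage files; for a 600 s legacy observation with 24 coarse
# channels, roughly 600 x 24 = 14400 .dat files are expected (directory is a placeholder)
ls /path/to/download/*.dat | wc -l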
Offline PFB
This step is currently necessary for processing MWAX data, although it is intended to be subsumed into the beamforming step so that any intermediate channelisation required as part of the beamforming process is made invisible to the user. This is also important for obviating the need to write the .dat files out to disk, as the pre-beamformed data are sufficiently voluminous that even our generous allotment of disk space on Pawsey's systems would be quickly exhausted.
The Offline PFB uses NVIDIA GPUs (via CUDA) for the fine PFB operation and operates on one second of data at a time. The GPUs must have at least 3.5 GB of device memory available. Operating on smaller chunks of data is not (yet) implemented.
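To check that the GPU on an allocated node satisfies this requirement, the device memory can be queried on the GPU node itself with the standard nvidia-smi utility; this is a generic check, not part of VCSBeam.

# Report the GPU model and its total and free device memory (values in MiB)
nvidia-smi --query-gpu=name,memory.total,memory.free --format=csv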
A single call to Offline PFB operates on a single coarse channel and an arbitrary number of timesteps, and produces output files in the legacy .dat format. The example SBATCH script below shows it being applied to 600 seconds of data (starting at GPS second 1313388760) for 5 coarse channels. The output files are written to the current working directory.
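For reference, the underlying command for a single coarse channel is simply the following (the same options as in the SBATCH example below; the paths are placeholders):

# Fine-channelise 600 seconds of coarse channel 144, starting at GPS second 1313388760;
# the .dat output files are written to the current working directory
fine_pfb_offline -m /path/to/1313388760_metafits.fits -b 1313388760 -T 600 -f 144 -d /path/to/subfiles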
By default, the Offline PFB uses the same polyphase filter as the legacy system (see McSweeney et al. 2020), but alternative filters can be substituted... (TO DO). The filters are always applied on second boundaries... (TO DO)
Example of use on Garrawarla
(This example is intended to show how to pack multiple jobs onto the same compute node, but my testing of this script did not appear to produce the parallelisation I was hoping for. At the very least, the user will be able to amend this example so that it requests multiple compute nodes; one possible amendment is sketched after the script.)
#!/bin/bash -l
#SBATCH --nodes=1
#SBATCH --ntasks=5
#SBATCH --ntasks-per-node=5
#SBATCH --mem=370gb
#SBATCH --partition=gpuq
#SBATCH --gres=gpu:1
#SBATCH --time=02:00:00
#SBATCH --account=mwavcs
#SBATCH --export=NONE
module use /pawsey/mwa/software/python3/modulefiles
module load vcsbeam/master
module load openmpi-ucx-gpu
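# Launch one fine_pfb_offline instance per coarse channel (144-148) in the background, then wait for them all to finish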
srun -N 1 -n 1 fine_pfb_offline -m /path/to/1313388760_metafits.fits -b 1313388760 -T 600 -f 144 -d /path/to/subfiles &
srun -N 1 -n 1 fine_pfb_offline -m /path/to/1313388760_metafits.fits -b 1313388760 -T 600 -f 145 -d /path/to/subfiles &
srun -N 1 -n 1 fine_pfb_offline -m /path/to/1313388760_metafits.fits -b 1313388760 -T 600 -f 146 -d /path/to/subfiles &
srun -N 1 -n 1 fine_pfb_offline -m /path/to/1313388760_metafits.fits -b 1313388760 -T 600 -f 147 -d /path/to/subfiles &
srun -N 1 -n 1 fine_pfb_offline -m /path/to/1313388760_metafits.fits -b 1313388760 -T 600 -f 148 -d /path/to/subfiles &
wait
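A possible multi-node amendment (an untested sketch, not an officially tested recipe) is to use a SLURM job array so that each coarse channel is processed in its own single-GPU job. The GPS times, channel range, and paths below are the same placeholders as in the example above.

#!/bin/bash -l
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --mem=370gb
#SBATCH --partition=gpuq
#SBATCH --gres=gpu:1
#SBATCH --time=02:00:00
#SBATCH --account=mwavcs
#SBATCH --export=NONE
#SBATCH --array=144-148
module use /pawsey/mwa/software/python3/modulefiles
module load vcsbeam/master
module load openmpi-ucx-gpu
# Each array task processes one coarse channel (the task ID is the coarse channel number) on its own node/GPU
srun -N 1 -n 1 fine_pfb_offline -m /path/to/1313388760_metafits.fits -b 1313388760 -T 600 -f ${SLURM_ARRAY_TASK_ID} -d /path/to/subfiles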