Offline PFB
*** WARNING: Must use VCSBeam >= v2.18 and mwalib >= v0.11.0 ***
This step is currently necessary for processing MWAX data, although it is intended to be subsumed into the beamforming step, so that any intermediate channelisation required as part of the beamforming process becomes invisible to the user. This is also important for obviating the need to write the .dat files out to disk: the pre-beamformed data are sufficiently voluminous that even our generous allotment of disk space on Pawsey's systems would be quickly exhausted.
The Offline PFB implements the weighted overlap-add algorithm described in McSweeney et al. (2020). It uses NVIDIA/CUDA GPUs for the fine PFB operation, including cuFFT for the Fourier transform step, and operates on one second of data at a time. The GPUs must have at least 3.5 GB of device memory available; operating on smaller chunks of data is not (yet) implemented.
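As a reminder of the operation being performed, the following is a minimal sketch of a weighted overlap-add analysis step. The notation (input samples x, analysis filter h of length PK, K output fine channels, P taps) is introduced here purely for illustration and is not necessarily the notation used in McSweeney et al. (2020) or in the VCSBeam source:

    X_m[n] = \sum_{k=0}^{K-1} e^{-2\pi i m k / K} \sum_{p=0}^{P-1} h[pK + k] \, x[nK + pK + k]

That is, for each output time n, a block of PK input samples is weighted by the filter, folded into P segments of length K which are summed, and the result is Fourier transformed (the cuFFT step) to yield the K fine channels X_m[n].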
A single call to the Offline PFB operates on a single coarse channel and an arbitrary number of timesteps, and produces output files in the legacy .dat format. The example SBATCH script below shows it being applied to 600 seconds of data (starting at GPS second 1313388760) for 5 coarse channels. The output files are written to the current working directory.
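For concreteness, a minimal sketch of a single invocation (one coarse channel) is shown below. The binary name (fine_pfb_offline), the module name, and every option shown are assumptions made for illustration; check the Command line options section (or the program's help text) for the actual interface.

```bash
# Sketch only: the binary name, module name, and all options below are
# illustrative assumptions -- see "Command line options" for the real interface.
module load vcsbeam

fine_pfb_offline \
    -m /path/to/observation.metafits \
    -b 1313388760 \
    -T 600 \
    -d /path/to/mwax_voltages \
    -f 133
# -m: observation metafits (read via mwalib)   -b: first GPS second
# -T: number of seconds to process             -d: directory of MWAX voltages
# -f: coarse channel to process (placeholder value)
# Legacy-format .dat files are written to the current working directory.
```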
By default, the Offline PFB uses the same polyphase filter that was used in the legacy system (the FINEPFB filter, described at the beginning of McSweeney et al. 2020); alternative filters will be made available in the future. The filters are always applied on second boundaries, and the number of taps is determined from the length of the filter and the number of desired output channels. No attempt is made to apply any time or phase adjustments to the voltages, either before or after the PFB is applied.
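To illustrate that relationship: if the chosen filter has L coefficients and K output channels are requested, the number of taps is P = L/K. For the legacy fine PFB this works out (to the best of my recollection of McSweeney et al. 2020) to P = 1536/128 = 12 taps; confirm the exact figures against the paper or the filter coefficients shipped with VCSBeam.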
Command line options
Example of use on Garrawarla
(This example is intended to show how to pack multiple jobs onto the same compute node, although in my testing this script did not achieve the parallelisation I was hoping for. At the very least, the user will be able to amend this example so that it requests multiple compute nodes instead.)
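In the meantime, the sketch below shows the general shape such an SBATCH script might take on Garrawarla, launching one job step per coarse channel in the background and waiting on them all. The partition name, module name, binary name, options, and coarse channel numbers are all assumptions made for illustration, not a verified recipe.

```bash
#!/bin/bash -l
#SBATCH --job-name=offline_pfb
#SBATCH --partition=gpuq         # assumed Garrawarla GPU partition name
#SBATCH --nodes=1
#SBATCH --ntasks=5               # one task per coarse channel
#SBATCH --cpus-per-task=1
#SBATCH --gres=gpu:1
#SBATCH --time=01:00:00

# Sketch only: module name, binary name, and all options below are illustrative
# assumptions -- see "Command line options" for the real interface.
module load vcsbeam

START=1313388760                 # first GPS second to process
NSECONDS=600                     # number of seconds to process
DATADIR=/path/to/mwax_voltages   # directory containing the MWAX voltage files
METAFITS=/path/to/observation.metafits

# Launch one job step per coarse channel in the background; 'wait' keeps the
# job alive until all steps finish. Replace the channel numbers as needed.
for CHAN in 133 134 135 136 137; do
    srun -n 1 -c 1 fine_pfb_offline \
        -m "${METAFITS}" \
        -b "${START}" \
        -T "${NSECONDS}" \
        -d "${DATADIR}" \
        -f "${CHAN}" &
done
wait
```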