...
xGPU places the computed visibilities for each baseline, with 200 Hz resolution (6,400 channels), in GPU memory. A GPU function then performs channel averaging according to the “fscrunch” factor specified in the PSRDADA header, reducing the number of output channels to (6400/fscrunch), each of width (200*fscrunch) Hz. For example, with fscrunch = 50, there will be 128 output visibility channels of 10 kHz each.
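The arithmetic above can be written out as a short Python sketch. The function and constant names are illustrative only and are not taken from the correlator code, and the check that fscrunch divides 6,400 evenly is an assumption made here so that the division is exact.

```python
ULTRAFINE_WIDTH_HZ = 200   # ultrafine channel width stated in the text
N_ULTRAFINE = 6400         # ultrafine channels per coarse channel


def output_channelisation(fscrunch):
    """Return (number of output channels, output channel width in Hz).

    Hypothetical helper: assumes fscrunch is a positive integer that
    divides the ultrafine channel count exactly.
    """
    if fscrunch <= 0 or N_ULTRAFINE % fscrunch != 0:
        raise ValueError("fscrunch must be a positive divisor of 6400")
    return N_ULTRAFINE // fscrunch, ULTRAFINE_WIDTH_HZ * fscrunch
```

With fscrunch = 50 this gives 128 channels of 10,000 Hz, matching the example in the text.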
The output visibility channels are "centre symmetric" in terms of how their boundaries are aligned within the coarse channel bandwidth. The centre output fine channel is centred symmetrically on the centre of the coarse channel (as was the case with the legacy correlator/fine PFB). Remaining output channels extend above and below the centre channel symmetrically. In cases where there is an odd number of output channels across the coarse channel, there are full-width channels at the lowest and highest ends of the coarse channel bandwidth. In cases where there is an even number of output channels across the coarse channel, there are half-width channels at the lowest and highest ends of the coarse channel bandwidth. See: MWA Fine Channel Centre Frequencies
The channel averaging process involves the summation of the ultrafine channel visibility values comprising each output visibility fine channel, i.e. the sum of fscrunch complex values. For example, for 10 kHz output visibility channels, the complex visibilities of 50 ultrafine channels are summed to produce each output channel. For the centre output fine channel, the centre (DC) ultrafine channel is excluded from the summation to remove any DC component present in the coarse channel, and the output value is re-scaled so that its magnitude remains consistent with that of the other channels. Only 200 Hz of bandwidth is lost in this process, rather than a complete output channel (10 kHz in the legacy correlator).
During this averaging process, each visibility can have a multiplicative weight applied. As a potential future enhancement, this weight could be based on a data occupancy metric that takes account of any input data blocks that were missing due to lost UDP packets or RFI excision. The averaged output channel data is then transferred back to host memory.
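As a minimal sketch of the summation with DC exclusion, the Python function below handles one output channel for one baseline. It is not the GPU function itself. The name scrunch_group and the dc_index parameter are hypothetical, and the re-scaling factor n / (n - 1) is an assumption about what "re-scaled" means here: it restores the magnitude a sum of n values would have had. Weights are omitted.

```python
def scrunch_group(ultrafine_vis, dc_index=None):
    """Sum one group of ultrafine visibilities into a single output value.

    ultrafine_vis: the fscrunch complex visibilities forming one output channel.
    dc_index: position of the DC ultrafine channel within the group, or None
              if this output channel does not contain it.
    """
    n = len(ultrafine_vis)
    if dc_index is None:
        # Ordinary output channel: plain complex sum of all n values.
        return sum(ultrafine_vis, 0j)
    if n < 2 or not 0 <= dc_index < n:
        raise ValueError("need at least two values and a valid dc_index")
    # Centre output channel: drop the DC value, then scale the sum of the
    # remaining n - 1 values up to the level of an n-value sum (assumed factor).
    kept = [v for i, v in enumerate(ultrafine_vis) if i != dc_index]
    return sum(kept, 0j) * (n / (n - 1))
```

For a group of 50 equal values with a large spurious value at the DC position, the result is close to the plain sum of 50 of the equal values, so the centre channel stays on the same scale as its neighbours.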
Visibility Re-ordering
xGPU uses a data ordering that is optimized for execution speed. The CPU re-orders each visibility set into a more intuitive triangular order: [time][baseline][channel][polarization]. The re-ordered visibility sets (one per integration time) are then written to the output ring buffer.
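To illustrate what a [baseline][channel][polarization] layout implies for indexing, the Python sketch below computes the flat offset of one value within a single re-ordered visibility set. Several details are assumptions rather than facts from this section: baselines are enumerated row by row with ant1 <= ant2 and autocorrelations included, there are four polarization products per baseline, and both helper names are hypothetical. xGPU's native ordering is not modelled.

```python
def baseline_index(ant1, ant2, n_ant):
    """Index of baseline (ant1, ant2) in an assumed triangular enumeration:
    (0,0), (0,1), ..., (0,n-1), (1,1), (1,2), ... with ant1 <= ant2.
    """
    if not 0 <= ant1 <= ant2 < n_ant:
        raise ValueError("require 0 <= ant1 <= ant2 < n_ant")
    # Row r holds (n_ant - r) baselines, so rows before ant1 hold
    # ant1 * n_ant - ant1 * (ant1 - 1) / 2 baselines in total.
    return ant1 * n_ant - ant1 * (ant1 - 1) // 2 + (ant2 - ant1)


def flat_offset(baseline, chan, pol, n_chan, n_pol=4):
    """Offset of one complex value within one visibility set laid out as
    [baseline][channel][polarization], with polarization varying fastest."""
    return (baseline * n_chan + chan) * n_pol + pol
```

Under these assumptions, 128 antennas give 128 * 129 / 2 = 8256 baselines, and baseline_index(127, 127, 128) is the last one, at index 8255.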
...