MWAX Visibility File Format

The MWA Correlator v2 (aka the MWAX Correlator) is still under development. This article summarises the significant differences between the legacy (Ord/Current/v1) correlator and the new MWAX correlator.

Introduction

Each coarse channel is processed separately on a correlator server (traditionally called a "gpubox"). When operating at full capacity, 24 coarse channels are processed (although the number 24 is arbitrary and can change per observation). Thus, if all gpuboxes are working, the number of gpubox files produced per observation is a multiple of 24. To prevent any individual gpubox (coarse channel) file from becoming too big, the files may be split into multiple parts. However, a split never breaks up a sub-observation (8 seconds). The splitting is handled by the data capture code.

Overview of Changes to Assumptions in the new MWAX correlator

  • The associated metafits file for each observation should be considered the source of truth for everything, except things that vary per coarse channel, which will be in the primary HDU of each coarse channel FITS file. 
    • Only the most basic information such as the obsid, projectid, time, correlator mode are repeated in the FITS Primary HDU (purely for convenience).
  • The number of antennas can change on observation boundaries, so do not assume 128. The new MWAX correlator is designed to support up to 256 tiles, but will likely start at 128 and grow as next generation receivers are added to the array. E.g. the next increment might be 136T if we added one more receiver connected to 8 tiles.
  • Within the FITS image HDUs for visibilities, NAXIS1 represents the fine channels * polarisations * 2 (real/imaginary). NAXIS2 represents the baselines: (antennas x (antennas + 1)) / 2.
  • Polarisations are in XX,XY,YX,YY order
  • Antennas are ordered by the "antenna" value in the ‘TILEDATA’ ‘BINTABLE’ of the associated metafits file. See MWAX Antenna Ordering for more details.
  • Baselines are antenna vs antenna, not input vs input (e.g. ant0 vs ant2, not ant0x vs ant2y), with all of the polarisations XX,XY,YX,YY grouped together with that baseline in the visibilities.
  • Baselines are in lower regular triangular order. 0-0..0-n, then 1-1..1-n, then 2-2..2-n, etc.
  • Coarse channel numbers (CORRCHAN) are always in sky frequency ascending order.
  • Real and imaginary data values are 32 bit floats
  • There are significantly more correlator modes to support. Rely on FINECHAN and INT_TIME for the correct mode information from the metafits. See: MWAX Modes
  • The number of coarse channels per observation could change, so do not assume 24. This could happen once we deploy replacement receivers (we could choose to use more than 24), if we allow astronomers to choose fewer than 24 coarse channels with the existing receivers, or if an MWAX server is offline.

  • Instantaneous bandwidth / coarse channel width could change (once we deploy replacement receivers), so do not assume 30.72 MHz (24 x 1.28 MHz).

  • Weights are provided in the gpubox files after each visibility HDU containing one timestep/integration. The weights and how they are determined are discussed below:
    • Each visibility has a multiplicative weight applied, based on a data occupancy metric that takes account of any input data blocks that are missing due to lost UDP packets or RFI excision (a potential future enhancement).  The centre (DC) ultrafine channel is excluded when averaging and the centre output channel values are re-scaled accordingly.  Note that only 200 Hz of bandwidth is lost in this process, rather than a complete output channel. 
    • As part of the M&C system, the application of weights can be turned on or off per observation based on the science case/needs. With weights not applied, the data will be averaged in the correlator without taking into account the weights. Either way the weights are supplied in each alternate ImgHDU for your information.
  • The MWAX correlator will provide options, on a per-observation basis, to apply geometric and cable delays. When these corrections are applied by the correlator, they do not need to be performed by downstream tools such as Cotter or the RTS. See MWAX Metafits Changes.
  • The MWAX FITS files contain a keyword "CORR_VER" which represents the correlator version number. If this keyword is missing, assume this is a FITS file from the Ord/Legacy correlator. "2" is the value for the MWAX correlator.
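The CORR_VER rule in the last bullet can be sketched as a small check. This is a minimal illustration only: the function name `correlator_version` is hypothetical, and the primary header is assumed to be available as a dict-like mapping of FITS keywords (most FITS readers expose something similar).

```python
def correlator_version(primary_header):
    """Return the correlator version implied by a primary HDU header.

    primary_header is assumed to be a dict-like mapping of FITS keywords.
    A missing CORR_VER keyword is treated as the legacy (v1) correlator.
    """
    value = primary_header.get("CORR_VER")
    if value is None:
        return 1  # keyword absent: Ord / Legacy correlator file
    return int(value)


# Illustrative headers (not taken from real files)
legacy_header = {"SIMPLE": True, "NAXIS": 0}
mwax_header = {"SIMPLE": True, "NAXIS": 0, "CORR_VER": 2}

print(correlator_version(legacy_header))
print(correlator_version(mwax_header))
```

Branching on this value before interpreting NAXIS1/NAXIS2 avoids reading an MWAX file with the legacy axis layout.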

Correlator File Naming Convention

Example: 1224396656_20181024T061040_ch201_001.fits

OBSID_YYYYMMDDThhnnss_chCCC_NNN.fits

where:

  • OBSID = observation ID
  • Start date / time of the sub-observation (UTC). The date/time is provided by the metafits file when the correlator is building the FITS file.
    • YYYY = 4 digit year
    • MM = 2 digit month (zero left padded)
    • DD = 2 digit day (zero left padded)
    • T =  date/time separator
    • hh = hour (24 hr zero left padded)
    • nn = minute (zero left padded)
    • ss = second (zero left padded)
  • CCC = Channel number (0-255)
  • NNN = File number (0...n)

In the example above, the file is the second file (file numbers start at 0, so 001 is the second) of coarse channel number 201 for obs_id 1224396656.
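As an illustration of the naming convention, the following sketch splits a filename into its parts using Python's standard `re` and `datetime` modules. The function name `parse_mwax_filename` and the dictionary keys are hypothetical, and the pattern assumes exactly three digits for both the channel and the file number, as in the example above.

```python
import re
from datetime import datetime, timezone

# OBSID_YYYYMMDDThhnnss_chCCC_NNN.fits
FILENAME_RE = re.compile(
    r"^(?P<obsid>\d+)_(?P<start>\d{8}T\d{6})_ch(?P<chan>\d{3})_(?P<part>\d{3})\.fits$"
)


def parse_mwax_filename(name):
    """Split an MWAX visibility filename into its components."""
    match = FILENAME_RE.match(name)
    if match is None:
        raise ValueError(f"not an MWAX visibility filename: {name!r}")
    # The start date/time in the filename is UTC
    start = datetime.strptime(match["start"], "%Y%m%dT%H%M%S")
    return {
        "obsid": int(match["obsid"]),
        "start_utc": start.replace(tzinfo=timezone.utc),
        "channel": int(match["chan"]),
        "file_number": int(match["part"]),
    }


info = parse_mwax_filename("1224396656_20181024T061040_ch201_001.fits")
print(info)
```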

FITS Format

The FITS files generated by the new correlator have the following structure:

  • 1 Primary HDU
  • Extension HDU (Visibilities for timestep/integration 0) 
  • Extension HDU (Weights for timestep/integration 0)
  • Extension HDU (Visibilities for timestep/integration 1) 
  • Extension HDU (Weights for timestep/integration 1)
  • ...
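Given this alternating layout, the position of an HDU in the file determines whether it holds visibilities or weights. The sketch below shows that arithmetic; the function names are hypothetical, and the timestep returned is the index within the file (the MARKER keyword in each HDU header remains the authoritative value).

```python
def describe_hdu(hdu_index):
    """Return (kind, timestep_in_file) for an HDU index in one MWAX file."""
    if hdu_index < 0:
        raise ValueError("HDU index must be non-negative")
    if hdu_index == 0:
        return ("primary", None)
    # HDUs 1,2 belong to timestep 0; HDUs 3,4 to timestep 1; and so on
    timestep = (hdu_index - 1) // 2
    kind = "visibilities" if hdu_index % 2 == 1 else "weights"
    return (kind, timestep)


def hdus_for_timestep(timestep):
    """Return the (visibility, weights) HDU indices for a timestep in the file."""
    return (2 * timestep + 1, 2 * timestep + 2)


for index in range(5):
    print(index, describe_hdu(index))
```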

FITS Primary HDU

Keyword | (MWA) Valid Values | Change from v1 format? | Notes
SIMPLE | T | | conforms to FITS standard
BITPIX | 8 | | array data type
NAXIS | 0 | | number of array dimensions
EXTEND | T | |
COMMENT | FITS (Flexible Image Transport System) format is defined in 'Astronomy | |
COMMENT | and Astrophysics', volume 376, page 359; bibcode: 2001A&A...376..359H | |
CORR_VER | 2 | New | Correlator format version: <missing> = v1 (Ord / Legacy Correlator), 2 = v2 (MWAX Correlator)
COMMENT | Visibilities: 1 integration per HDU: [baseline][finechan][pol][r,i] | New | Description of data format for visibility HDUs
COMMENT | Weights: 1 integration per HDU: [baseline][pol][weight] | New | Description of data format for weight HDUs
MARKER | e.g. 0,1,2... n (where n is the final integration/timestep) | | Data offset marker of first HDU (all channels should match). The marker increments by 1 per integration/timestep
TIME | e.g. 1531077040 | | Unix start time of the data in this file
MILLITIM | 0-999 | | Milliseconds component of TIME
PROJID | e.g. G0000 | | MWA Project ID
OBSID | e.g. 1215112256 | | MWA Observation ID (GPS start time of observation)
FINECHAN | 0.2, 0.4, 0.8, 1, 1.6, 2, 3.2, 4, 5, 6.4, 8, 10, 12.8, 16, 20, 25.6, 32, 40, 51.2, 64, 80, 128, 160, 256, 320, 640, 1280 | New | Correlator mode: Fine channel width in kHz
NFINECHS | 1-6400, e.g. 128 == 10 kHz fine channels (1280 kHz / 10 kHz) | New | Number of fine channels in each coarse channel
INTTIME | 0.25, 0.5, 1.0, 2.0, 4.0, 8.0 | Changed (was only 0.5, 1, 2) | Correlator mode: Integration time (s)
NINPUTS | (2-n) in increments of 16 (due to xGPU limitation); 256 == 128T; 288 == 144T; 512 == 256T | New | Number of inputs into the correlation products (antennas * pols)
CORRHOST | e.g. gpubox27 | New (pseudo-hostname used to be in the filename). NOTE: there are 26 MWAX servers (24 active + 2 spares). The CORRHOST value is for debugging purposes only and should not be used to infer the coarse channel, i.e. there is no implied mapping such that mwax01 gets the first coarse channel. It could be any of the mwax servers based on their readiness/status and other IT related factors! | Hostname of the correlator server which processed this coarse channel
CORRCHAN | 0-N, e.g. 0-23 | New (was in filename as part of the gpubox number). This is the 0-indexed ordinal of the receiver coarse channel selected. It is probably more convenient to use the absolute receiver channel number from the filename (..._chNNN_...). However here is how it works: if rec channels are (101,102...124) then that would map to CORRCHAN (0,1,...23). Due to the RRI receiver channel reversal when receiver channel is >128: if rec channels are (157,158,..180) then that would map to CORRCHAN (23,22...0) | Coarse channel number selected for correlation.
MC_IP | v.x.y.z | New | Multicast IP the data was addressed to
MC_PORT | nnnnnn | New | Multicast port the data was addressed to
U2S_VER | X.Y.Z | New | Version of the mwax_u2s program that captured the UDP packets for this observation
CBF_VER | X.Y.Z | New | Version of the mwax_db2correlate2db program that performed the F and X stage correlation for this observation
DB2F_VER | X.Y.Z | New | Version of the mwax_db2fits program that wrote this fits file.
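A few of these keywords are usually combined before use. The sketch below, assuming the primary header is available as a dict-like mapping, derives a UTC start time from TIME and MILLITIM and a tile count from NINPUTS (two polarisations per tile, as in 256 == 128T). The function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def start_time_utc(header):
    """Combine TIME (Unix seconds) and MILLITIM (milliseconds) into a datetime."""
    seconds = int(header["TIME"])
    millis = int(header["MILLITIM"])
    return datetime.fromtimestamp(seconds, tz=timezone.utc) + timedelta(milliseconds=millis)


def tile_count(header):
    """Number of tiles implied by NINPUTS, assuming two polarisations per tile."""
    n_inputs = int(header["NINPUTS"])
    if n_inputs % 2 != 0:
        raise ValueError("NINPUTS is expected to be even (antennas * 2 pols)")
    return n_inputs // 2


# Illustrative values only, using the example values from the table
header = {"TIME": 1531077040, "MILLITIM": 500, "NINPUTS": 256}
print(start_time_utc(header).isoformat())
print(tile_count(header))
```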


FITS Extension Visibility ImgHDU

Visibility Header

This HDU represents the visibilities for one coarse channel for one time integration / time step.

Keyword | (MWA) Valid Values | Change from v1 format? | Notes
XTENSION | IMAGE | | Image Extension created by MWA DataCapture
BITPIX | -32 | Changed: v1 Pre Oct 2014 = -32 (32 bit float); v1 Post Oct 2014 = 32 (32 bit integer); v2 = -32 (32 bit float) | number of bits per data pixel (negative is floating point)
NAXIS | 2 | | number of array dimensions
NAXIS1 | (coarse channel width / fine_chan_width) * pols(xx,xy,yx,yy) * 2. For 10 kHz fine chan: (1280 / 10) * (4) * 2 == 1024 | Changed: was baselines * pols * 2 | Fine Channels * Polarisations * 2 (real/imag)
NAXIS2 | For 128T = (128*129)/2 = 8256; for 256T = (256*257)/2 = 32896 | Changed: was fine channels | Baselines
PCOUNT | 0 | | number of group parameters
GCOUNT | 0 | | number of groups
COMMENT | FITS (Flexible Image Transport System) format is defined in 'Astronomy | |
COMMENT | and Astrophysics', volume 376, page 359; bibcode: 2001A&A...376..359H | |
TIME | e.g. 1531077040 | | Unix start time of this HDU
MILLITIM | 0-999 | | Milliseconds component of TIME
MARKER | e.g. 0,1,2... n (where n is the final time step) | | Data offset marker of HDU (all channels should match). The marker increments by 1 per time step

NOTE: there is no BSCALE used in the v2 correlator format.
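The NAXIS1 and NAXIS2 formulas in the table above can be checked with a few lines of arithmetic. This is a sketch only; the function name is hypothetical, and the number of fine channels is taken as an input (for example from NFINECHS) rather than derived from the channel widths.

```python
N_POLS = 4         # xx, xy, yx, yy
N_COMPONENTS = 2   # real, imaginary


def expected_visibility_axes(n_tiles, n_fine_chans):
    """Return (NAXIS1, NAXIS2) expected for a visibility HDU."""
    naxis1 = n_fine_chans * N_POLS * N_COMPONENTS
    naxis2 = n_tiles * (n_tiles + 1) // 2  # baselines, including autocorrelations
    return naxis1, naxis2


# 128 tiles with 10 kHz fine channels (1280 / 10 = 128 fine channels)
print(expected_visibility_axes(128, 128))
# 256 tiles with the same fine channel width
print(expected_visibility_axes(256, 128))
```

Comparing these values against the header of each visibility HDU is a cheap sanity check before reading the data.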

Visibility Data

To simplify downstream processing, the correlator reorders the data produced by xGPU so that the time step / integration is the slowest moving dimension. Within each time step the order is: baseline (in lower regular triangular order, by antenna order from the metafits file), then fine channel (in ascending sky frequency), then polarisation (XX, XY, YX, YY), and finally a real and an imaginary floating point value for each polarisation. The ordering of the data thus closely matches that used by downstream tools.

time | baseline | freq | pol | r,i

Below is an example using 4 antennas (ant0, ant1, ant2, ant3), with 2 fine channels (0,1) and 4 polarisations (xx,xy,yx,yy) in one time step / integration (there is only one time step per HDU). The ImgHDU data would look like the following:

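antA | antB | ch0.xx | ch0.xy | ch0.yx | ch0.yy | ch1.xx | ch1.xy | ch1.yx | ch1.yy
0 | 0 | r, i | r, i | r, i | r, i | r, i | r, i | r, i | r, i
0 | 1 | r, i | r, i | r, i | r, i | r, i | r, i | r, i | r, i
0 | 2 | r, i | r, i | r, i | r, i | r, i | r, i | r, i | r, i
0 | 3 | r, i | r, i | r, i | r, i | r, i | r, i | r, i | r, i
1 | 1 | r, i | r, i | r, i | r, i | r, i | r, i | r, i | r, i
1 | 2 | r, i | r, i | r, i | r, i | r, i | r, i | r, i | r, i
1 | 3 | r, i | r, i | r, i | r, i | r, i | r, i | r, i | r, i
2 | 2 | r, i | r, i | r, i | r, i | r, i | r, i | r, i | r, i
2 | 3 | r, i | r, i | r, i | r, i | r, i | r, i | r, i | r, i
3 | 3 | r, i | r, i | r, i | r, i | r, i | r, i | r, i | r, i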
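The row and column of any value in this layout follow directly from the ordering described above. The sketch below computes the row of a baseline and the offset of the real part of one visibility, counting 32 bit float values from the start of the HDU data. The function names are hypothetical, and antenna numbers here mean positions in the metafits antenna ordering.

```python
POLS = ("xx", "xy", "yx", "yy")


def baseline_index(ant_a, ant_b, n_ants):
    """Row of baseline (ant_a, ant_b) in the order 0-0..0-n, 1-1..1-n, ..."""
    if not 0 <= ant_a <= ant_b < n_ants:
        raise ValueError("expected 0 <= ant_a <= ant_b < n_ants")
    # Rows before ant_a's block: n_ants + (n_ants - 1) + ... for each earlier antenna
    rows_before = ant_a * n_ants - ant_a * (ant_a - 1) // 2
    return rows_before + (ant_b - ant_a)


def visibility_offset(ant_a, ant_b, fine_chan, pol, n_ants, n_fine_chans):
    """Offset (in float values) of the real part; the imaginary part is at offset + 1."""
    row_width = n_fine_chans * len(POLS) * 2
    row = baseline_index(ant_a, ant_b, n_ants)
    col = (fine_chan * len(POLS) + POLS.index(pol)) * 2
    return row * row_width + col


# The 4 antenna, 2 fine channel example from the table above
for a in range(4):
    for b in range(a, 4):
        print(a, b, baseline_index(a, b, 4))
print(visibility_offset(1, 2, 1, "yx", n_ants=4, n_fine_chans=2))
```

In a reader that presents the HDU as a two-dimensional array with one row per baseline, `row` and `col` from `visibility_offset` would be the row and column indices directly.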

FITS Extension Weights ImgHDU

Weights Header

This HDU represents the weights for one coarse channel for one time integration / time step.

Keyword | (MWA) Valid Values | Notes
XTENSION | IMAGE | Image Extension created by MWA DataCapture
BITPIX | -32 | number of bits per data pixel (negative is floating point)
NAXIS | 2 | number of array dimensions
NAXIS1 | pols(xx,xy,yx,yy) == 4 | Polarisations
NAXIS2 | For 128T = (128*129)/2 = 8256; for 256T = (256*257)/2 = 32896 | Baselines
PCOUNT | 0 | number of group parameters
GCOUNT | 0 | number of groups
COMMENT | FITS (Flexible Image Transport System) format is defined in 'Astronomy |
COMMENT | and Astrophysics', volume 376, page 359; bibcode: 2001A&A...376..359H |
TIME | e.g. 1531077040 | Unix start time of this HDU
MILLITIM | 0-999 | Milliseconds since TIME
MARKER | e.g. 0,1,2... n (where n is the final time step) | Data offset marker of HDU (all channels should match). The marker increments by 1 per time step

Weights Data

At the time of writing, the weights HDU is filled with 1's. In the near future, proper weight values will be supplied from the MWAX correlator.

The weight data is represented by a single-precision floating point value per baseline and polarisation. Below is an example using 4 antennas (ant0, ant1, ant2, ant3) and 4 polarisations (xx,xy,yx,yy) in one time step / integration (there is only one time step per HDU). The ImgHDU data for the weights would look like the following:

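antA | antB | weight.xx | weight.xy | weight.yx | weight.yy
0 | 0 | w | w | w | w
0 | 1 | w | w | w | w
0 | 2 | w | w | w | w
0 | 3 | w | w | w | w
1 | 1 | w | w | w | w
1 | 2 | w | w | w | w
1 | 3 | w | w | w | w
2 | 2 | w | w | w | w
2 | 3 | w | w | w | w
3 | 3 | w | w | w | w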