Legacy Correlator Issues
This page captures the general types of issues (with some examples) encountered with the Legacy MWA correlator. Many of these issues had previously gone unnoticed because of the overly permissive assumptions made about gpubox alignment. With Birli, we see these issues more clearly.
- 1 Diagnostic Tools
- 2 Issue Details
- 2.1 NAXIS2 Mismatch
- 2.1.1 Symptoms
- 2.1.2 Example
- 2.1.3 Potential fixes
- 2.1.4 Historical info
- 2.2 MILLITIM header fixed to zero gives duplicate HDU timestamps
- 2.2.1 Symptoms
- 2.2.2 Example
- 2.2.3 Potential fixes
- 2.3 Tried to move past end of file
- 2.4 HDU timestamps out of sync between gpubox files
- 2.4.1 Symptoms
- 2.4.2 Example
- 2.4.3 Potential fixes
- 2.5 Data ends ahead of or after schedule
Diagnostic Tools
Astropy / FitsHeader
One of the simplest ways to extract detailed diagnostic information from the metafits and raw gpubox file formats of the legacy correlator is to use the fitsheader command line tool provided by Astropy. This Python library is also useful for correcting erroneous header values so that the files can be correctly processed. To load Astropy on Setonix:
module load py-pip/default
pip install --user astropy
Note that Pawsey policy says don’t use default module versions, but if I provided instructions with a specific version number, then those instructions would need to be updated every year or so and I’m not doing that.
Birli dry run
module load birli/default
birli --dry-run -m *.metafits -- 1*_2*.fits
Issue Details
NAXIS2 Mismatch
Symptoms
The number of fine channels expected from the FINECHAN header in the metafits does not match the dimensions of the raw files.
The metafits number of fine channels per coarse channel is 1280/FINECHAN (a coarse channel is 1280 kHz wide and FINECHAN is in kHz).
Raw file HDUs should have dimensions (number of baselines, number of fine channels).
Example
> fitsheader 1107478352.metafits --keyword FINECHAN --extension 0
# HDU 0 in 1107478352.metafits:
FINECHAN= 40.0 / [kHz] Fine channel width - correlator freq_res
# ^ we would expect there to be 32 fine channels.
> fitsheader 1107478352_20150209005219_gpubox01_00.fits --keyword 'NAXIS*' --extension 1
# HDU 1 in 1107478352_20150209005219_gpubox01_00.fits:
NAXIS = 2 / number of data axes
NAXIS1 = 66048 / length of data axis 1
NAXIS2 = 256 / length of data axis 2
# ^ but there are actually 256 fine channels.
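The arithmetic behind this check can be sketched in a few lines of Python. This is a minimal sketch, not part of any MWA tool: the function names are hypothetical, and the FINECHAN and NAXIS2 values are assumed to have already been read from the headers (for example with fitsheader, as above).

```python
# Hypothetical helpers; none of these names come from mwalib or Birli.
COARSE_CHAN_WIDTH_KHZ = 1280.0  # width of one MWA coarse channel in kHz


def expected_fine_chans(finechan_khz: float) -> int:
    """Fine channels per coarse channel implied by the metafits FINECHAN."""
    n = COARSE_CHAN_WIDTH_KHZ / finechan_khz
    if abs(n - round(n)) > 1e-6:
        raise ValueError(f"FINECHAN={finechan_khz} does not divide 1280 kHz evenly")
    return round(n)


def implied_finechan_khz(naxis2: int) -> float:
    """FINECHAN value that would be consistent with the raw file's NAXIS2."""
    return COARSE_CHAN_WIDTH_KHZ / naxis2


def naxis2_matches(finechan_khz: float, naxis2: int) -> bool:
    """True if the metafits FINECHAN agrees with the raw file's NAXIS2."""
    return expected_fine_chans(finechan_khz) == naxis2


# Values from the example above: FINECHAN = 40.0 kHz, NAXIS2 = 256.
print(expected_fine_chans(40.0))    # 32
print(naxis2_matches(40.0, 256))    # False
print(implied_finechan_khz(256))    # 5.0
```

For the example observation, the raw files imply a 5 kHz fine channel width rather than the 40 kHz recorded in the metafits.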
Potential fixes
Update the metafits so that FINECHAN matches the raw files
Historical info
For the old correlator, the NAXIS values represent the actual, working correlator mode when the data was collected, while the metafits file represents what mode the user wanted the correlator to be in. Back in those days, the correlator never saw or knew anything about metafits files. The mode was changed using magic 'mode change' observations, so if one of those mode change observations was accidentally deleted, the correlator would keep taking data in the previous mode. That happened quite often. If there's a discrepancy, you should trust the NAXIS values, and fix (or ignore) the metafits file.
MILLITIM header fixed to zero gives duplicate HDU timestamps
Symptoms
Groups of adjacent HDUs in the same gpubox file have the same timestamp, according to the combination of the TIME and MILLITIM headers of each HDU
Only visible with non-integer INTTIME (e.g. 0.5s)
Example
Potential fixes
Use a script to update the incorrect MILLITIM headers
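The core of such a script could look like the sketch below. It only computes the corrected values; writing them back to the files (for example with Astropy) is not shown. The function name is hypothetical, and the sketch assumes that the first HDU in each run of identical TIME values starts on the whole second, that the HDUs are in time order, and that there are no gaps within a run.

```python
def fix_millitim(stamps, inttime_s):
    """Recompute MILLITIM for a list of (TIME, MILLITIM) pairs, one per HDU.

    The existing MILLITIM values are ignored (they are assumed to be stuck at
    zero). Within each run of HDUs sharing the same TIME, the k-th HDU gets
    MILLITIM = k * INTTIME in milliseconds.
    """
    step_ms = round(inttime_s * 1000)
    fixed = []
    prev_time = None
    k = 0
    for time_s, _old_millitim in stamps:
        # Count position within the current run of identical TIME values.
        k = k + 1 if time_s == prev_time else 0
        offset_ms = k * step_ms
        if offset_ms >= 1000:
            # More HDUs share this TIME than fit in one second at this INTTIME.
            raise ValueError(f"too many HDUs with TIME={time_s} for INTTIME={inttime_s}")
        fixed.append((time_s, offset_ms))
        prev_time = time_s
    return fixed


# Hypothetical header values for four HDUs with INTTIME = 0.5 s.
broken = [(100, 0), (100, 0), (101, 0), (101, 0)]
print(fix_millitim(broken, 0.5))
# [(100, 0), (100, 500), (101, 0), (101, 500)]
```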
Tried to move past end of file
Symptoms
First N bytes of a FITS file are all nulls
fitsinfo can't read the file
fitscheck shows error code 1
Example
Context
Something corrupted the raw file on its journey from the correlator to your filesystem
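A quick way to test for the first symptom without any FITS library is to look at the raw bytes. The sketch below relies on two facts from the FITS standard: files are organised in 2880-byte blocks, and the primary header begins with the keyword `SIMPLE  =`. The function names are hypothetical.

```python
FITS_BLOCK_SIZE = 2880      # FITS files are made of 2880-byte blocks
FITS_MAGIC = b"SIMPLE  ="   # first card of a valid primary header


def count_leading_nulls(data: bytes) -> int:
    """Number of null bytes at the start of data."""
    return len(data) - len(data.lstrip(b"\x00"))


def starts_like_fits(data: bytes) -> bool:
    """True if data begins the way a FITS primary header must."""
    return data.startswith(FITS_MAGIC)


def inspect_file(path, nbytes=FITS_BLOCK_SIZE):
    """Read the first nbytes of a file and report on both checks."""
    with open(path, "rb") as f:
        head = f.read(nbytes)
    return {
        "leading_nulls": count_leading_nulls(head),
        "starts_like_fits": starts_like_fits(head),
    }
```

Calling `inspect_file` on a suspect gpubox file and getting `starts_like_fits` of `False` with a non-zero `leading_nulls` count would match the symptoms listed above.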
HDU timestamps out of sync between gpubox files
Symptoms
One or more gpubox files have timestamps that fall in between the timestamps of the other gpuboxes
Only visible with INTTIME > 1s
NoCommonTimesteps reported when processing with MWALib, Birli or Hyperdrive
Example
Potential fixes
Process groups of gpuboxes with common timesteps separately
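One way to find those groups is to bucket the gpubox files by where their timestamps fall within an integration period. This is a minimal sketch with a hypothetical function name. It assumes the first HDU timestamp of each file has already been extracted (in milliseconds) and that timestamps within each file are regularly spaced at INTTIME.

```python
from collections import defaultdict


def group_by_phase(first_timestamps_ms, inttime_ms):
    """Group gpubox files by the phase of their first timestamp.

    first_timestamps_ms maps a gpubox file name to the timestamp (ms) of its
    first HDU. Files whose timestamps share the same remainder modulo INTTIME
    line up with each other and can be processed together.
    """
    groups = defaultdict(list)
    for name, ts_ms in sorted(first_timestamps_ms.items()):
        groups[ts_ms % inttime_ms].append(name)
    return dict(groups)


# Hypothetical timestamps with INTTIME = 2 s: gpubox02 is offset by 1 s.
first = {"gpubox01": 10000, "gpubox02": 11000, "gpubox03": 10000}
print(group_by_phase(first, 2000))
# {0: ['gpubox01', 'gpubox03'], 1000: ['gpubox02']}
```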
Data ends ahead of or after schedule
Symptoms
Some or all gpubox files have more or fewer timesteps than specified in the metafits
In severe cases, the data can run out before GOODTIME, meaning all data is considered contaminated.
More or fewer timestamps than expected when processing with MWALib, Birli or Hyperdrive
Example
each timestamp is the START of the integration, so according to what was scheduled, the last timestamp should have started at GPS 1185651894
it started 1 second late and ended 3 seconds late relative to what was scheduled
observatory recommends flagging 2 seconds (1 timestamp) after scheduled start of observation, so it only flags the first timestamp
birli selects all "common" timestamps and flags any timestamps that aren't "good"; these are mwalib conventions:
common = common to all gpu box files
good = outside of quack time
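The two conventions can be illustrated with a small sketch. This is a simplification for illustration, not mwalib's actual implementation: the function names are hypothetical, timestamps are plain integers in seconds, and "outside of quack time" is assumed to mean that the integration starts at or after the scheduled start plus the quack time.

```python
def common_timestamps(per_file):
    """Timestamps present in every gpubox file.

    per_file maps a gpubox file name to the list of its HDU timestamps.
    """
    sets = [set(ts) for ts in per_file.values()]
    return sorted(set.intersection(*sets)) if sets else []


def good_timestamps(common, sched_start, quack_time):
    """Common timestamps that start at or after the end of the quack time."""
    return [t for t in common if t >= sched_start + quack_time]


# Hypothetical observation scheduled to start at t=100 with 2 s integrations
# and a 2 s quack time; the data actually started 1 s late.
per_file = {
    "gpubox01": [101, 103, 105],
    "gpubox02": [101, 103, 105, 107],
}
common = common_timestamps(per_file)
print(common)                            # [101, 103, 105]
print(good_timestamps(common, 100, 2))   # [103, 105]
```

As in the example above, only the first timestamp falls inside the quack time and would be flagged.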