Cotter is the name of André Offringa's pre-processing pipeline for MWA data. A description of Cotter can be found in: The Low-Frequency Environment of the Murchison Widefield Array: Radio-Frequency Interference Analysis and Mitigation, Offringa et al., 2015, PASA, 32, e008. Cotter was used by the MWA ASVO to run conversion jobs and was also used by various MWA science teams to pre-process raw visibilities before passing the data on to their pipelines.
Note |
---|
Cotter is no longer actively supported, does not support the new MWAX data format (introduced in late 2021), and has several known bugs and issues. Please use the current, supported MWA pre-processing tool, Birli (MWA ASVO uses Birli under the hood for conversion jobs). |
Features
The most important actions Cotter performs are (in order):
1. Apply the mapping from input number to tile number/polarisation index and from coarse channel to frequency
2. Correct the phases for the cable length delay
3. Calculate u,v,w coordinates and other required metadata
4. Shift the phases from zenith to the pointing centre
5. Correct for the digital gains of the coarse channels and (try to) correct the coarse-channel filter shapes created by the poly-phase filters
6. Flag invalid channels (the DC channels and the outermost channels of each coarse channel) and the first 4 seconds (corrupt because of the delay in tile initialization)
7. RFI detection (for which it uses the AOFlagger library)
8. Calculate some statistics
9. (Optionally) average in the time or frequency direction and calculate weights
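The channel-flagging step above can be illustrated with a minimal sketch. It is not Cotter's own code: the function name, the choice of the centre fine channel as the DC channel, and the reading of the edge width as applying to each side of the coarse channel are all assumptions made for illustration. The example numbers assume a 1.28 MHz coarse channel split into 128 fine channels of 10 kHz.

```python
import numpy as np

def coarse_channel_flag_mask(n_fine, fine_width_khz, edge_width_khz, flag_dc=True):
    """Return a boolean mask over the fine channels of one coarse channel.

    True means "flag this fine channel". Hypothetical helper, not part of Cotter.
    """
    mask = np.zeros(n_fine, dtype=bool)
    # Number of fine channels covered by the requested edge width (per side).
    n_edge = int(round(edge_width_khz / fine_width_khz))
    if n_edge > 0:
        mask[:n_edge] = True
        mask[-n_edge:] = True
    if flag_dc:
        # Assumption: the DC channel is the centre fine channel.
        mask[n_fine // 2] = True
    return mask

# Assumed layout: 128 fine channels of 10 kHz, 160 kHz flagged at each edge.
mask = coarse_channel_flag_mask(n_fine=128, fine_width_khz=10.0, edge_width_khz=160.0)
print(mask.sum(), "of", mask.size, "fine channels flagged")
```

The same mask would be repeated for every coarse channel in the observation.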
Cotter was primarily used to convert the raw output of the MWA telescope ("gpubox files" containing visibilities) into an intermediate format for calibration: either a CASA measurement set or UVFITS. Bear in mind that André's "calibrate" program requires measurement sets for calibration. UVFITS files can be converted to measurement sets with CASA.
...
Just run "cotter" to see the up-to-date help information. This is also a (crude) way to determine if Cotter is installed and working. A more complicated example:
Code Block | ||
---|---|---|
| ||
OBSID=1065880128 |
...
MEASUREMENT_SET=$OBSID.ms |
...
TIME_AVERAGE=4 |
...
FREQ_AVERAGE=40 |
...
EDGEWIDTH=160 |
...
MAX_MEMORY_GIGABYTES=40 |
...
cotter -absmem $MAX_MEMORY_GIGABYTES \ |
...
-o "$MEASUREMENT_SET" \ |
...
-m "$OBSID".metafits \ |
...
-timeres $TIME_AVERAGE \ |
...
-freqres $FREQ_AVERAGE \ |
...
-edgewidth $EDGEWIDTH \ |
...
-allowmissing \ |
...
./??????????_*gpubox*.fits |
After you have created a measurement set with Cotter, you can view its associated statistics with the 'aoqplot' tool (part of the AOFlagger package). You can run it like:
Code Block | ||
---|---|---|
| ||
aoqplot preprocessed.ms |
...
You can combine the statistics of multiple sets by specifying multiple measurement sets on the command line, e.g.:
Code Block | ||
---|---|---|
| ||
aoqplot *.ms |
Compression
Cotter4 can make use of 'Dysco' compression, which I'll describe briefly here. To use compression, Dysco must be installed separately; it is available at https://github.com/aroffringa/dysco . On GitHub you can also find more information on the Dysco wiki, and there is a paper describing the technique. Compression is only used when explicitly requested, so the Dysco software is not needed when running Cotter without it. Request compression by adding '-use-dysco' to the command line. A run with default compression settings looks like:
Code Block | ||
---|---|---|
| ||
cotter4 -use-dysco -m 1077974936_metafits_ppds.fits -timeres 4 -freqres 80 *gpubox*.fits |
...
The advanced compression parameters can be set with a call like this:
Code Block | ||
---|---|---|
| ||
cotter4 -use-dysco -dysco-config 12 12 TruncatedGaussian 1.5 RF -m 1077974936_metafits_ppds.fits -timeres 4 -freqres 80 *gpubox*.fits |
...
This uses the following settings: 12 bits per data float, 12 bits per weight float, optimization for a Gaussian distribution truncated at 1.5 sigma, and row-frequency normalization (see the cotter4 options and the Dysco wiki for more information about these).
...
Example code for reading the flag files in python:
Code Block | ||
---|---|---|
| ||
#!/usr/bin/env python
# pyfits is deprecated; astropy.io.fits is its successor with the same interface
from astropy.io import fits
import numpy
hdulist = fits.open('1078413440_01.mwaf')
# binary table is in 2nd HDU, first HDU is just metadata
d = hdulist[1].data['FLAGS']
nchan = hdulist[0].header['NCHANS']
nant = hdulist[0].header['NANTENNA']
ntime = hdulist[0].header['NSCANS']
# number of baselines including autos (integer division keeps it an int)
nbl = nant*(nant+1)//2
# shape of data is (ntimes*nbaselines,nchan)
# where nbaselines includes autos. Time index varies most slowly
dr = d.reshape(ntime, nbl, nchan)
# dump first time step of flags to file
dr[0,:,:].tofile('/tmp/dumppy.dat')
# get the mean flagging per baseline (averaged over channel, then time)
dbaseline = dr.mean(axis=2).mean(axis=0)
# spread the per-baseline means into a matrix with nant x nant entries,
# including flags for the autocorrelations
dtile = numpy.zeros((nant, nant))
k = 0
for i in range(nant):
    for j in range(i, nant):
        dtile[i,j] = dbaseline[k]
        dtile[j,i] = dbaseline[k]
        k += 1 |
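The nested loop at the end of the example can also be written without explicit Python loops. The sketch below uses synthetic random flags as a stand-in for the FLAGS column, so it runs without an .mwaf file. The helper name is made up for illustration, and the baseline ordering (antenna i paired with every j >= i, autos included) is taken from the loop above, not checked independently against the file format.

```python
import numpy as np

def baseline_means_to_tile_matrix(per_baseline, n_ant):
    """Spread per-baseline values (autos included) into a symmetric matrix.

    Hypothetical helper; assumes baselines are ordered (0,0), (0,1), ...,
    (0,n-1), (1,1), ... as in the nested loop of the example above.
    """
    per_baseline = np.asarray(per_baseline, dtype=float)
    n_bl = n_ant * (n_ant + 1) // 2
    if per_baseline.size != n_bl:
        raise ValueError("expected %d baselines, got %d" % (n_bl, per_baseline.size))
    # Upper-triangle indices come out in row-major order, matching that loop.
    rows, cols = np.triu_indices(n_ant)
    matrix = np.zeros((n_ant, n_ant))
    matrix[rows, cols] = per_baseline
    matrix[cols, rows] = per_baseline
    return matrix

# Synthetic stand-in for the FLAGS column: 3 time steps, 4 antennas, 8 channels.
n_time, n_ant, n_chan = 3, 4, 8
n_bl = n_ant * (n_ant + 1) // 2
rng = np.random.default_rng(0)
flags = rng.integers(0, 2, size=(n_time * n_bl, n_chan))

cube = flags.reshape(n_time, n_bl, n_chan)
per_baseline = cube.mean(axis=(0, 2))   # flagged fraction per baseline
tile_matrix = baseline_means_to_tile_matrix(per_baseline, n_ant)
per_tile = tile_matrix.mean(axis=1)     # average flagged fraction per tile
print(per_tile)
```

Averaging each row of the matrix gives a single flagged fraction per tile, which is a quick way to spot a tile that is flagged far more than the rest.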
...
References
The following conference paper gives a broad overview of the steps taken in AOFlagger. The numbers in it are somewhat outdated, but the pipeline is still the same and the paper is helpful for a first impression: "A LOFAR RFI detection pipeline and its first results", http://arxiv.org/abs/1007.2089
The flagger contains two algorithmic steps that differentiate it from other flaggers. The first is the SumThreshold algorithm, a combinatorial thresholding algorithm that looks for line-shaped features in the data. It is combined with a high-pass filter so that the signal itself is not flagged. Both the background fitting and the SumThreshold algorithm are described and tested in this paper: "Post-correlation radio frequency interference classification methods", http://arxiv.org/abs/1002.1957 There is also a technical report on how to implement SumThreshold in a fast, vectorized way: http://www.astro.rug.nl/~offringa/SumThreshold.pdf
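To give a feel for the combinatorial thresholding idea, here is a heavily simplified one-dimensional sketch. It is not the AOFlagger implementation: it works on a single sequence of positive values, omits the high-pass filtering step, and the function name, the default factor rho = 1.5 and the window sizes are choices made for this illustration. A run of samples is flagged when its sum exceeds the window length times a threshold that shrinks as the window grows.

```python
import numpy as np

def sum_threshold_1d(values, chi1, rho=1.5, max_window=8):
    """Simplified 1-D SumThreshold sketch; returns a boolean flag mask."""
    values = np.asarray(values, dtype=float)
    flags = np.zeros(values.size, dtype=bool)
    window = 1
    while window <= max_window and window <= values.size:
        # The per-sample threshold decreases for longer windows.
        chi = chi1 / rho ** np.log2(window)
        # Samples flagged in earlier passes count as exactly the threshold,
        # so a strong spike does not drag its neighbours over the limit.
        work = np.where(flags, chi, values)
        new_flags = flags.copy()
        for start in range(values.size - window + 1):
            if work[start:start + window].sum() > window * chi:
                new_flags[start:start + window] = True
        flags = new_flags
        window *= 2
    return flags

# A weak but persistent feature: no single sample exceeds chi1 = 6,
# yet the run of four 5s is caught once pairs of samples are summed.
weak_run = sum_threshold_1d([0, 0, 0, 5, 5, 5, 5, 0, 0, 0], chi1=6.0)
# A single strong spike is caught by the window of length 1 only.
spike = sum_threshold_1d([0, 0, 10, 0], chi1=6.0)
print(weak_run.astype(int), spike.astype(int))
```

The first example shows why summing helps: a plain per-sample threshold of 6 would miss the feature entirely.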
...
Cotter used to use values that I measured in 2012 from a calibrator scan with the 32-tile commissioning array, by stacking subbands and normalizing the resulting passband. We are currently (April 2014) planning to change this to exact values derived from the PFB coefficients by Alan Levine. These are his memo and his gains:
sim2n_totpwr_dat.txt: Alan Levine's bandpass gains used to create his relative power figure (dead link)
MwaPfbProtoFilterCoeff2009_512x8.dat - Filter coefficients used to calculate the PFB gains
The first column in sim2n_totpwr_dat.txt is the fine frequency channel, and the third column is the bandpass gain in dB. Cotter takes relative power, so the dB values must be converted before they can be used.
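A minimal sketch of that conversion follows. It assumes the dB values are power ratios, so the linear value is 10^(dB/10); if they were amplitude gains the exponent would be dB/20 instead. The function names are made up for illustration, and the file is assumed to be whitespace-separated, which is a guess about its layout.

```python
import numpy as np

def db_to_relative_power(gain_db):
    """Convert gains in dB to linear power ratios: 10 ** (dB / 10)."""
    return 10.0 ** (np.asarray(gain_db, dtype=float) / 10.0)

def load_bandpass_db(source):
    """Read the fine channel (column 1) and gain in dB (column 3).

    Hypothetical helper. 'source' is a file name or an open text file;
    whitespace-separated columns are an assumption about the format.
    """
    channel, gain_db = np.loadtxt(source, usecols=(0, 2), unpack=True)
    return channel.astype(int), gain_db

# Conversion check on round numbers: 0 dB -> 1.0, -10 dB -> 0.1, -20 dB -> 0.01
print(db_to_relative_power([0.0, -10.0, -20.0]))
```

With the file at hand, `load_bandpass_db("sim2n_totpwr_dat.txt")` would return the two columns, and the second can be passed straight to `db_to_relative_power`.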