hyperdrive is still in heavy development, but it can already perform direction-independent calibration on a GPU or CPU.
More documentation: https://mwatelescope.github.io/mwa_hyperdrive/index.html
Project homepage: https://github.com/MWATelescope/mwa_hyperdrive
...
Code Block | ||||||||
---|---|---|---|---|---|---|---|---|
| ||||||||
---------------------------------- /pawsey/mwa/software/python3/modulefiles ----------------------------------
hyperdrive/chj    hyperdrive/v0.2.0-alpha11 (L,D) |
Load a hyperdrive module:
Code Block | ||||
---|---|---|---|---|
| ||||
module load hyperdrive # this will load the default version |
hyperdrive prefers to use the FEE beam when it's applicable. The associated beam code (hyperbeam) requires that the MWA FEE beam file be available at runtime; its path is given either manually with a command-line argument to hyperdrive (--beam-file), or with the MWA_BEAM_FILE environment variable. garrawarla users typically don't need to worry about this, because the hyperdrive modules automatically set MWA_BEAM_FILE.
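The lookup order described above can be sketched in Python. This is only an illustration of the precedence (an explicit path first, then the MWA_BEAM_FILE environment variable); the helper name `resolve_beam_file` is hypothetical and is not part of hyperdrive or hyperbeam.

```python
import os
from typing import Mapping, Optional


def resolve_beam_file(cli_path: Optional[str] = None,
                      env: Optional[Mapping[str, str]] = None) -> str:
    """Return the FEE beam file path.

    Hypothetical helper: an explicitly supplied path (as with --beam-file)
    wins; otherwise fall back to the MWA_BEAM_FILE environment variable.
    """
    env = os.environ if env is None else env
    path = cli_path or env.get("MWA_BEAM_FILE")
    if not path:
        raise RuntimeError(
            "No FEE beam file: pass --beam-file or set MWA_BEAM_FILE"
        )
    return path
```

A wrapper script could call `resolve_beam_file()` before submitting a job, so that a missing beam file is caught on the login node rather than in the queue.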
How do I get started?
Have a look at the help text!
The following is current as of 21 February 2022.
See help text:
Code Block | ||||
---|---|---|---|---|
| ||||
hyperdrive -h # -h could also be --help |
Code Block | ||||||||
---|---|---|---|---|---|---|---|---|
| ||||||||
hyperdrive 0.2.0-alpha9
https://github.com/MWATelescope/mwa_hyperdrive
Calibration software for the Murchison Widefield Array (MWA) radio telescope
USAGE:
hyperdrive <SUBCOMMAND>
OPTIONS:
-h, --help Print help information
-V, --version Print version information
SUBCOMMANDS:
di-calibrate Perform direction-independent calibration on the input MWA data. See for more
info: https://github.com/MWATelescope/mwa_hyperdrive/wiki/Calibration-usage
simulate-vis Simulate visibilities of a sky-model source list
solutions-convert Convert between calibration solution file formats
solutions-plot Plot calibration solutions
srclist-by-beam Reduce a sky-model source list to the top N brightest sources, given pointing
information
srclist-convert Convert a sky-model source list from one format to another
srclist-shift Shift the sources in a source list. Useful to correct for the ionosphere. The
shifts must be detailed in a .json file, with source names as keys associated with
an "ra" and "dec" in degrees. Only the sources specified in the .json are written
to the output source list
srclist-verify Verify that sky-model source lists can be read by hyperdrive
dipole-gains Print information on the dipole gains listed by a metafits file
|
hyperdrive is broken up into many subcommands. Each of these has its own help; e.g.
Code Block | ||||||||
---|---|---|---|---|---|---|---|---|
| ||||||||
hyperdrive-di-calibrate 0.2.0-alpha9
Perform direction-independent calibration on the input MWA data. See for more info:
https://github.com/MWATelescope/mwa_hyperdrive/wiki/Calibration-usage
USAGE:
hyperdrive di-calibrate [OPTIONS] [--] [ARGUMENTS_FILE]
ARGS:
<ARGUMENTS_FILE> All of the arguments to di-calibrate may be specified in a toml or json file. Any CLI arguments
override parameters set in the file
OPTIONS:
-v, --verbosity The verbosity of the program. Increase by specifying multiple times (e.g. -vv). The default is to print
only high-level information
--dry-run Don't actually do calibration; just verify that arguments were correctly ingested and print out high-
level information
-h, --help Print help information
-V, --version Print version information
INPUT FILES:
-d, --data <DATA>... Paths to input data files to be calibrated. These can include a metafits file,
gpubox files, mwaf files, a measurement set and/or uvfits files
-s, --source-list <SOURCE_LIST> Path to the sky-model source list file
--source-list-type <SOURCE_LIST_TYPE> The type of sky-model source list. Valid types are: hyperdrive, rts, woden,
ao. If not specified, all types are attempted
OUTPUT FILES:
-o, --outputs <OUTPUTS>...
Paths to the calibration output files. Supported calibrated visibility outputs: uvfits. Supported calibration
solution formats: fits, bin. Default: hyperdrive_solutions.bin
-m, --model-filename <MODEL_FILENAME>
The path to the file where the generated sky-model visibilities are written. If this argument isn't supplied, then
no file is written. Supported formats: uvfits
--ignore-autos
When writing out calibrated visibilities, don't include auto-correlations
--output-vis-time-average <OUTPUT_VIS_TIME_AVERAGE>
When writing out calibrated visibilities, average this many timesteps together. Also supports a target time
resolution (e.g. 8s). The value must be a multiple of the input data's time resolution. The default is to preserve
the input data's time resolution. e.g. If the input data is in 0.5s resolution and this variable is 4, then we
average 2s worth of calibrated data together before writing the data out. If the variable is instead 4s, then 8
calibrated timesteps are averaged together before writing the data out
--output-vis-freq-average <OUTPUT_VIS_FREQ_AVERAGE>
When writing out calibrated visibilities, average this many fine freq. channels together. Also supports a target
freq. resolution (e.g. 80kHz). The value must be a multiple of the input data's freq. resolution. The default is to
preserve the input data's freq. resolution. e.g. If the input data is in 40kHz resolution and this variable is 4,
then we average 160kHz worth of calibrated data together before writing the data out. If the variable is instead
80kHz, then 2 calibrated fine freq. channels are averaged together before writing the data out
SKY-MODEL SOURCES:
-n, --num-sources <NUM_SOURCES>
The number of sources to use in the source list. The default is to use them all. Example: If 1000 sources are
specified here, then the top 1000 sources are used (based on their flux densities after the beam attenuation)
within the specified source distance cutoff
--source-dist-cutoff <SOURCE_DIST_CUTOFF>
Specifies the maximum distance from the phase centre a source can be [degrees]. Default: 50
--veto-threshold <VETO_THRESHOLD>
Specifies the minimum Stokes XX+YY a source must have before it gets vetoed [Jy]. Default: 0.01
BEAM:
--beam-file <BEAM_FILE> The path to the HDF5 MWA FEE beam file. If not specified, this must be provided by the
MWA_BEAM_FILE environment variable
--unity-dipole-gains Pretend that all MWA dipoles are alive and well, ignoring whatever is in the metafits file
--delays <DELAYS>... If specified, use these dipole delays for the MWA pointing
--no-beam Don't apply a beam response when generating a sky model. The default is to use the FEE beam
CALIBRATION:
-t, --time-average-factor <TIME_AVERAGE_FACTOR>
The number of time samples to average together during calibration. Also supports a target time resolution (e.g.
8s). If this is 0, then all data are averaged together. Default: 0. e.g. If this variable is 4, then we produce
calibration solutions in timeblocks with up to 4 timesteps each. If the variable is instead 4s, then each timeblock
contains up to 4s worth of data
-f, --freq-average-factor <FREQ_AVERAGE_FACTOR>
The number of fine-frequency channels to average together before calibration. If this is 0, then all data is
averaged together. Default: 1. e.g. If the input data is in 20kHz resolution and this variable was 2, then we
average 40kHz worth of data into a chanblock before calibration. If the variable is instead 40kHz, then each
chanblock contains upto 40kHz worth of data
--timesteps <TIMESTEPS>...
The timesteps to use from the input data. The timesteps will be ascendingly sorted for calibration. No duplicates
are allowed. The default is to use all unflagged timesteps
--uvw-min <UVW_MIN>
The minimum UVW length to use. This value must have a unit annotated. Allowed units: λ, kλ, l, kl, lambda, klambda,
m, km. Default: 50λ
--uvw-max <UVW_MAX>
The maximum UVW length to use. This value must have a unit annotated. Allowed units: λ, kλ, l, kl, lambda, klambda,
m, km. No default.
--max-iterations <MAX_ITERATIONS>
The maximum number of times to iterate when performing "MitchCal". Default: 50
--stop-thresh <STOP_THRESH>
The threshold at which we stop iterating when performing "MitchCal". Default: 1e-8
--min-thresh <MIN_THRESH>
The minimum threshold to satisfy convergence when performing "MitchCal". Even when this threshold is exceeded,
iteration will continue until max iterations or the stop threshold is reached. Default: 1e-4
--array_longitude <ARRAY_LONGITUDE_DEG>
The Earth longitude of the instrumental array [degrees]. Default (MWA): 116.67081523611111°
--array_latitude <ARRAY_LATITUDE_DEG>
The Earth latitude of the instrumental array [degrees]. Default (MWA): -26.703319405555554°
--cpu
Use the CPU for visibility generation. This is deliberately made non-default because using a GPU is much faster
FLAGGING:
--tile-flags <TILE_FLAGS>...
Additional tiles to be flagged. These values correspond to either the values in the "Antenna" column of HDU 2 in
the metafits file (e.g. 0 3 127), or the "TileName" (e.g. Tile011)
--ignore-input-data-tile-flags
If specified, pretend that all tiles are unflagged in the input data
--ignore-input-data-fine-channel-flags
If specified, pretend all fine channels in the input data are unflagged
--fine-chan-flags-per-coarse-chan <FINE_CHAN_FLAGS_PER_COARSE_CHAN>...
The fine channels to be flagged in each coarse channel. e.g. 0 1 16 30 31 are typical for 40 kHz data. If this is
not specified, it defaults to flagging 80 kHz (or as close to this as possible) at the edges, as well as the centre
channel for non-MWAX data
--fine-chan-flags <FINE_CHAN_FLAGS>...
The fine channels to be flagged across the whole observation band. e.g. 0 767 are the first and last fine channels
for 40 kHz data
RAW MWA DATA:
--pfb-flavour <PFB_FLAVOUR> The 'flavour' of poly-phase filter bank corrections applied to raw MWA data. The
default is 'empirical'. Valid flavours are: empirical, levine, none
--no-digital-gains When reading in raw MWA data, don't apply digital gains
--no-cable-length-correction When reading in raw MWA data, don't apply cable length corrections. Note that some data
may have already had the correction applied before it was written
--no-geometric-correction When reading in raw MWA data, don't apply geometric corrections. Note that some data
may have already had the correction applied before it was written
USER INTERFACE:
--no-progress-bars When reading in visibilities and generating sky-model visibilities, don't draw progress bars
|
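The help text for --fine-chan-flags-per-coarse-chan says the default is to flag 80 kHz at each coarse-channel edge, plus the centre channel for non-MWAX data. A minimal sketch of that rule is below. It assumes the standard 1.28 MHz MWA coarse channel width; the function name and its rounding behaviour are illustrative assumptions, not hyperdrive's actual implementation.

```python
from typing import List

COARSE_CHAN_WIDTH_KHZ = 1280  # assumed: MWA coarse channels are 1.28 MHz wide


def default_fine_chan_flags(freq_res_khz: int,
                            edge_width_khz: int = 80,
                            flag_centre: bool = True) -> List[int]:
    """Fine channels to flag within one coarse channel (illustrative only)."""
    n_fine = COARSE_CHAN_WIDTH_KHZ // freq_res_khz
    # Number of fine channels covering the edge width, as close as possible.
    n_edge = round(edge_width_khz / freq_res_khz)
    flags = set(range(n_edge)) | set(range(n_fine - n_edge, n_fine))
    if flag_centre:  # the help text says this applies to non-MWAX data
        flags.add(n_fine // 2)
    return sorted(flags)


print(default_fine_chan_flags(40))  # [0, 1, 16, 30, 31]
```

For 40 kHz data this reproduces the "0 1 16 30 31" example quoted in the help text.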
DI calibration
Available with hyperdrive di-calibrate
Two main things are required to calibrate visibilities:
- Raw data (gpubox files or MWAX ch??? files) or data container (measurement set or uvfits); and
- A sky-model source list.
Discussion on the source lists and the applicable formats can be found here.
By default, hyperdrive will attempt to use all sources in the source list file. If there are more than 1,000 sources in the file, then calibration may take a long time if you're not using a GPU. To keep the number of sources low, use the -n/--num-sources and/or --veto-threshold flags, or use a source list with fewer sources in the first place (see hyperdrive srclist-by-beam).
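To make the effect of these flags concrete, here is a rough Python sketch of the selection they describe: drop sources beyond the distance cutoff, veto those below the flux threshold, then keep the N brightest. The `Source` type and `select_sources` function are hypothetical, and the single beam-attenuated flux value is a simplification; hyperdrive's own logic may differ in detail. The default values are taken from the help text above.

```python
from typing import List, NamedTuple, Optional


class Source(NamedTuple):
    name: str
    dist_deg: float          # separation from the phase centre [degrees]
    apparent_flux_jy: float  # beam-attenuated flux density [Jy] (simplified)


def select_sources(sources: List[Source],
                   num_sources: Optional[int] = None,
                   dist_cutoff_deg: float = 50.0,
                   veto_threshold_jy: float = 0.01) -> List[Source]:
    """Illustrative version of -n / --source-dist-cutoff / --veto-threshold."""
    kept = [
        s for s in sources
        if s.dist_deg <= dist_cutoff_deg
        and s.apparent_flux_jy >= veto_threshold_jy
    ]
    # Brightest first, then truncate to the requested number of sources.
    kept.sort(key=lambda s: s.apparent_flux_jy, reverse=True)
    return kept if num_sources is None else kept[:num_sources]
```

With `num_sources=None` every surviving source is used, which mirrors the default of using the whole source list.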
...
module load hyperdrive/chj # load CHJ's development version |
Example Slurm script
Code Block | ||||
---|---|---|---|---|
| ||||
#!/bin/bash -l
#SBATCH --job-name=hyp-$1
#SBATCH --output=hyperdrive.out
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=40
#SBATCH --time=01:00:00
#SBATCH --clusters=garrawarla
#SBATCH --partition=gpuq
#SBATCH --account=mwaeor
#SBATCH --export=NONE
#SBATCH --gres=gpu:1,tmp:50g
module use /pawsey/mwa/software/python3/modulefiles
module load hyperdrive
set -eux
command -v hyperdrive
cd /astro/mwaeor/MWA/data/1090008640
# Get calibration solutions. Use the top 1000 sources.
hyperdrive di-calibrate \
    -s /pawsey/mwa/software/python3/srclists/master/srclist_pumav3_EoR0aegean_fixedEoR1pietro+ForA_phase1+2.txt \
    -n 1000 \
    -d *gpubox*.fits *.metafits |
Writing out calibrated visibilities
hyperdrive can write out calibrated visibilities, but only what was read in for calibration. This means that any omitted timesteps are also omitted in the output. Soon, a solutions-apply subcommand will allow any solutions file to be applied to any input data.
The output calibrated visibilities can also be averaged in time and frequency (by multiples of the input resolution or to a target quantity).
Code Block | ||||||
---|---|---|---|---|---|---|
| ||||||
# Write solutions to "hyp_sols.fits" and calibrated vis to "hyp_cal.uvfits"
hyperdrive di-calibrate -s srclist_1000.yaml \
    -d *gpubox*.fits *.metafits \
    *.mwaf \
    -o hyp_sols.fits hyp_cal.uvfits \
    --output-vis-time-average 2 \
    --output-vis-freq-average 80kHz |
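The two ways of specifying averaging (a multiple of the input resolution, or a target resolution with a unit) can be illustrated with a short Python sketch. The function `averaging_factor` is hypothetical and only handles the units that appear in the help text (`s` and `kHz`); hyperdrive's own parser may accept more and may treat edge cases differently.

```python
def averaging_factor(value: str, input_res: float, unit: str) -> int:
    """Number of input samples averaged into one output sample.

    value:     "4" (a plain factor) or "8s" / "80kHz" (a target resolution)
    input_res: resolution of the input data, in the same unit as `unit`
    unit:      "s" for time or "kHz" for frequency (assumed unit spellings)
    """
    value = value.strip()
    if value.endswith(unit):
        target = float(value[: -len(unit)])
        ratio = target / input_res
        factor = round(ratio)
        # The target must be a whole multiple of the input resolution.
        if factor < 1 or abs(ratio - factor) > 1e-9:
            raise ValueError(
                f"{value} is not a multiple of the input resolution "
                f"({input_res}{unit})"
            )
        return factor
    return int(value)


print(averaging_factor("4", 0.5, "s"))       # 4 timesteps, i.e. 2 s of data
print(averaging_factor("4s", 0.5, "s"))      # 8 timesteps
print(averaging_factor("80kHz", 40, "kHz"))  # 2 fine channels
```

These three calls correspond to the worked examples given in the help text for --output-vis-time-average and --output-vis-freq-average.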
Plotting calibration solutions
Any DI solutions files that are compatible with hyperdrive (including the output of André's calibrate and of the RTS) can be plotted directly with hyperdrive. If using a supercomputer, there's no need to run the job in the queue; it's fast enough to just run it on the login node. It's also good to plot with the corresponding metafits file to get more information:
Code Block | ||||
---|---|---|---|---|
| ||||
hyperdrive solutions-plot -m *.metafits hyp_sols.fits |
If you want to do more analysis with Python, this code reads and plots the hyperdrive format:
Code Block | ||||||||
---|---|---|---|---|---|---|---|---|
| ||||||||
#!/usr/bin/env python
import sys
import numpy as np
from astropy.io import fits
import matplotlib.pyplot as plt
if len(sys.argv) == 1:
    filename = "hyp_sols.fits"
else:
    filename = sys.argv[1]
f = fits.open(filename)
data = f[1].data
# Only looking at the first timeblock.
i_timeblock = 0
data = data[i_timeblock, :, :, ::2] + data[i_timeblock, :, :, 1::2] * 1j
# Uncomment if you want to divide by a reference.
# i_tile_ref = -1
# refs = []
# for ref in data[i_tile_ref].reshape((-1, 2, 2)):
#     refs.append(np.linalg.inv(ref))
# refs = np.array(refs)
# j_div_ref = []
# for tile_j in data:
#     for (j, ref) in zip(tile_j, refs):
#         j_div_ref.append(j.reshape((2, 2)).dot(ref))
# data = np.array(j_div_ref).reshape(data.shape)
# Amps
amps = np.abs(data)
_, ax = plt.subplots(8, 16, sharex=True, sharey=True)
# Uncomment if you want to manually set the y-limit
# ax[0, 0].set_ylim(0, 2)
for i in range(128):
    ax[i // 16, i % 16].plot(amps[i, :, 0].flatten()) # XX
    ax[i // 16, i % 16].plot(amps[i, :, 3].flatten()) # YY
plt.show()
# Phases
phases = np.rad2deg(np.angle(data))
_, ax = plt.subplots(8, 16, sharex=True, sharey=True)
ax[0, 0].set_ylim(-180, 180)
for i in range(128):
    ax[i // 16, i % 16].plot(phases[i, :, 0].flatten()) # XX
    ax[i // 16, i % 16].plot(phases[i, :, 3].flatten()) # YY
plt.show() |
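The script above relies on the last axis holding eight floats per tile and channel, stored as interleaved real and imaginary parts. The dependency-free sketch below shows that unpacking for a single entry. The script confirms that index 0 is XX and index 3 is YY; placing XY and YX at indices 1 and 2 is an assumption, and the function names are hypothetical.

```python
import cmath
import math
from typing import List, Sequence, Tuple


def unpack_jones(floats: Sequence[float]) -> List[complex]:
    """Turn 8 floats (re, im, re, im, ...) into 4 complex Jones elements.

    Order assumed here: [XX, XY, YX, YY].
    """
    if len(floats) != 8:
        raise ValueError("expected 8 floats (4 complex values)")
    # Even positions are real parts, odd positions are imaginary parts,
    # matching data[..., ::2] + data[..., 1::2] * 1j in the script above.
    return [complex(re, im) for re, im in zip(floats[::2], floats[1::2])]


def amp_and_phase_deg(z: complex) -> Tuple[float, float]:
    """Amplitude and phase (in degrees) of one Jones element."""
    return abs(z), math.degrees(cmath.phase(z))


jones = unpack_jones([1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 2.0])
xx_amp, xx_phase = amp_and_phase_deg(jones[0])  # XX
yy_amp, yy_phase = amp_and_phase_deg(jones[3])  # YY
```

The same amplitude and phase quantities are what the plotting loops compute for every tile with `np.abs` and `np.angle`.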
Planned features
As hyperdrive is still in heavy development, not all features are currently available. An indication of what is available is below.
- Reads raw MWA data
- Reads a single uvfits file as input
- Reads multiple uvfits files as input
- Reads a single measurement set file as input
- Reads multiple measurement set files as input
- Calibrates on the CPU
- Calibrates on a GPU
- Writes calibration solutions to the "André Offringa calibrate format"
- Writes calibration solutions in the "RTS format"
- Writes calibrated visibilities directly to uvfits output
- Writes calibrated visibilities directly to measurement set output
# Apply the solutions and write out a measurement set.
# Write it to /nvmetmp as that's much faster than /astro.
hyperdrive solutions-apply \
-d *gpubox*.fits *.metafits *.mwaf \
-s hyp_sols.fits \
-o /nvmetmp/hyp_calibrated.ms \
--time-average 8s \
--freq-average 80kHz
# Move the measurement set to /astro.
mv /nvmetmp/hyp_calibrated.ms . |
This example script reserves 50 GB of node-local storage (/nvmetmp). If your output visibilities are bigger than this, then the write will fail; adjust the #SBATCH --gres=gpu:1,tmp:50g line to account for this, e.g. #SBATCH --gres=gpu:1,tmp:200g
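A back-of-the-envelope estimate can help when choosing the `tmp` size. The Python sketch below counts only the raw visibility payload (4 polarisations of 8-byte complex values per baseline, timestep and channel). A real measurement set also stores weights, flags, UVWs and metadata, so this should be treated as a lower bound. The function names and the safety factor of 2 are assumptions made for illustration.

```python
import math


def estimate_vis_bytes(n_tiles: int, n_timesteps: int, n_chans: int,
                       include_autos: bool = True,
                       n_pols: int = 4, bytes_per_vis: int = 8) -> int:
    """Lower-bound size of the visibility data alone, in bytes."""
    n_baselines = n_tiles * (n_tiles - 1) // 2
    if include_autos:
        n_baselines += n_tiles
    return n_baselines * n_timesteps * n_chans * n_pols * bytes_per_vis


def gres_line(vis_bytes: float, safety: float = 2.0) -> str:
    """Suggest an #SBATCH --gres line with headroom (safety is a guess)."""
    gigabytes = max(1, math.ceil(vis_bytes * safety / 1e9))
    return f"#SBATCH --gres=gpu:1,tmp:{gigabytes}g"


# 128 tiles, with hypothetical output dimensions after averaging.
print(gres_line(estimate_vis_bytes(n_tiles=128, n_timesteps=14, n_chans=384)))
```

If the estimate is close to the reserved amount, it is safer to request more space than to risk the write failing partway through.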