Running CHIPS

This page is under active development. Scroll down to the "Running on OzStar" section if you want to run CHIPS on OzStar. Updated instructions for Garrawarla will soon be added - J. Line

Flow

CHIPS grids a uvfits file, producing multiple .dat files that can be used for cross-power and 1D power spectrum analysis.

The name of each .dat file usually contains the polarisation (xx or yy) and a unique extension (ext) that identifies the file.

.dat files from multiple uvfits files can be combined together.

arguments:

  • ext - unique name for the grid
  • eorband - either 0 (low, 139-170 MHz) or 1 (high, 167-198 MHz)
  • eorfield - either 0 (radec=0,-27 deg) or 1 (radec=60,-30 deg)
  • nchans - number of channels to analyse
  • freq_idx_start - first channel number to analyse
  • pol - polarization to analyse (xx or yy)
  • period - integration time in seconds (e.g. 8.0)
  • chanwidth - frequency resolution in Hz (e.g. 80000)
  • lowfreq - first frequency in the data in Hz
  • umax - max uvw (default 300)
  • addsub - add (0) or subtract (1)
  • nbins - number of power spectra bins (e.g. 80)
  • bias_mode - one of 0/10/11/12 e.g. 0
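The frequency arguments are tied together by simple arithmetic: channel i sits at lowfreq + i * chanwidth. Here is a minimal sketch using the example values that appear on this page (chanwidth 80000 Hz, lowfreq 167.075e6 Hz, 384 channels). Whether CHIPS treats lowfreq as a channel centre or a channel edge is not stated here, so treat the numbers as illustrative only.

```shell
# Illustrative values taken from the examples on this page
lowfreq=167075000      # Hz, first frequency in the data (167.075e6)
chanwidth=80000        # Hz, frequency resolution
nchans=384             # number of channels to analyse
freq_idx_start=0       # first channel number to analyse

# Frequency of the first and last channel that would be analysed
first_freq=$((lowfreq + freq_idx_start * chanwidth))
last_freq=$((lowfreq + (freq_idx_start + nchans - 1) * chanwidth))

echo "first channel: ${first_freq} Hz"
echo "last channel:  ${last_freq} Hz"
```

With these values the last channel comes out at 197.715 MHz, which sits inside the 167-198 MHz range quoted for the high band.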

environment variables:

  • DATADIR
  • INPUTDIR - where to look for input files
  • OUTPUTDIR - where resulting files are written to
  • OBSDIR
  • OMP_NUM_THREADS - number of threads to use
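Since the binaries read these variables from the environment, it can save a failed job to check them before launching anything. The helper below is a hypothetical sketch (check_chips_env is not part of CHIPS); it only assumes that the four directory variables listed above should be set and should point at existing directories.

```shell
# Hypothetical helper (not part of CHIPS): check that the directory
# variables listed above are set and point at existing directories.
check_chips_env() {
    rc=0
    for name in DATADIR INPUTDIR OUTPUTDIR OBSDIR; do
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "$name is not set" >&2
            rc=1
        elif [ ! -d "$value" ]; then
            echo "$name=$value is not a directory" >&2
            rc=1
        fi
    done
    return $rc
}

# Example: point everything at a scratch directory and run the check
workdir=$(mktemp -d)
export DATADIR="$workdir" INPUTDIR="$workdir/" OUTPUTDIR="$workdir/" OBSDIR="$workdir/"
export OMP_NUM_THREADS=4   # arbitrary example value
check_chips_env && echo "CHIPS environment looks usable"
```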

binaries

  • gridvisdiff - calculate the beam weights for an MWA observation
  • prepare_diff - combine data over frequency
  • combine_data - combine data over multiple gridded sets
  • lssa_fg_simple - compute the LS spectral power (no kriging) using diff, tot and weights uvf binary files

Creating grids

gridvisdiff $uvfits $obsid $ext $eorband -f $eorfield
# for pol in xx yy; do
prepare_diff $ext $nchans $freq_idx_start $pol $ext $eorband -p $period -c $chanwidth -n $lowfreq -u $umax
# this produces {vis_tot,vis_diff,noise_tot,noise_diff,weights}_${pol}.${ext}.dat in $OUTPUTDIR
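The commented-out "for pol in xx yy" line above hints at a loop over polarisations. A sketch of what that loop might look like is below. The run wrapper is a hypothetical helper that only prints each command unless DRY_RUN=0 is set, and all the values are example numbers lifted from this page, so substitute your own.

```shell
# Hypothetical dry-run wrapper: prints the command unless DRY_RUN=0
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Example values only; substitute your own
uvfits=uvdump_01.uvfits
obsid=1160576480
ext=test_highband_eor0
eorband=1
eorfield=0
nchans=384
freq_idx_start=0
period=8.0
chanwidth=80000
lowfreq=167.075e6
umax=300

# Gridding is done once per uvfits file, prepare_diff once per polarisation
run gridvisdiff "$uvfits" "$obsid" "$ext" "$eorband" -f "$eorfield"
for pol in xx yy; do
    run prepare_diff "$ext" "$nchans" "$freq_idx_start" "$pol" "$ext" "$eorband" \
        -p "$period" -c "$chanwidth" -n "$lowfreq" -u "$umax"
done
```

Printing the commands first makes it easy to spot an unset variable before any gridding time is spent.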

Combining grids

Combine grids with extensions ext1 and ext2 into a single grid with extension $group_ext:

# for pol in xx yy; do
# create a file containing all the exts to combine together prefixed with "${pol}."
export combinelist=combine_${pol}.${group_ext}.txt 
for ext in \
    "ext1" \
    "ext2" \
; do
    echo ${pol}.${ext} | tee -a ${combinelist}
done
# ensure {vis_tot,vis_diff,noise_tot,noise_diff,weights}_${pol}.${ext}.dat all exist in $OUTPUTDIR

combine_data ${combinelist} $nchans "${pol}.${group_ext}" $addsub
# this produces {crosspower,residpower,residpowerimag,totpower,flagpower,fg_num,outputweights}_${pol}_${bias_mode}.${ext}.dat in $OUTPUTDIR
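The "ensure ... all exist" comment above can be turned into an actual check. The sketch below assumes the file naming given in that comment (prefix_${pol}.${ext}.dat in $OUTPUTDIR); missing_grid_files is a hypothetical helper, not something shipped with CHIPS.

```shell
# Hypothetical helper (not part of CHIPS): print any of the five per-grid
# files that are missing from $OUTPUTDIR for a given pol and ext.
missing_grid_files() {
    g_pol=$1
    g_ext=$2
    for prefix in vis_tot vis_diff noise_tot noise_diff weights; do
        f="${OUTPUTDIR%/}/${prefix}_${g_pol}.${g_ext}.dat"
        [ -e "$f" ] || echo "$f"
    done
}

# Example: report anything that would stop ext1 and ext2 being combined
OUTPUTDIR=${OUTPUTDIR:-$PWD}
pol=xx
for ext in ext1 ext2; do
    missing=$(missing_grid_files "$pol" "$ext")
    if [ -n "$missing" ]; then
        echo "cannot combine ${pol}.${ext}; missing:" >&2
        echo "$missing" >&2
    fi
done
```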

Compute Power Spectra

# for pol in xx yy; do
lssa_fg_simple $ext $nchans $nbins $pol $umax $ext $bias_mode $eorband


Plotting power spectra

see: https://github.com/JLBLine/plot_CHIPS

Garrawarla (new)

# load the chips module
module use /astro/mwaeor/software/modulefiles
module load chips/cmt

# optional optimization: do everything in nvmetmp, need to request enough --tmp from slurm
mkdir /nvmetmp/deleteme
cd /nvmetmp/deleteme

export DATADIR="$PWD"
export INPUTDIR="$PWD/"
export OUTPUTDIR="$PWD/"
export OBSDIR="$PWD/"
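A fixed directory name like "deleteme" can collide between jobs and is easy to forget about. One possible variation is sketched below, using mktemp for a unique directory and a trap to clean it up. The fallback to $TMPDIR or /tmp is an assumption added so the sketch also runs on machines without /nvmetmp.

```shell
# Use /nvmetmp when it exists, otherwise fall back to $TMPDIR or /tmp
# (the fallback is an assumption so the sketch also runs elsewhere)
if [ -d /nvmetmp ] && [ -w /nvmetmp ]; then
    scratch_root=/nvmetmp
else
    scratch_root=${TMPDIR:-/tmp}
fi

# Unique working directory, removed automatically when the shell exits
workdir=$(mktemp -d "${scratch_root}/chips.XXXXXX")
trap 'rm -rf "$workdir"' EXIT

cd "$workdir" || exit 1
export DATADIR="$PWD"
export INPUTDIR="$PWD/"
export OUTPUTDIR="$PWD/"
export OBSDIR="$PWD/"
```

Because the trap deletes everything on exit, copy any products you want to keep out of the scratch directory before the job ends.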

Garrawarla (old)

The following is deprecated as CHIPS has moved.

CHIPS is available on Garrawarla at:

/astro/mwaeor/MWA/chips

Python plotting scripts are available in /astro/mwaeor/MWA/chips:

plotchips_1d.py - 1D PS

plotchips.py - 2D PS

plotchipsratio.py - ratio in 2D of two different datasets



External cloud processing: pared-down version of CHIPS (v2.8)

CHIPS version for external computing

The full version of CHIPS is a three-step data analysis suite:

  1. Grid visibilities over time as a function of frequency (parallel over frequency with OpenMP)
  2. Combine frequencies into a data cube
  3. Perform spectral Fourier Transform to form power

This version of CHIPS includes Steps 1 and 2 only

CHIPS software:

ANSI C, compiled with gcc, using OpenMP parallelisation
dependencies required: CFITSIO, LAPACK, BLAS, OpenMP, LIBSLA (astronomy)
file input: UVFITS
file outputs: complex floats, doubles (binary files, little endian)

USAGE:

Example usage: This example has 384 frequency channels split over 24 input uvfits files. Each channel is gridded separately, and then these frequencies are combined into a cube with prepare_diff:

#!/bin/bash

source env_variables.sh

ext='test_highband_eor0'

# grid coarse bands 01-09 (zero-padded file names)
for i in 1 2 3 4 5 6 7 8 9
do
    ./gridvisdiff $OUTPUTDIR/uvdump_0${i}.uvfits 1160576480 ${ext} 1 -f 0 -e 1
    echo ${i}
done

# grid coarse bands 10-24
for i in 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
do
    ./gridvisdiff $OUTPUTDIR/uvdump_${i}.uvfits 1160576480 ${ext} 1 -f 0 -e 1
    echo ${i}
done

# combine the gridded channels into a cube, then compute the power spectrum
./prepare_diff ${ext} 384 0 'yy' ${ext} 1 -p 2.0 -n 167.075e6
./lssa_fg_thermal ${ext} 384 80 'yy' 300. ${ext} 0 1 0
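The two gridding loops in the script above differ only in the zero padding of the file number, so they can be folded into one loop with printf. This is a sketch that assumes the files are numbered 01 to 24; it prints the commands rather than running them, so remove the leading echo once the output looks right.

```shell
ext='test_highband_eor0'
OUTPUTDIR=${OUTPUTDIR:-.}   # normally set by env_variables.sh

# Print the gridvisdiff command for each of the 24 coarse-band files
i=1
while [ "$i" -le 24 ]; do
    band=$(printf '%02d' "$i")   # 1 -> 01, 10 -> 10
    echo ./gridvisdiff "${OUTPUTDIR}/uvdump_${band}.uvfits" 1160576480 "${ext}" 1 -f 0 -e 1
    i=$((i + 1))
done
```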

IMPORTANT: For data with Nchan ≠ 384, kriging cannot be used. Instead, run with kriging off. E.g., for 120 channels:

./prepare_diff ${ext} 384 0 'yy' ${ext} 1 -p 2.0 -n 167.075e6
./lssa_fg_thermal ${ext} 120 80 'yy' 300. ${ext} 1 1 0

COMMAND LINE SWITCHES AVAILABLE FOR NON-STANDARD MODES:

fprintf(stderr,"Usage: %s <options> uvfits_filename obs_id extension band \n",argv[0]);
fprintf(stderr,"\t -p period (seconds)\n");
fprintf(stderr,"\t -ew flag 14m EW baselines\n");
fprintf(stderr,"\t -c chanwidth (Hz)\n");
fprintf(stderr,"\t -n lowest frequency (Hz)\n");
fprintf(stderr,"\t -field fieldnum (0=EoR0, 1=EoR1, >2 anything)\n");
fprintf(stderr,"\t -u umax\n");
exit(1);

Running on OzStar


CHIPS can be launched via a Python wrapper, which generates and launches sbatch scripts for you, all with the correct dependencies. Here is a working example with explanations.

The very first thing you have to do is prepare a spot for outputs. CHIPS creates a large number of output files; to make them easier to track, it's suggested that each user creates their own output directory. CHIPS is hard-coded to require a few files in the output directory, so start by making sure they are there:

Setup output directory
##cd into where I want to store CHIPS outputs
$ cd /fred/oz048/jline/test_CHIPS_scripts

##make a directory to store the outputs, and soft link necessary files
$ mkdir -p CHIPS_outputs
$ ln -s /fred/oz048/MWA/CODE/CHIPS/TEMPLATE_CHIPS_OUT/*.dat CHIPS_outputs

Next, we need to understand the inputs into CHIPS. The primary inputs are RTS-style uvfits files, which are labelled like uvdump_01.uvfits, one for each coarse band (24 in total). Typically, these RTS outputs are stored in a generic data directory, /fred/oz048/MWA/data/, in a directory named after the observation ID, and often in a sub-directory. In our example, we'll use three different observation numbers, which are stored in a text file:

Observation list
$ more obs_list.txt
1093641624
1093641864
1093642232

We are aiming to process the .uvfits files that live here:

Data locations
/fred/oz048/MWA/data/1093641624/test_8dec/uvdump_*.uvfits
/fred/oz048/MWA/data/1093641864/test_8dec/uvdump_*.uvfits
/fred/oz048/MWA/data/1093642232/test_8dec/uvdump_*.uvfits
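Before generating any jobs it can be worth confirming that each observation really has its 24 coarse-band files. The helper below is a hypothetical sketch (count_uvfits is not part of CHIPS or the wrapper); it just counts uvdump_*.uvfits files using the directory layout shown above and does nothing if obs_list.txt is not in the current directory.

```shell
# Hypothetical helper (not part of CHIPS): count uvdump_*.uvfits files
# for one observation under a data directory and sub-directory.
count_uvfits() {
    c_data_dir=$1
    c_obs=$2
    c_sub_dir=$3
    n=0
    for f in "$c_data_dir/$c_obs/$c_sub_dir"/uvdump_*.uvfits; do
        # the unexpanded glob is skipped when no files match
        [ -e "$f" ] && n=$((n + 1))
    done
    echo "$n"
}

# Example: report observations that do not have all 24 coarse bands
data_dir=/fred/oz048/MWA/data
sub_dir=test_8dec
if [ -r obs_list.txt ]; then
    while read -r obs; do
        [ -n "$obs" ] || continue
        found=$(count_uvfits "$data_dir" "$obs" "$sub_dir")
        [ "$found" -eq 24 ] || echo "$obs: found $found uvfits files, expected 24"
    done < obs_list.txt
fi
```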

To process these files, we run the following commands (note this has the --no_run flag, meaning no jobs will be launched). /fred/oz048/MWA/data is the default data directory for OzStar, so we don't have to specify that, just the obs IDs and the sub-dir name. Also, note that copying and pasting this code won't work with the comments in place - you'll need to delete them first.

CHIPS script (without running)
##This is equivalent to a 'module load chips', so sets up all the paths and dependencies needed for CHIPS
source /fred/oz048/jline/software/chips/module_load_chips.sh

##This command searches for uvfits files and, if it finds them, sets up the correct sbatch scripts
run_CHIPS.py \
     --cluster=ozstar \ ##what cluster we are on
     --obs_list=/fred/oz048/jline/test_CHIPS_scripts/obs_list.txt \ ##search for these obs
     --uvfits_dir=test_8dec \ ##within the obs, look for this sub-dir
     --output_dir=/fred/oz048/jline/test_CHIPS_scripts/CHIPS_outputs/ \ ##dump outputs here
     --output_tag=test_ozstar_chips \ ##name the outputs with this tag
     --band=high \ ##this is a high-band observation. CHIPS expects low or high
     --obs_range=0,3 \ ##process all three observations listed in obs_list.txt
     --no_run ##do not launch sbatch jobs; just check arguments and generate scripts

which will generate the following scripts, located in a new directory called logs_test_ozstar_chips:

Slurm scripts
$ ls logs_test_ozstar_chips
run_clean_test_8dec_test_ozstar_chips.sh ##Cleans up intermediate data products
run_grid_1093641624_test_8dec_test_ozstar_chips.sh ##Grids the first observation
run_grid_1093641864_test_8dec_test_ozstar_chips.sh ##Grids the second observation 
run_grid_1093642232_test_8dec_test_ozstar_chips.sh ##Grids the third observation 
run_lssa_test_8dec_test_ozstar_chips_xx.sh ##Makes the XX power spectra
run_lssa_test_8dec_test_ozstar_chips_yy.sh ##Makes the YY power spectra 
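The inline comments in the annotated run_CHIPS.py command earlier stop it from being pasted directly, because nothing may follow a line-continuation backslash. If you want to keep per-option comments in a script, one possible pattern is to build the argument list one option per line. The sketch below uses only the options already shown on this page and prints the final command instead of running it.

```shell
# Build the argument list one option at a time so each line can keep its
# comment; "set --" resets the positional parameters of this shell.
set --
set -- "$@" --cluster=ozstar              # what cluster we are on
set -- "$@" --obs_list=/fred/oz048/jline/test_CHIPS_scripts/obs_list.txt   # search for these obs
set -- "$@" --uvfits_dir=test_8dec        # within the obs, look for this sub-dir
set -- "$@" --output_dir=/fred/oz048/jline/test_CHIPS_scripts/CHIPS_outputs/   # dump outputs here
set -- "$@" --output_tag=test_ozstar_chips   # name the outputs with this tag
set -- "$@" --band=high                   # CHIPS expects low or high
set -- "$@" --obs_range=0,3               # process all three observations
set -- "$@" --no_run                      # only generate the scripts

# Print the command that would be run; drop the "echo" to run it for real
echo run_CHIPS.py "$@"
```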

If there are any errors in the input arguments, you should get some form of error message telling you what to do next. If the run_CHIPS.py command runs fine, remove the --no_run flag to actually launch your jobs. I usually put this all in a bash script, so you'd do something like:

Launching sbatch commands
$ more run_CHIPS.sh
source /fred/oz048/jline/software/chips/module_load_chips.sh

run_CHIPS.py \
     --cluster=ozstar \
     --obs_list=/fred/oz048/jline/test_CHIPS_scripts/obs_list.txt \
     --uvfits_dir=test_8dec \
     --output_tag=test_ozstar_chips \
     --output_dir=/fred/oz048/jline/test_CHIPS_scripts/CHIPS_outputs/ \
     --band=high --obs_range=0,3

$ source run_CHIPS.sh
Command run: sbatch --parsable run_grid_1093641624_test_8dec_test_ozstar_chips.sh
Job ID: 27145363
Command run: sbatch --parsable --dependency=afterok:27145363 run_grid_1093641864_test_8dec_test_ozstar_chips.sh
Job ID: 27145364
Command run: sbatch --parsable --dependency=afterok:27145363:27145364 run_grid_1093642232_test_8dec_test_ozstar_chips.sh
Job ID: 27145365
Command run: sbatch --parsable --dependency=afterok:27145363:27145364:27145365 run_lssa_test_8dec_test_ozstar_chips_xx.sh
Job ID: 27145366
Command run: sbatch --parsable --dependency=afterok:27145363:27145364:27145365 run_lssa_test_8dec_test_ozstar_chips_yy.sh
Job ID: 27145367
Command run: sbatch --parsable --dependency=afterok:27145366:27145366 run_clean_test_8dec_test_ozstar_chips.sh

This shows that you have launched a number of jobs, each with dependencies. The gridding must happen observation by observation (each of which is an array job), and you can't make the power spectra until you have finished the gridding. Finally, the cleaning script runs after the power spectra step.
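The dependency chain in the output above is just the list of earlier job IDs joined with colons after afterok. How the wrapper builds it internally is not shown on this page, so the helper below (dependency_flag, a made-up name) is only a sketch of the idea.

```shell
# Hypothetical helper: join job IDs into an sbatch dependency option.
# With no IDs it prints nothing, so the first job gets no dependency.
dependency_flag() {
    [ "$#" -gt 0 ] || return 0
    flag="--dependency=afterok"
    for id in "$@"; do
        flag="${flag}:${id}"
    done
    echo "$flag"
}

# Example with the three gridding job IDs from the output above
dependency_flag 27145363 27145364 27145365
```

In a hand-written launcher, the job ID printed by sbatch --parsable would be appended to the list after each submission and the flag regenerated for the next job.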

Debugging actual CHIPS runs: all the log outputs and error messages for this run will be written into logs_test_ozstar_chips, so if your jobs fail, try checking the output logs and error messages located there. If everything runs fine, you can just delete the entire logs_test_ozstar_chips directory (this helps with the file-count quota).

Plotting the outputs

See (page to be written soon) for a full overview of the plotting commands available. A quick example of how to make a 2D cross-power spectrum using the outputs of the OzStar example is below. Note the output tag for the files is a combination of the --output_tag and --uvfits_dir options that were fed to run_CHIPS.py:

Simple 2D plot
source /fred/oz048/jline/software/chips/module_load_chips.sh

plotchips_all.py \
    --basedir=/fred/oz048/jline/test_CHIPS_scripts/CHIPS_outputs/ \
    --polarisation='yy' \
    --chips_tag=test_8dec_test_ozstar_chips \
    --min_power=1e3 --max_power=1e15

This will produce a plot called chips2D_yy_test_8dec_test_ozstar_chips_crosspower.png, which looks like this:

This is some random EoR1 data I found on OzStar so I have no idea of the quality or what peeling was done, but hey, it's a power spectrum!
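The tag and plot name used in this example follow a simple pattern: the --uvfits_dir value, an underscore, then the --output_tag value. The sketch below rebuilds both strings from the values used on this page. The pattern is inferred from this one example, so double check it against your own outputs.

```shell
uvfits_dir=test_8dec            # value passed to --uvfits_dir
output_tag=test_ozstar_chips    # value passed to --output_tag
pol=yy

# Tag expected by --chips_tag, and the resulting 2D cross-power plot name
chips_tag="${uvfits_dir}_${output_tag}"
plot_name="chips2D_${pol}_${chips_tag}_crosspower.png"

echo "--chips_tag=${chips_tag}"
echo "expected plot: ${plot_name}"
```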