VCS pulsar data processing

...

In this example, 2.4% of the files are not yet archived, so the user should wait until 100% of the files are archived. If the files have still not all been archived after more than 5 days, contact someone on the MWA operations team to ask what may have gone wrong.

To download an entire observation (observations are referred to by their observation IDs, in GPS seconds, as labelled on the MWA data archive), use:

Code Block
languagetext
vcs_download.nf --obsid <obs ID> --all

otherwise, if you only want some interval [begin, end] of the full observation, then use:

Code Block
languagetext
vcs_download.nf --obsid <obs ID> --begin <starting GPS second> --end <end GPS second>

To download VCS data, you must use the ASVO. Once you have created an account and logged in, you can use the ASVO job dashboard to submit a download job. Click on the "Voltage Download Job" tab, then enter the observation ID, the offset (seconds from the beginning of the observation) and the duration (in seconds). For example, here is how you would download the first 600 seconds of an observation.

[Image: example ASVO "Voltage Download Job" form]

You should then be able to see your download jobs on the ASVO job monitor. Once the job is complete, you can continue to the next step.

The data will be downloaded to /astro/mwavcs/asvo/<ASVO job ID>. You can look at the first and last files in this directory to confirm your observation's first and last GPS second. Recombining raw data, or untarring combined data, and moving it into the correct directory can be done with:

Code Block
languagetext
vcs_download.nf --obsid <obs ID> --begin <starting GPS second> --end <end GPS second> --download_dir /astro/mwavcs/asvo/<ASVO job ID>
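
The first and last GPS seconds can also be read off with a short shell function rather than by eye. The sketch below assumes that every data file in the download directory is named with the GPS second as its second underscore-separated field (for example <obs ID>_<GPS second>_<rest>). Both that naming pattern and the function name gps_range are assumptions made for illustration, so check them against your own files first.

```shell
# Print the first and last GPS second found in an ASVO download directory.
# ASSUMPTION: file names look like <obs ID>_<GPS second>_<rest>, so the GPS
# second is the second underscore-separated field. Adjust if yours differ.
gps_range() {
    local dir="$1"
    # Take field 2 of each file name, keep only 10-digit numbers,
    # sort numerically, then report the smallest and largest values.
    ls "$dir" \
        | cut -d '_' -f 2 \
        | grep -E '^[0-9]{10}$' \
        | sort -n \
        | awk 'NR == 1 { first = $1 } { last = $1 } END { if (NR) print first, last }'
}
```

Calling gps_range /astro/mwavcs/asvo/<ASVO job ID> would then print the two values to pass to --begin and --end. Files that do not match the assumed pattern are ignored.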


A note on beginning and end times: the observation ID does not correspond to the starting GPS second of the data. In order to determine the beginning/end times of the observation data, use:

Code Block
languagetext
mwa_metadb_utils.py <obs ID>

and this will help you determine the correct GPS times to pass to the -b/-e options. This applies to all jobs that have optional beginning and end times.

...

  • dedicated calibrator: usually an observation of a bright calibrator source taken directly before or after the target observation. These are stored on the MWA data archive as visibilities (they are run through the online correlator like normal MWA observations). You can find the observation ID for the calibrator by searching on the MWA data archive or by using the following command, which lists compatible calibrator IDs sorted by how close they are in time to the observation:

    Code Block
    languagetext
    mwa_metadb_utils.py -c <obs ID>

    If this command generates an error, it may be because there are no calibration observations with the same frequency channels. If so, try the Calibration Combining Method.

  • in-beam calibrator: using data from the target observation itself, correlating a section offline, and using those visibilities. See Offline correlation.


In order to download the calibration observation, set your MW ASVO API key as an environment variable. Below are some steps to do so:

...

  • If there are any tiles with max>3 on multiple channels, they are worth flagging.
  • If there are any tiles that have max=0, they are worth flagging (even if only one polarisation has max=0!), as they contribute no signal and can cause errors in beamforming.
  • It is best to flag only up to ~3 tiles at a time, as bad tiles can affect other potentially good tiles.
  • Sensitivity scales as ~sqrt(128-N), where N is the number of tiles flagged, so flagging more than 50 tiles will start to lower your sensitivity and the observation may be worth abandoning.
  • Make sure you put the number on the right of the key (between "flag" and "?") into the flagged_tiles.txt file.
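
To get a feel for how much sensitivity is lost as tiles are flagged, the ~sqrt(128-N) scaling above can be turned into a fraction of the full-array sensitivity. This is only a rough sketch of that scaling: the function name relative_sensitivity and the default of 128 tiles are illustrative assumptions, not part of the pipeline.

```shell
# Estimate the sensitivity remaining after flagging N tiles, relative to the
# full array, using the ~sqrt(128 - N) scaling quoted above.
# ASSUMPTION: the helper name and the 128-tile default are for illustration.
relative_sensitivity() {
    local n_flagged="$1"
    local n_tiles="${2:-128}"
    # sqrt((T - N) / T) is the ratio of sqrt(T - N) to sqrt(T).
    awk -v n="$n_flagged" -v t="$n_tiles" \
        'BEGIN { printf "%.3f\n", sqrt((t - n) / t) }'
}

relative_sensitivity 3    # a few flagged tiles cost very little
relative_sensitivity 50   # 50 flagged tiles is a substantial loss
```

With 50 tiles flagged, the estimate drops to roughly 78% of the full-array sensitivity, which is why heavily flagged observations may not be worth pursuing.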

The "attempt_number_N" subdirectory contains chan_x_output.txt and phase_x_output.txt files, which list all of the flags recommended by the calibration plotting script. These can be useful when deciding which tile(s) to flag next. The following bash command outputs, for each channel, the tile with the greatest gain value:

Code Block
languagebash
for i in $(ls chan*txt); do grep $(cat $i | cut -d '=' -f 3 | cut -d ' ' -f 1 | sort -n | tail -n 1 | head -n 1) $i; done

...

So for this example, you should flag tile 122.

It is also worth checking for tiles with abnormally low power. The following command lists, for each channel, the tile with the lowest gain value. Tiles with very small (near-zero) values should be flagged.

Code Block
languagebash
for i in $(ls chan*.txt); do grep $(cat $i | tail -n+3 | cut -d '=' -f 3 | cut -d ' ' -f 1 | sort -n | head -1) $i; done


If there is a single channel that is affecting your solution, you can flag it using the following method.

Make sure you have used the -X option when you ssh to enable an interactive terminal, then run:

Code Block
languagetext
plot_BPcal_128T.py -f BandpassCalibration_node<coarse channel number>.dat -c

This should create an interactive plot that looks like this

...

Code Block
languagetext
beamform.nf --obsid <obs ID> --calid <cal ID> --all (or --begin <begin GPS time> --end <end GPS time>) --pointings <"RA string">_<"DEC string">,<"RA string">_<"DEC string"> [--ipfb] [--summed]

where ["RA string"] is formatted as "hh:mm:ss.ss" and ["DEC string"] is formatted as "dd:mm:ss.ss" (including the sign: "+" or "-"). You can include multiple pointings by separating them with a comma (,). The beamformer will output full polarisation (Stokes I, Q, U & V) PSRFITS files unless you use the --summed flag. The summed option only outputs Stokes I, which means you can't create polarisation profiles, but it uses a quarter of the storage and for that reason is useful for large-scale searches.
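
When there are many pointings, it can help to build the --pointings argument from a list of RA/Dec pairs rather than typing it by hand. The following is a minimal sketch of the formatting rules described above; the helper name make_pointings is hypothetical and not part of the beamforming pipeline.

```shell
# Join RA/Dec pairs into the comma-separated "RA_DEC,RA_DEC" form.
# ASSUMPTION: make_pointings is an illustrative helper, not a pipeline tool.
make_pointings() {
    local ra_re='^[0-9]{2}:[0-9]{2}:[0-9]{2}\.[0-9]{2}$'
    local dec_re='^[+-][0-9]{2}:[0-9]{2}:[0-9]{2}\.[0-9]{2}$'
    local out=""
    local ra dec
    # Arguments are consumed two at a time: RA first, then Dec.
    while [ "$#" -ge 2 ]; do
        ra="$1"
        dec="$2"
        shift 2
        # Reject anything not in hh:mm:ss.ss / signed dd:mm:ss.ss form.
        if ! printf '%s\n' "$ra" | grep -Eq "$ra_re" ||
           ! printf '%s\n' "$dec" | grep -Eq "$dec_re"; then
            echo "Badly formatted pointing: $ra $dec" >&2
            return 1
        fi
        out="${out:+$out,}${ra}_${dec}"
    done
    echo "$out"
}
```

The output can then be passed straight to the beamformer, for example --pointings "$(make_pointings <RA 1> <DEC 1> <RA 2> <DEC 2>)". The check is deliberately strict, so a declination without an explicit sign is rejected.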

...

where [obs ID], [pointing] and [channel] are defined as above.

Pulsar Processing on Garrawarla

Pulsar software is difficult to install at the best of times, so the common packages are not currently natively installed on Garrawarla, but are provided via containerisation. There are two generic Singularity containers available to users that focus on two different aspects of pulsar science/processing.


Pulsar searching: the psr-search container includes the most common pulsar searching tools, namely PRESTO and riptide (FFA implementation). It can be accessed as shown below.

Code Block
/pawsey/mwa/singularity/psr-search/psr-search.sif <command>

Pulsar follow-up analysis: the psr-analysis container includes the common pulsar analysis tools, such as PSRCHIVE, DSPSR, and various pulsar timing packages. It can be accessed as shown below.

Code Block
/pawsey/mwa/singularity/psr-analysis/psr-analysis.sif <command>

Both of these images have been built to enable interactivity if required. To use this, modify how you run the container as follows:

Code Block
singularity run -B ~/.Xauthority <container> <command>


--------------------------

There are other containers also available, but from January 2023 they will likely not be maintained or updated. They are "use at your own risk".

For PRESTO commands use:

...

Code Block
languagetext
singularity run -B ~/.Xauthority /pawsey/mwa/singularity/pulseportraiture/pulseportraiture.sif <command_here>

--------------------------


The Observation Processing Pipeline

...

Image accurate as of commit e6215f42c1d7c0b5a255721bc46840335170e579 to mwa_search repo

  1. Input Data: The OPP requires the calibrated and beamformed products of the VCS. These data can be acquired using the method described here.
  2. Pulsar Search: Given an observation ID, each pulsar within the field is identified and handed to the Pulsar Processing Pipeline (PPP)
  3. Initial Fold(s): Performs a PRESTO fold on the data. For slow pulsars, this will probably be 100 bins. Fast pulsars will be 50 bins.
  4. Classification: The resulting folds are classified as either a detection, or non-detection.
  5. Best Pointing: For the MWA's extended array configuration, there may be multiple pointings for a single pulsar. Should this be the case, we want to find the brightest detection to use for the rest of the pipeline. The "best" detection will be decided on and its pointing will be the only one used going forward.
  6. Post Folds:  A series of high-bin folds will be done. This is in order to find the highest time resolution fold we can do while still getting a detection.
  7. Upload products to database: Uploads the initial fold and best fold to the pulsar database.
  8. IQUV Folding: Uses DSPSR to fold on Stokes IQUV, making a time-scrunched archive. This archive is immediately converted back to PSRFITS format for use with PSRSALSA.
  9. RM Synthesis: Runs RM synthesis on the archive. If successful, will apply this RM correction.
  10. RVM Fitting: Attempts to fit the Rotating Vector Model to the profile. If successful, will upload products to the database.

...

Code Block
languagetext
nswainston@garrawarla-1:~> ssh garrawarla-2


Resuming Nextflow Pipelines

One large benefit of Nextflow pipelines is that you can resume them. Once you have fixed the bug that caused the pipeline to crash, simply relaunch the pipeline with the -resume option added. For the resume option to work, you must run the command from the same directory, and the work directory must not have been deleted.
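
Both conditions can be checked before relaunching. Nextflow keeps its run cache in a hidden .nextflow directory in the directory the pipeline was launched from, so its presence is a reasonable sign that you are in the right place. The sketch below checks for that cache and, optionally, for the work directory; the function name can_resume is a hypothetical helper for illustration only.

```shell
# Check that a pipeline can plausibly be resumed before adding -resume.
# ASSUMPTION: can_resume is an illustrative helper, not part of the pipelines.
#   $1: launch directory (defaults to the current directory)
#   $2: optional work directory to check as well
can_resume() {
    local launch_dir="${1:-.}"
    local work_dir="${2:-}"
    # Nextflow stores its cache in .nextflow inside the launch directory.
    if [ ! -d "$launch_dir/.nextflow" ]; then
        echo "No .nextflow cache in $launch_dir: run from the original launch directory" >&2
        return 1
    fi
    # Cached task results live in the work directory, so it must still exist.
    if [ -n "$work_dir" ] && [ ! -d "$work_dir" ]; then
        echo "Work directory $work_dir is missing: cached results cannot be reused" >&2
        return 1
    fi
    echo "OK to relaunch with -resume"
}
```

For example, can_resume . /astro/mwavcs/$USER/<obsid>_work followed by your original command with -resume appended. Passing the check does not guarantee that every step will be cached, only that the prerequisites described above are met.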

Cleaning up the work directories

Once the pipeline is done and you are confident you don't need to resume the pipeline or need the intermediate files then it is a good idea to remove the Nextflow work directories to save space. By default, the work directories are stored in
/astro/mwavcs/$USER/<obsid>_work


Calibration Combining Method

...

The name of a calibrator observation consists of the name of the calibrator source, an underscore and the centre frequency channel ID. Try to find a pair of calibration observations of the same calibrator source that, together, cover the entire frequency range of the target observation. For the above example, these were 1195317056 and 1195316936. If you can't find any suitable calibration observations, you can keep increasing the time search window up to 48 hours.

Now that you know which calibration observations you need, download and calibrate them as you normally would, as explained in the calibration section. It is best to use the same values in flagged_tiles.txt and flagged_channels.txt for all calibration observations to ensure your calibration solutions are consistent. Once the calibration is complete, you can combine the two calibrations into one using the script

...

This will output the combined calibration solution to /astro/mwavcs/vcs/[obs ID]/cal/<first calibration ID>_<second calibration ID>/rts, and you can treat <first calibration ID>_<second calibration ID> as the calibrator ID when using it in other scripts.

Deprecated Methods

These are old methods that are not maintained but may be useful if you need to do something specific or the new scripts have failed.

Download (old python method)

...