...
Detailed information on how to upload to Acacia is given on the Pawsey help/support pages.
Setting up Acacia
Documentation on how to use Acacia can be found at https://support.pawsey.org.au/documentation/display/US/Acacia+-+Common+Usage.
...
Setting up
Creating keys
...
- and log in using your Pawsey username and password
...
Once you are in the Acacia tab, Storage Name lists all the storage allocations assigned to your user, for example USERNAME, mwavcs, etc.
- Click on the storage name where you want to upload the data.
- Click the View Keys button. This shows the existing keys for the user.
- As a first-time user, click the Create New Key button.
- Go to the ACACIA tab
...
- On doing so, it will show:
- Click View Keys:
- You will need to create a key for each "Storage" which you would like to use. Select the desired Storage Name from the dropdown, click Create New Key, and click Yes when prompted to confirm.
- You will be shown an Access ID and a Secret Key, which you will need in order to set up Acacia on Garrawarla, etc.
NOTE: Make sure that you copy the Access ID and the Secret Key before closing the pop-up window.
- Once you have the ID and Key, log in to Garrawarla/Magnus, etc. Then type:
mc -h
mc is the MinIO Client for cloud storage and filesystems. The -h option lists all of mc's commands.
- The next step is to set up an alias for the storage:
mc alias set <ALIAS_NAME> https://acacia.pawsey.org.au <ACCESS_ID> <SECRET_KEY>
The alias name that you set will be used from now on to upload and access data on Acacia.
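As a concrete sketch (the alias name "mwavcs" and the credentials here are placeholder values, not real ones), it can be worth wrapping the command in the same history guard used for bucket creation below, so the Secret Key does not end up in your shell history:
set +o history
mc alias set mwavcs https://acacia.pawsey.org.au EXAMPLEACCESSID EXAMPLESECRETKEY
set -o history
mc alias list    # confirm the new alias appears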
Uploading the data
- The next step is to create a bucket, where all your data files will be uploaded. A bucket can be thought of as the equivalent of a directory in a regular filesystem. It can be created as follows:
set +o history
mc mb <ALIAS_NAME>/<BUCKET_NAME>
set -o history
This will create a bucket with the name you have given. Once the bucket is created, we can start uploading our data there.
- To upload the data:
mc cp <PATH_TO_FILE> <ALIAS_NAME>/<BUCKET_NAME>/<NAME_OF_FILE>
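For example (the alias, bucket, and file names below are hypothetical):
mc cp ./1255444104.fits mwavcs/obs-1255444104/1255444104.fits
mc cp --recursive ./pointings/ mwavcs/obs-1255444104/pointings/    # whole directories need --recursive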
Other essentials
- To download an object from Acacia:
mc cp <ALIAS_NAME>/<BUCKET_NAME>/<NAME_OF_FILE> <PATH_TO_DOWNLOAD_DIRECTORY>
- To list the buckets in your account using MC, use the mc ls command:
...
- you will need later. The Access ID is easily obtained on the portal website, but the Secret Key will not be shown again. COPY THE ACCESS ID AND SECRET KEY AND KEEP THEM SOMEWHERE SAFE.
- If you lose the secret key, the easiest recovery method is to simply delete the key and create a new one.
Pawsey environment
The next step is to set up your environment on Pawsey so that you can access the Acacia "S3" system from Garrawarla, etc. The following clients are available for doing so:
- MinIO Client ("mc"): note that as of March 2024, the MinIO client is no longer supported or available on Pawsey systems.
- rclone
- AWS CLI
The setup for each client is different, as detailed in the following subsections. It has been reported that rclone is the more robust choice for very large file transfers, and that mc does not seem to be able to handle them.
rclone
Create the following file (rclone's default configuration file location is $HOME/.config/rclone/rclone.conf):
[<ALIAS_NAME>]
type = s3
provider = Other
access_key_id = <ACCESS_ID>
secret_access_key = <SECRET_KEY>
endpoint = https://projects.pawsey.org.au
acl = public-read-write
bucket_acl = public-read-write
- ALIAS_NAME can be anything, but a sensible choice is the Storage Name (e.g. mwasci)
- ACCESS_ID is the Access ID of the key created to access the given storage
- SECRET_KEY is the Secret Key given to you at the time you created the key.
To use rclone commands, you will need to load the rclone module:
module load rclone
You are now set up to use rclone to move data to/from your Acacia storage.
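The commands below are a minimal sketch of day-to-day rclone usage, assuming the configuration above; the bucket and file names are hypothetical. rclone addresses Acacia as <ALIAS_NAME>:<BUCKET_NAME>/<PATH>:
rclone mkdir <ALIAS_NAME>:my-bucket                          # create a bucket
rclone copy ./myfile.fits <ALIAS_NAME>:my-bucket/            # upload a single file
rclone copy ./results/ <ALIAS_NAME>:my-bucket/results/       # upload a directory (recursive by default)
rclone ls <ALIAS_NAME>:my-bucket                             # list objects in the bucket
rclone copy <ALIAS_NAME>:my-bucket/myfile.fits ./            # download an object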
AWS Client
Configuration for aws
user@setonix:~> mkdir -p $HOME/.aws
user@setonix:~> tee -a $HOME/.aws/credentials <<EOF
[<profilename>]
aws_access_key_id=<ACCESS_ID>
aws_secret_access_key=<SECRET_KEY>
EOF
user@setonix:~> tee -a $HOME/.aws/config <<EOF
[profile <profilename>]
output=json
EOF
To use aws client commands, you will need to load the aws client module:
module load awscli/1.16.308 #Garrawarla
module load aws-cli/2.13.0 #Setonix
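After loading the module, a quick sanity check is to confirm that the client runs and that your profile and keys can reach Acacia (this uses the same ls command described below; <PROFILE_NAME> is the profile you configured above):
> aws --version
> aws --endpoint-url=https://projects.pawsey.org.au --profile=<PROFILE_NAME> s3 ls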
To create a new bucket using AWS S3 CLI, use the S3 mb command:
> aws --endpoint-url=https://projects.pawsey.org.au --profile=<PROFILE_NAME> s3 mb s3://<BUCKET_NAME>
Where:
- <PROFILE_NAME> is the name you gave to the account credentials when configuring AWS S3 CLI.
- <BUCKET_NAME> is the name you want to give your bucket, subject to naming requirements.
To delete a bucket using AWS S3 CLI, use the S3 rb command:
> aws --endpoint-url=https://projects.pawsey.org.au --profile=<PROFILE_NAME> s3 rb s3://<BUCKET_NAME>
Where:
- <PROFILE_NAME> is the name you gave the storage space when configuring AWS S3 CLI.
- <BUCKET_NAME> is the name of the bucket you want to remove.
To list the buckets in your account using AWS S3 CLI, use the S3 ls command:
> aws --endpoint-url=https://projects.pawsey.org.au --profile=<PROFILE_NAME> s3 ls
Where:
- <PROFILE_NAME> is the name you gave to the account credentials when configuring AWS S3 CLI.
To list the objects in a bucket:
> aws --endpoint-url=https://projects.pawsey.org.au --profile=<PROFILE_NAME> s3 ls s3://<BUCKET_NAME>
Where:
- <BUCKET_NAME> is the name of the bucket you want to list the objects within.
To list objects in a pseudo folder:
> aws --endpoint-url=https://projects.pawsey.org.au --profile=<PROFILE_NAME> s3 ls s3://<BUCKET_NAME>/<PREFIX>
Where:
- <PREFIX> is the name of the pseudo folder you want to list the objects within.
Essential links:
- To create buckets: ...
- To upload and download data: https://support.pawsey.org.au/documentation/pages/viewpage.action?pageId=104302741
- To list buckets and data in buckets: ...
To upload an object using AWS S3 CLI, use the S3 cp command:
> aws --endpoint-url=https://projects.pawsey.org.au --profile=<PROFILE_NAME> s3 cp <SOURCE> s3://<bucket>/<TARGET>
Where:
- <SOURCE> is the filesystem path and name of the file you want to upload.
- <TARGET> is the key of the object within the bucket, i.e. pseudo folder (optional) and object name. You can specify any object name to "rename" the file on upload.
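Note that pseudo folders do not need to be created in advance; they are simply prefixes in the object key and appear once an object is uploaded under them. A hypothetical example (the bucket, prefix, and file name are placeholders):
> aws --endpoint-url=https://projects.pawsey.org.au --profile=<PROFILE_NAME> s3 cp ./1255444104.fits s3://my-bucket/obs-1255444104/1255444104.fits
> aws --endpoint-url=https://projects.pawsey.org.au --profile=<PROFILE_NAME> s3 ls s3://my-bucket/obs-1255444104/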
To download an object from Acacia:
> aws --endpoint-url=https://projects.pawsey.org.au --profile=<PROFILE_NAME> s3 cp s3://<bucket>/<SOURCE> <TARGET>
Where:
- <SOURCE> is the key of the object on Acacia you want to download, i.e. pseudo folder (optional) and object name within the bucket.
- <TARGET> is the filesystem path where you want the object to be downloaded to.
To delete an object using AWS S3 CLI, use the rm command:
> aws --endpoint-url=https://acacia.pawsey.org.au --profile=<PROFILE_NAME> s3 rm s3://<BUCKET_NAME>/<OBJECT_NAME>
Where:
- <PROFILE_NAME> is the name you gave to the account credentials when configuring AWS S3 CLI.
- <BUCKET_NAME> is the name of the bucket containing the object.
- <OBJECT_NAME> is the name of the object to remove.
The following commands remove all objects in a bucket, or all objects under a given prefix, respectively:
> aws --endpoint-url=https://acacia.pawsey.org.au --profile=<PROFILE_NAME> s3 rm s3://<BUCKET_NAME> --recursive
> aws --endpoint-url=https://acacia.pawsey.org.au --profile=<PROFILE_NAME> s3 rm s3://<BUCKET_NAME>/<PREFIX> --recursive
Where:
- <PROFILE_NAME> is the name you gave to the account credentials when configuring AWS S3 CLI.
- <BUCKET_NAME> is the name of the bucket whose objects you want to remove.
- <PREFIX> is the pseudo folder whose objects you want to remove.
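Because the recursive form deletes everything under the bucket or prefix, it can be worth previewing the deletion first with the --dryrun flag supported by the aws s3 commands (the bucket and prefix here are placeholders):
> aws --endpoint-url=https://acacia.pawsey.org.au --profile=<PROFILE_NAME> s3 rm s3://my-bucket/obs-1255444104 --recursive --dryrun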
To share an object using AWS S3 CLI, use the presign command:
> aws --endpoint-url=https://projects.pawsey.org.au --profile=<PROFILE_NAME> s3 presign s3://<BUCKET_NAME>/<OBJECT_NAME>
Where:
- <PROFILE_NAME> is the name you gave to the account credentials when configuring AWS S3 CLI.
- <BUCKET_NAME> is the name of the bucket containing the object.
- <OBJECT_NAME> is the name of the object to share.
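The presigned URL expires after 3600 seconds by default; the --expires-in option sets a different lifetime in seconds. For example (the bucket and object names are placeholders):
> aws --endpoint-url=https://projects.pawsey.org.au --profile=<PROFILE_NAME> s3 presign s3://my-bucket/1255444104.fits --expires-in 86400    # link valid for 24 hours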
Example workflows
Some example workflows are given at https://support.pawsey.org.au/documentation/display/US/ListSupercomputing+buckets+and+objects#project+example. These examples are designed for Setonix, but can be used almost exactly as-is from hpc-data. The only change that appears necessary is the name of the partition, which is "copy" on Setonix but "copyq" elsewhere:
#SBATCH --partition=copyq
Note that these example scripts may require access to /scratch, which is accessible from hpc-data, but not from garrawarla or galaxy.
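As a rough sketch of what such a data-staging job might look like when submitted from hpc-data using rclone (the project name, alias, bucket, and paths below are hypothetical placeholders):
#!/bin/bash -l
#SBATCH --job-name=acacia-upload
#SBATCH --partition=copyq
#SBATCH --account=<PROJECT>
#SBATCH --ntasks=1
#SBATCH --time=06:00:00

module load rclone

# upload a directory of results from /scratch to an Acacia bucket
rclone copy /scratch/<PROJECT>/<USER>/results/ <ALIAS_NAME>:my-bucket/results/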
...