How to Mirror SILO Datasets

You can maintain your own local copy of SILO datasets using the methods described below.

If you wish to mirror SILO datasets, please read the usage information about data mirroring on our Frequently asked questions page.


Station datasets

To mirror SILO's patched point datasets (point datasets at station locations):

  1. Contact SILO to obtain access

    SILO provides daily incremental updates which enable an exact copy of the patched datasets to be reconstructed. The system consists of:

    • a base dataset (updated when SILO undergoes a major update)
    • a monthly update (contains all changes since the base dataset was constructed)
    • a daily update (contains all changes since the monthly update was constructed).

    The datasets are stored on the LongPaddock FTP server. Please contact us to arrange access.

  2. Download the SILO software

    The patched datasets can be reconstructed from the incremental updates in any way the user chooses. For example, you may wish to reconstruct datasets containing only maximum temperature for stations in Victoria and discard all other stations and variables.

    SILO provides a software package which demonstrates one method for reconstructing the patched datasets. The package contains instructions on how to install and operate the software. Please note SILO provides this software in good faith and is not responsible for its use or misuse.
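
The layering described in step 1 (base, then monthly update, then daily update) can be illustrated with a minimal sketch. It assumes, purely for illustration, that each layer is a comma-separated text file whose first two fields (station number and date) identify a record. The real file formats are described in the SILO software package and will differ. The function name merge_layers and the file names in the usage line are hypothetical. Records in a later layer replace matching records in an earlier layer, so the files are passed in the order base, monthly, daily.

```shell
# Hypothetical sketch: merge update layers so that later layers win.
# Assumes lines of the form station,date,value (NOT the real SILO format).
merge_layers() {
    # Usage: merge_layers base.csv monthly.csv daily.csv
    awk -F',' '
        { key = $1 "," $2; latest[key] = $0 }   # later files overwrite earlier records
        END { for (key in latest) print latest[key] }
    ' "$@" | sort -t',' -k1,1 -k2,2             # order by station, then date
}
```

Because the monthly update contains all changes since the base dataset, and the daily update contains all changes since the monthly update, only the most recent file of each kind needs to be supplied.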

Note: SILO does not provide a facility for mirroring point datasets at grid cell locations because it would overload the system (there are approximately 290,000 grid cell locations). If you require temporal datasets at grid cell locations, please download our gridded datasets and extract the relevant data.

Gridded datasets

To mirror SILO's gridded datasets you can either:

  • Use the Amazon Web Services Command Line Interface (CLI):

    1. Install the AWS CLI
    2. Use the CLI sync command to mirror the data.

      For example, to mirror the monthly rainfall rasters into your local target folder:
      aws s3 sync s3://silo-open-data/annual/monthly_rain target --exact-timestamps

    Notes:

    • the first time you run the sync command it will download the entire dataset
    • you need to re-run the sync command every time you wish to update your local copy (sync will only download files that have changed)
    • the --exact-timestamps option is required; otherwise sync will not download files that have been updated but still have the same file size.

or
  • Manually download new and/or updated rasters:

    1. Download the entire set of rasters for the variable(s) that you wish to mirror.

      A list of files available for download can be obtained via URL:

      https://s3-ap-southeast-2.amazonaws.com/silo-open-data/annual/index.html
      Individual files can be downloaded using the methods described on our gridded data page. For example, the monthly rainfall rasters for 1989 can be downloaded using curl as follows (the -O option saves the file under its remote name instead of writing it to the terminal):
      curl -O 'https://s3-ap-southeast-2.amazonaws.com/silo-open-data/annual/monthly_rain/1989.monthly_rain.nc'
      Note: this step only needs to be done once.
    2. Each time you wish to update your local copy:

      Use the file listing:

      https://s3-ap-southeast-2.amazonaws.com/silo-open-data/annual/index.html
      to identify any new or updated files (see the file creation date), and then manually download the relevant file(s).
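
The initial download in step 1 of the manual method can be scripted rather than repeated by hand. The following is a minimal sketch, not an official SILO tool. The function names silo_url and download_years are hypothetical, and the sketch assumes every annual file follows the <year>.<variable>.nc naming pattern seen in the 1989 monthly rainfall example.

```shell
BASE_URL='https://s3-ap-southeast-2.amazonaws.com/silo-open-data/annual'

# Build the URL of one annual file, assuming the <year>.<variable>.nc pattern.
# Usage: silo_url VARIABLE YEAR
silo_url() {
    printf '%s/%s/%s.%s.nc\n' "$BASE_URL" "$1" "$2" "$1"
}

# Download every year from FIRST to LAST (inclusive) into the current folder.
# Usage: download_years VARIABLE FIRST LAST
download_years() {
    variable=$1; first=$2; last=$3
    year=$first
    while [ "$year" -le "$last" ]; do
        # -f: stop on HTTP errors; -O: save under the remote file name
        curl -f -O "$(silo_url "$variable" "$year")" || return 1
        year=$((year + 1))
    done
}
```

For example, download_years monthly_rain 1989 1991 would request the three annual monthly rainfall files for 1989 to 1991.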

Please note that SILO data are constantly evolving, so you will need to determine how often you wish to update your local copy of the data. SILO data typically change due to:

  • Nightly updates: each night SILO ingests new data which have been collected recently. This typically only impacts the most recent datasets (rainfall datasets for the preceding 12 months and other variables for the preceding 3-6 months).
  • Bulk updates: SILO periodically regenerates the entire dataset to incorporate new features or to take advantage of data improvements. This typically impacts the entire time period spanned by the affected variable(s).
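
Because the data change nightly, users of the AWS CLI method typically re-run the sync command on a schedule. A minimal sketch of a wrapper follows. The function names and the SILO_TARGET variable with its default path are hypothetical, and monthly_rain is the only variable name confirmed on this page. The command is also available in printable form so it can be checked before anything is transferred.

```shell
# Hypothetical local folder for the mirror; override via the environment.
SILO_TARGET=${SILO_TARGET:-./silo-mirror}

# Print the sync command for one variable without running it.
silo_sync_cmd() {
    printf 'aws s3 sync s3://silo-open-data/annual/%s %s/%s --exact-timestamps\n' \
        "$1" "$SILO_TARGET" "$1"
}

# Run the sync for each variable named on the command line.
silo_sync() {
    for variable in "$@"; do
        aws s3 sync "s3://silo-open-data/annual/$variable" \
            "$SILO_TARGET/$variable" --exact-timestamps || return 1
    done
}
```

A scheduled job could then call silo_sync monthly_rain. Adding the AWS CLI --dryrun option to the sync command lists the files that would be transferred without downloading them.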

You may also wish to consider your network bandwidth and transfer costs when determining how often you update your local copy. The rasters are packed into annual files, each being around 410 MB in size for daily variables and around 14 MB for monthly rainfall.
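
As a rough worked example of the transfer volume for an initial mirror, the approximate file sizes quoted above can be multiplied by the number of annual files. The function name estimate_mb is hypothetical, and the year range is an arbitrary example input, not a statement about the period SILO covers.

```shell
# Rough size of an initial mirror: number of annual files times size per file.
# Usage: estimate_mb FIRST_YEAR LAST_YEAR MB_PER_FILE
estimate_mb() {
    first_year=$1; last_year=$2; mb_per_file=$3
    echo $(( (last_year - first_year + 1) * mb_per_file ))
}

estimate_mb 2000 2019 410   # 20 annual files of one daily variable
estimate_mb 2000 2019 14    # 20 annual files of monthly rainfall
```

With these inputs the estimates are about 8200 MB for one daily variable and about 280 MB for monthly rainfall. Subsequent syncs transfer only the files that have changed.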

Last updated: 9 July 2019