Backup and Restore Using pgBackRest (Beta)

Available from version 2022.2

To create an Alation backup, you can now use the pgBackRest tool instead of the default backup tool, pg_probackup. Using pgBackRest should noticeably improve backup performance.

The pgBackRest tool is available with Backup V2 only. Currently, pgBackRest does not support the built-in scheduled backups, so backups must be scheduled with a cron job instead.

To use pgBackRest, first enable it on your Alation instance.

Enable pgBackRest

To enable pgBackRest:

  1. Use SSH to connect to your Alation instance.

  2. Enter the Alation shell.

    sudo /etc/init.d/alation shell
    
  3. Run the following command:

    alation_conf alation.feature_flags.enable_pgbackrest -s True
    
  4. Enabling pgBackRest changes the WAL archive directory from pg_probackup (default) to pgbackrest. This change needs to be propagated down to the Postgres configuration on the Alation server. For this to happen, run the enable_backupv2 action after enabling pgBackRest:

    alation_action enable_backupv2
    

    No restart is required.

  5. Because pgBackRest does not support Celery-based scheduled backups, you need to disable them to avoid a daily backup failure. You can schedule pgBackRest backups using crontab instead; see Schedule pgBackRest Backups.

    Run the following command to disable daily backup schedule:

    alation_conf snapshot.enabled -s False
    

    Important

    If you later decide to return to the default backup tool, you’ll need to re-enable the daily backup schedule and disable the corresponding cron job.

  6. If you are going to continue the backup configuration at this point, stay in the Alation shell. If not, exit the Alation shell: exit.
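The enable sequence above can be collected into one small script. The following is a minimal sketch, not an Alation-provided tool: the `run` helper and the `DRY_RUN` variable are illustrative assumptions, and only the three commands themselves come from the steps above. With `DRY_RUN` left at its default of 1, the function prints the commands instead of running them.

```shell
#!/bin/sh
# Sketch only: intended for use inside the Alation shell.
# DRY_RUN and run() are hypothetical helpers added for illustration.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Steps 3 to 5 above, stopping at the first command that fails.
enable_pgbackrest() {
    run alation_conf alation.feature_flags.enable_pgbackrest -s True &&
    run alation_action enable_backupv2 &&
    run alation_conf snapshot.enabled -s False
}
```

Call `enable_pgbackrest` once to review the commands, then set `DRY_RUN=0` and call it again to apply them.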

Except for backup scheduling, pgBackRest supports most of the backup configuration options available for Backup V2. With pgBackRest enabled, you can use incremental backups. For more information, see Configure Backup V2.

After enabling pgBackRest, decide if you want to exclude the Postgres backup from the compressed Alation backup. Excluding the Postgres backup from the Alation backup can noticeably decrease the backup creation time and is recommended for large implementations of the Catalog. However, in this case the Postgres backup must be maintained separately from the Alation backup files. Maintaining separate Postgres backups also affects the restore process as the Postgres backup needs to be provided in addition to the Alation backup.

Exclude Postgres Backup from Alation Backup

When using pgBackRest, you can exclude the Postgres backup from the Alation backup and maintain Postgres backups separately.

Whether or not the Postgres backup is excluded from the Alation backup is controlled by the alation_conf parameter alation.backup_v2.pgbackup_compression.

By default, the Postgres backup is included in the Alation backup and compressed together with the backup of the Alation application into the Alation .tar.gz file. The default state is reflected by the following setting of the parameter alation.backup_v2.pgbackup_compression:

alation.backup_v2.pgbackup_compression = True

To exclude the Postgres backup from the Alation backup, you need to modify the value of alation.backup_v2.pgbackup_compression and set it to False.

To exclude the Postgres backup from the Alation backup:

  1. Use SSH to connect to your Alation instance.

  2. Enter the Alation shell:

    sudo /etc/init.d/alation shell
    
  3. Disable the Postgres compression:

    alation_conf alation.backup_v2.pgbackup_compression -s False
    
  4. Next, you can designate a directory on the Alation host to store the uncompressed Postgres backups. By default, the Postgres backups will be created in the directory data2/pgbackrest. You can create a different directory in the Alation chroot for the Postgres backups. Use the following command to designate a custom pgBackRest directory:

    alation_conf pgsql.pgbackrest.repo_dir_name -s path/to/directory
    
  5. Exit the Alation shell.

    exit
    

With this pgBackRest configuration, the next backup process will create three backups:

  • In the backup directory data2/backup (default) or in the directory that you designated for backups using the parameter alation.backup.data_dump_dir, Alation will create the Alation backup and the Event Bus backup:

    • <timestamp_version>_alation_backup.tar.gz - A compressed Alation backup that does not include the Postgres backup.

    • <timestamp_version>_alation_eb_backup.tar.gz - A compressed Event Bus backup.

  • Separately, in the directory data2/pgbackrest or in the directory that you designated for Postgres backups using the parameter pgsql.pgbackrest.repo_dir_name, Alation will create a separate Postgres backup. Note that the Postgres backup is not one file but a set of files and directories.

Note

For more information on how to create backups, see Create a Manual Backup.

Manage the Postgres Backups

Use the recommendations in this section to maintain Postgres backups when they are excluded from the Alation backup file.

Set the Number of Backups to Retain

You can configure the number of backups you want to store in the designated Postgres backup directory.

Use the alation_conf parameter alation.backup.data_dump_versions to configure the number of backups you want to retain. For example, when this parameter is set to 5 (default), Alation will keep the five latest Alation backup files, the five corresponding Event Bus files, and the five corresponding Postgres backups. If incremental backups are enabled, Alation will keep the five latest full backups, the corresponding incremental backups, and the corresponding Postgres incremental backups. Older backups will be automatically removed.

To set the number of backups to retain:

From the Alation shell, run the following command:

alation_conf alation.backup.data_dump_versions -s <number>

Example

alation_conf alation.backup.data_dump_versions -s 6

No restart is required.
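Before lowering the retention value, it can help to see which backups would fall outside the new window. The following is a sketch of the rule described above, not Alation's own cleanup code. The function name `backups_beyond_retention` is hypothetical, and the sketch assumes that file names begin with a sortable timestamp, as in the examples in this document. Feed it one kind of backup file at a time, for example only the `_alation_backup.tar.gz` names.

```shell
# Reads backup file names on stdin, one per line, and prints the names that
# are older than the N most recent ones. Names are compared as plain strings,
# which orders them chronologically when they start with a timestamp.
backups_beyond_retention() {
    keep="$1"
    LC_ALL=C sort -r | tail -n "+$((keep + 1))"
}
```

For example, `ls /data2/backup | grep '_alation_backup.tar.gz$' | backups_beyond_retention 3` would list the files older than the three latest, assuming the default backup directory.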

Configure External Storage on AWS S3

You can use AWS S3 as external storage for Postgres backups. If AWS S3 is configured as external storage, every time you run the backup action or when a backup is created on schedule, Alation will copy the Postgres backup to the designated S3 bucket. Note that the backups are copied to S3, not moved. They will remain available in the backup folder of the Alation host, too. S3 serves as additional remote storage.

When restoring Alation, you can specify the S3 bucket as the source of the Postgres backup.

Note

The existing configuration option for moving Alation backups to external storage alation.backup.post_script.path does not cover separate Postgres backups and only relocates the Alation backup and the Event Bus backup.

Perform this configuration after enabling pgBackRest with Postgres compression disabled.

Prerequisites

  1. You will need to specify information about your AWS S3 bucket in this configuration. Make sure you have these values at hand:

    • S3 bucket name

    • Your AWS S3 region

    • AWS API key

    • AWS API secret

  2. Your Alation instance must be able to establish a connection to AWS S3. Make sure your AWS S3 environment allows connections from the Alation IP address.

Configuration

To configure the external Postgres backup storage on AWS S3:

  1. From the Alation shell, set the following parameters:

    alation_conf pgsql.pgbackrest.redundancy.enabled -s True
    alation_conf pgsql.pgbackrest.redundancy.type -s s3
    alation_conf pgsql.pgbackrest.redundancy.s3.endpoint -s s3.amazonaws.com
    alation_conf pgsql.pgbackrest.redundancy.s3.bucket -s <your_AWS_S3_bucket>
    alation_conf pgsql.pgbackrest.redundancy.s3.region -s <AWS_region_for_S3_bucket>
    alation_conf pgsql.pgbackrest.redundancy.s3.key -s <AWS_API_key>
    alation_conf pgsql.pgbackrest.redundancy.s3.secret -s <AWS_API_secret>
    

    Note

    Alation stores the AWS API key and secret in encrypted format.

  2. No restart is required. When you run the backup action next time, Alation will copy the Postgres backup to the S3 bucket as per this configuration.

    Important

    When you restore Alation from the backup on a new instance, repeat the same AWS S3 configuration that was used on the old instance before you run the restore command. See Restore Alation with Postgres Backup on S3.
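Because the same seven parameters have to be set again on any instance you restore to, it may be convenient to keep them in a reusable function. The following is a minimal sketch under stated assumptions: the variable names `S3_BUCKET`, `S3_REGION`, `AWS_KEY`, `AWS_SECRET`, and `DRY_RUN`, and the helper `set_conf`, are hypothetical. Only the alation_conf parameter names and the endpoint value come from the configuration step above. In dry-run mode (the default) the commands are printed with the key and secret masked.

```shell
#!/bin/sh
# Sketch only: intended for use inside the Alation shell.
DRY_RUN="${DRY_RUN:-1}"

# $1 = parameter, $2 = value, $3 = "secret" to mask the value when printing.
set_conf() {
    if [ "$DRY_RUN" = "1" ]; then
        if [ "${3:-}" = "secret" ]; then
            printf 'alation_conf %s -s %s\n' "$1" '********'
        else
            printf 'alation_conf %s -s %s\n' "$1" "$2"
        fi
    else
        alation_conf "$1" -s "$2"
    fi
}

configure_s3_redundancy() {
    # Fail early if any required value is missing.
    : "${S3_BUCKET:?set S3_BUCKET}" "${S3_REGION:?set S3_REGION}"
    : "${AWS_KEY:?set AWS_KEY}" "${AWS_SECRET:?set AWS_SECRET}"
    p=pgsql.pgbackrest.redundancy
    set_conf "$p.enabled" True &&
    set_conf "$p.type" s3 &&
    set_conf "$p.s3.endpoint" s3.amazonaws.com &&
    set_conf "$p.s3.bucket" "$S3_BUCKET" &&
    set_conf "$p.s3.region" "$S3_REGION" &&
    set_conf "$p.s3.key" "$AWS_KEY" secret &&
    set_conf "$p.s3.secret" "$AWS_SECRET" secret
}
```

Reading the key and secret from variables rather than typing them on the command line is a design choice of this sketch; how you populate those variables is up to your environment.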

Restore Alation with Postgres Backup on S3

Available from version 2022.2

The steps in this section apply if you are using the pgBackRest tool to take Alation backups. They require a separate Postgres backup stored externally on AWS S3.

On the Alation instance where you are going to restore:

  1. Copy the Alation and Event Bus backup files to the /data2/backup folder. The backup files must be accessible from inside the Alation chroot. Do not copy the pgbackrest directory from the old instance.

    Important

    Make sure to use two backup files taken by the same backup process. They will have the same timestamp and version, for example:

    • Alation backup file: 202109232027_10-0-0-147420_alation_backup.tar.gz

    • Event Bus backup file: 202109232027_10-0-0-147420_alation_eb_backup.tar.gz

  2. Enter the Alation shell:

    sudo /etc/init.d/alation shell
    
  3. Using the alation_conf command, change the alation.backup.restore_file parameter value to reflect the path to the Alation backup file. Substitute <timestamp_version>_alation_backup.tar.gz with your real file name.

    alation_conf alation.backup.restore_file -s /data2/restore/<timestamp_version>_alation_backup.tar.gz
    
  4. Using the alation_conf command, change the alation.backup.eb_restore_file parameter value to reflect the path to the Event Bus backup file. Substitute <timestamp_version>_alation_eb_backup.tar.gz with your real file name.

    alation_conf alation.backup.eb_restore_file -s /data2/restore/<timestamp_version>_alation_eb_backup.tar.gz
    
  5. Specify the information about the external storage on S3. This information is required to restore Postgres. The configuration must match the one used on the old instance whose backup you are restoring.

    Use these commands to provide the information about the backup storage on S3:

    alation_conf pgsql.pgbackrest.redundancy.enabled -s True
    alation_conf pgsql.pgbackrest.redundancy.type -s s3
    alation_conf pgsql.pgbackrest.redundancy.s3.endpoint -s s3.amazonaws.com
    alation_conf pgsql.pgbackrest.redundancy.s3.bucket -s <your_AWS_S3_bucket>
    alation_conf pgsql.pgbackrest.redundancy.s3.region -s <AWS_region_for_S3_bucket>
    alation_conf pgsql.pgbackrest.redundancy.s3.key -s <AWS_API_key>
    alation_conf pgsql.pgbackrest.redundancy.s3.secret -s <AWS_API_secret>
    
  6. Run the restore command to perform the restore. Note that this command overwrites all existing data on the instance if any exists. It will download the Postgres data from the S3 bucket you have provided in alation_conf.

    alation_action destructive_restore_all
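Since the restore overwrites existing data, it can be worth confirming that the two backup files really come from the same backup process before you run it. The following sketch compares the `<timestamp_version>` prefixes of the two file names. The function names `backup_prefix` and `same_backup_run` are hypothetical; the file-name suffixes are the ones shown in step 1.

```shell
# Strip the directory and the known suffix to obtain <timestamp_version>.
backup_prefix() {
    name="${1##*/}"
    name="${name%_alation_eb_backup.tar.gz}"
    name="${name%_alation_backup.tar.gz}"
    printf '%s\n' "$name"
}

# Succeeds only if both files carry the same <timestamp_version> prefix.
same_backup_run() {
    [ "$(backup_prefix "$1")" = "$(backup_prefix "$2")" ]
}
```

A possible use is `same_backup_run "$alation_file" "$eb_file" || echo "backup files do not match"`, where both variables hold the paths you set in steps 3 and 4.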
    

Restore Alation with a Separate Postgres Backup

The steps in this section apply:

  • If you are using the pgBackRest tool to take Alation backups

  • If Postgres is excluded from the Alation backup file and you have a separate backup of the Postgres database.

Note

The Alation backup process creates a separate Postgres backup when you have disabled Postgres backup compression on your Alation instance. See Exclude Postgres Backup from Alation Backup for more details.

To restore an Alation instance when you have a separate Postgres backup:

  1. Copy the Alation and Event Bus backup files to the /data2/backup folder on the destination Alation instance.

    Important

    Make sure to use two backup files taken by the same backup process. They will have the same timestamp and version, for example:

    • Alation backup file: 202109232027_10-0-0-147420_alation_backup.tar.gz

    • Event Bus backup file: 202109232027_10-0-0-147420_alation_eb_backup.tar.gz

  2. Recursively copy the Postgres backup directory from the source instance to the destination instance. The location of the Postgres backup directory on the source instance is stored in the alation_conf parameter pgsql.pgbackrest.repo_dir_name.

    For example, if you are using secure copying, use the following guidelines:

    2.1. On the destination instance, outside of the Alation chroot, create a directory:

    sudo mkdir -p /backup/restore
    

    2.2. Change ownership to the SSH user you are going to use when copying the Postgres backup to the destination instance.

    sudo chown <SSH_user>:<SSH_user> /backup/restore
    

    2.3. On the source instance, enter the Alation shell.

    sudo /etc/init.d/alation shell
    

    2.4. Secure copy the Postgres backup to the destination instance.

    pgbackrest_dir=$(alation_conf pgsql.pgbackrest.repo_dir_name --value-only)
    
    scp -r -i <destination instance private key> "${pgbackrest_dir}" <SSH_user>@<destination instance IP>:/backup/restore
    

    2.5. Check if the pgbackrest directory is created.

    ls -al /data2/restore/pgbackrest
    
  3. Using the alation_conf command, change the alation.backup.restore_file parameter value to reflect the path to the Alation backup file. Substitute <timestamp_version>_alation_backup.tar.gz with your real file name.

    alation_conf alation.backup.restore_file -s /data2/restore/<timestamp_version>_alation_backup.tar.gz
    
  4. Using the alation_conf command, change the alation.backup.eb_restore_file parameter value to reflect the path to the Event Bus backup file. Substitute <timestamp_version>_alation_eb_backup.tar.gz with your real file name.

    alation_conf alation.backup.eb_restore_file -s /data2/restore/<timestamp_version>_alation_eb_backup.tar.gz
    
  5. Using the alation_conf command, change the pgsql.pgbackrest.repo_dir_name parameter value to reflect the path to the pgbackrest directory, for example:

    alation_conf pgsql.pgbackrest.repo_dir_name -s /data2/restore/pgbackrest
    
  6. Run the restore command to perform the restore. Note that this command overwrites all existing data on the instance if any exists:

    alation_action destructive_restore_all
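Before running the destructive restore, you may want to confirm that all three inputs are in place. The following is a hypothetical pre-flight check, not part of Alation: the function name `preflight_restore` and its argument order are assumptions made for this sketch. It only tests that the two backup files are readable and that the Postgres backup directory exists.

```shell
# Usage: preflight_restore <alation_backup_file> <eb_backup_file> <pgbackrest_dir>
# Reports every missing input on stderr and returns non-zero if any is missing.
preflight_restore() {
    alation_file="$1"
    eb_file="$2"
    pg_dir="$3"
    status=0
    [ -r "$alation_file" ] || { echo "missing Alation backup: $alation_file" >&2; status=1; }
    [ -r "$eb_file" ] || { echo "missing Event Bus backup: $eb_file" >&2; status=1; }
    [ -d "$pg_dir" ] || { echo "missing Postgres backup directory: $pg_dir" >&2; status=1; }
    return "$status"
}
```

Inside the Alation shell, the check could be chained with the restore so that the restore runs only when the check passes: `preflight_restore <alation file> <Event Bus file> <pgbackrest directory> && alation_action destructive_restore_all`.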
    

Schedule pgBackRest Backups

Scheduled pgBackRest backups are based on UTC. The current backup scheduling capabilities in alation_conf are based on Celery task scheduling, which uses PST. This results in a failure of scheduled backups when pgBackRest is enabled.

In order to schedule backups when using pgBackRest, turn off Celery-based daily backups and set up the corresponding system Cron job.

To schedule backups when using pgBackRest:

  1. On the Alation host, enter the Alation shell:

    sudo /etc/init.d/alation shell
    
  2. Make sure that the Celery-based backup schedule is turned off.

    • To check if Celery-based schedule is turned off:

      alation_conf snapshot.enabled
      

      If this command returns False, it has already been turned off and you can proceed to the next step. If this command returns True, change the value to False:

      alation_conf snapshot.enabled -s False
      
  3. Exit the shell.

    exit
    
  4. Outside of chroot, add the following crontab entry. It sets the backup job to run daily at 4 AM.

    sudo crontab -l | { cat; echo '0 4 * * * source /opt/alation/alation/opt/alation/ops/alation_constants; sudo chroot "${ALATION_CHROOT}" /bin/su - ${ALATION_USER} -c "alation_action backup_all"'; } | sudo crontab -
    
  5. Check the new setting.

    sudo crontab -l
    

    The output should be:

    * * * * * /opt/alation/alation/opt/alation/ops/actions/unjailed_update_auto_alation > /dev/null 2>&1
    
    0 4 * * * source /opt/alation/alation/opt/alation/ops/alation_constants; sudo chroot "${ALATION_CHROOT}" /bin/su - ${ALATION_USER} -c "alation_action backup_all"
    

You can generate your preferred crontab using https://crontab.guru and configure a custom crontab. Use the following example to change to a new schedule:

sudo crontab -e # replace "0 4 * * *" with "0 6 * * *"
sudo crontab -l

The output should be similar to:

* * * * * /opt/alation/alation/opt/alation/ops/actions/unjailed_update_auto_alation > /dev/null 2>&1

0 6 * * * source /opt/alation/alation/opt/alation/ops/alation_constants; sudo chroot "${ALATION_CHROOT}" /bin/su - ${ALATION_USER} -c "alation_action backup_all"
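Appending an entry with a pipeline adds a new line each time it is run, so repeating it can leave duplicate backup jobs in the crontab. The following sketch builds the entry for a given minute and hour and replaces any earlier backup entry instead of adding a second one. The function names `backup_cron_line` and `merge_backup_cron` are hypothetical; the crontab command text is the one shown in the steps above.

```shell
# Prints the crontab line for a daily backup. $1 = minute, $2 = hour.
backup_cron_line() {
    printf '%s %s * * * %s\n' "$1" "$2" \
        'source /opt/alation/alation/opt/alation/ops/alation_constants; sudo chroot "${ALATION_CHROOT}" /bin/su - ${ALATION_USER} -c "alation_action backup_all"'
}

# Reads an existing crontab on stdin, drops any previous backup_all entry,
# keeps every other line, and appends the new entry.
merge_backup_cron() {
    grep -v 'alation_action backup_all' || true
    backup_cron_line "$1" "$2"
}
```

Outside of the chroot, a schedule change to 6 AM could then be applied with `sudo crontab -l | merge_backup_cron 0 6 | sudo crontab -` and verified with `sudo crontab -l`.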

Location of pgBackRest Logs

Enabling and using pgBackRest does not change the location of the backup logs. They remain in /var/lib/pgsql/13/ inside the Alation chroot.