Cloudify High Availability Cluster Upgrade Guide

Overview

These instructions explain how to upgrade a Cloudify High Availability (HA) cluster from version 4.x to version 4.3.

Upgrade on new hosts

This is the recommended method. If something goes wrong during the upgrade, the old managers are still in place and functioning.

The key elements of upgrading a Cloudify HA cluster on new hosts are:

  1. Create and download a snapshot.
  2. Save the agent SSH keys.
  3. Install the new version for the master manager on a new host.
  4. Install the new version for the standby managers on new hosts.
  5. Restore the latest snapshot.
  6. Reinstall the agents.
  7. Start the cluster on the master.
  8. Join the standby nodes to the HA cluster.

In-place upgrade

An in-place upgrade of a Cloudify HA cluster entails tearing down the existing managers and installing the new version of Cloudify Manager on the same hosts. You can then restore the data from your existing instance to the new instance.

The key elements of in-place upgrading a Cloudify HA cluster are:

  1. Create and download a snapshot.
  2. Save the /etc/cloudify/ssl folder of the cluster's master manager.
  3. Save the agent SSH keys.
  4. Remove the standby nodes from the cluster.
  5. Tear down the managers.
  6. Clean the managers after teardown.
  7. Install the new version on the master manager's host (in-place installation).
  8. Install the new version on the standby managers' hosts (in-place installation).
  9. Start the HA cluster on the master manager.
  10. Restore the latest snapshot.
  11. Join the standby nodes to the HA cluster.

Upgrade Cloudify HA cluster

There are two methods to upgrade a Cloudify HA cluster to version 4.3.

Upgrade on new hosts

This is the recommended method. If something goes wrong during the upgrade, the old managers are still in place and functioning.

The following steps walk you through an upgrade to new hosts:

  1. Create snapshot on old Cloudify HA cluster and download it:

    cfy snapshots create my_snapshot  # --include-metrics #(optional)
    cfy snapshots download my_snapshot -o {{ /path/to/the/snapshot/file }}
    
  2. Save the SSH keys from the /etc/cloudify folder:

    cp -r /etc/cloudify/.ssh <backup_dir>
    
  3. Install the new Cloudify HA cluster managers on new hosts (see the Cloudify HA build guide, Chapter 3).
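
    As a rough sketch only (the Cloudify HA build guide is the authoritative reference), a 4.3 manager is installed from the cloudify-manager-install RPM; the RPM file name below is a placeholder for the package you actually download:

      sudo yum install -y cloudify-manager-install-4.3.x.rpm   # placeholder file name
      # review /etc/cloudify/config.yaml (admin password, private/public IPs), then:
      sudo cfy_manager install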

  4. Upload and restore the snapshot to the new master manager:

    cfy snapshots upload {{ /path/to/the/snapshot/file }} --snapshot-id <snapshot_name>
    cfy snapshots restore <snapshot_name>
    
  5. Reinstall agents:

    cfy agents install --all-tenants
    
  6. Start the cluster on the master manager (see the sketch after this list).

  7. Join the replicas to the new Cloudify HA cluster (see the sketch after this list).

  8. Delete the old cluster's hosts.
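
The following commands are a minimal sketch of steps 6 and 7, based on the cluster commands used in the in-place procedure later in this guide; <Leader IP>, <Replica1 IP>, <Replica2 IP>, the node names and the admin password are placeholders for your own values:

    cfy profiles use <Leader IP> -t default_tenant -u admin -p <admin password>
    cfy cluster start --cluster-node-name <Leader name>
    cfy profiles use <Replica1 IP> -t default_tenant -u admin -p <admin password>
    cfy cluster join --cluster-node-name <Replica1 name> <Leader IP>
    cfy profiles use <Replica2 IP> -t default_tenant -u admin -p <admin password>
    cfy cluster join --cluster-node-name <Replica2 name> <Leader IP>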

In-place upgrade

This method upgrades the Cloudify HA cluster on the same hosts. You run the risk of not being able to roll back if anything goes wrong. In addition, in-place upgrades only work if the IP addresses, AMQP credentials, and certificates are left unchanged; otherwise, you will not be able to communicate with the existing agents.

  1. Create a snapshot and download it

    cfy snapshots create my_snapshot # --include-metrics #(optional)
    cfy snapshots download my_snapshot -o {{ /path/to/the/snapshot/file }}
    
  2. Save the SSL certificates and SSH key from the /etc/cloudify folder

    cp -r /etc/cloudify/ssl <backup_dir>
    cp -r /etc/cloudify/.ssh <backup_dir>
    
  3. Save the RabbitMQ credentials. The credentials can be found in the following places:

    • /etc/cloudify/config.yaml
    • /opt/mgmtworker/work/broker_config.json
    • /opt/manager/cloudify-rest.conf
    • /etc/cloudify/cluster

      Default credentials:

      Username: cloudify
      Password: c10udify
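
    One way to keep them for later (a minimal sketch, assuming the <backup_dir> used in step 2) is to copy the files that contain the broker credentials next to the saved certificates:

      # keep copies of the files listed above that hold the RabbitMQ credentials
      sudo cp /opt/mgmtworker/work/broker_config.json <backup_dir>/
      sudo cp /opt/manager/cloudify-rest.conf <backup_dir>/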
      
  4. Tear down the Cloudify managers. Repeat the following steps on each manager:

    The method differs depending on the installed version:

    • 4.0 - 4.2:

      curl -o ~/delete_cluster_4_0_1.py https://raw.githubusercontent.com/cloudify-cosmo/cloudify-dev/master/scripts/delete_cluster_4_0_1.py
      
      sudo python ~/delete_cluster_4_0_1.py
      
      curl -o ~/cfy_teardown_4_0_0.sh https://raw.githubusercontent.com/cloudify-cosmo/cloudify-dev/master/scripts/cfy_teardown_4_0_0.sh
              
      sudo bash cfy_teardown_4_0_0.sh -f
      
    • 4.3.0 - 4.3.1:

      sudo cfy_manager remove -f
      sudo yum remove cloudify-manager-install
      
      curl -o ~/delete_cluster_4_0_1.py https://raw.githubusercontent.com/cloudify-cosmo/cloudify-dev/master/scripts/delete_cluster_4_0_1.py
      
      sudo python ~/delete_cluster_4_0_1.py
      
      curl -o ~/cfy_teardown_4_0_0.sh https://raw.githubusercontent.com/cloudify-cosmo/cloudify-dev/master/scripts/cfy_teardown_4_0_0.sh
      
      sudo bash cfy_teardown_4_0_0.sh -f
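
    After the teardown finishes, a quick check such as the following (a sketch using standard system tools) confirms that no Cloudify services or packages are left on the host:

      sudo systemctl list-units --all | grep -iE 'cloudify|rabbitmq|postgres' || echo "no services left"
      rpm -qa | grep -i cloudify || echo "no packages left"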
      
  5. Remove the CLI profiles of the deleted hosts.

    rm -rf ~/.cloudify/profiles/{{ Manager's IP address }}
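
    If you are not sure which profiles are stored locally, you can check first (a sketch; cfy profiles list only reads the local profile store):

      cfy profiles list
      ls ~/.cloudify/profiles/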
    
  6. Reboot hosts.

  7. (Optional) Fix failed services.

    sudo systemctl daemon-reload
    sudo systemctl reset-failed
    
  8. Install the new managers on the same hosts (see the Cloudify HA build guide, Chapter 3).

  9. Put the RabbitMQ credentials and the paths to the certificate files from the old cluster into /etc/cloudify/config.yaml before running cfy_manager install:

    rabbitmq:
      username: <username>  # must be the username saved from the old Cloudify HA cluster
      password: <password>  # must be the password saved from the old Cloudify HA cluster
    ssl_inputs:
      external_cert_path: <backup_dir>/ssl/cloudify_external_cert.pem
      external_key_path: <backup_dir>/ssl/cloudify_external_key.pem
      internal_cert_path: <backup_dir>/ssl/cloudify_internal_cert.pem
      internal_key_path: <backup_dir>/ssl/cloudify_internal_key.pem
      ca_cert_path: <backup_dir>/ssl/cloudify_internal_ca_cert.pem
      ca_key_path: <backup_dir>/ssl/cloudify_internal_ca_key.pem
      ca_key_password: ''
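
    With these values in place on each host (the certificate paths above assume the files saved to <backup_dir> in step 2), run the installer, for example:

      sudo cfy_manager install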
    
  10. Create the cluster (more information in the Cloudify HA Build Guide).

    cfy profiles use <Leader IP> -t default_tenant -u admin -p <admin password>
    cfy profiles use <Replica1 IP> -t default_tenant -u admin -p <admin password>
    cfy profiles use <Replica2 IP> -t default_tenant -u admin -p <admin password>
    cfy profiles use <Leader IP>
    cfy cluster start --cluster-node-name <Leader name>
    
  11. Restore the snapshot.

    cfy snapshots upload {{ /path/to/the/snapshot/file }} --snapshot-id <snapshot_name>
    cfy snapshots restore <snapshot_name>
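
    Snapshot restore runs as a system workflow; as an optional check (a sketch), you can follow its progress with:

      cfy executions list --include-system-workflows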
    
  12. Join the replicas to the cluster.

    cfy profiles use <Replica1 IP>
    cfy cluster join --cluster-node-name <Replica1 name> <Leader IP>
    cfy profiles use <Replica2 IP>
    cfy cluster join --cluster-node-name <Replica2 name> <Leader IP>
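
As a final check (a sketch), confirm that all three managers are members of the cluster:

    cfy profiles use <Leader IP>
    cfy cluster nodes list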