Rollback Plugin

Cloudify Utilities: Rollback Workflow

Description

Adds rollback support for node instances left in an unresolved state by a failure in the install workflow. It also introduces wrapper workflows for Cloudify lifecycle operations.

Prerequisites

Deprecation Warning

In Cloudify 5.2 (a future version), the Rollback Workflow will be deprecated in the Utilities plugin and replaced by a built-in workflow.

Supported workflows

Rollback workflow

The rollback workflow inspects the state of each node instance and decides whether that state is unresolved. For each unresolved instance, it executes the corresponding lifecycle operation to bring the instance back to a resolved state.

Unresolved node instance states are creating, configuring, and starting.

After rollback, node instances in the creating or configuring state become uninitialized. Node instances in the starting state become configured.
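The transitions described above can be sketched as a small lookup. This is a minimal illustration, not the plugin's implementation: the names ROLLBACK_TARGET, ROLLBACK_OPERATION, and plan_rollback are hypothetical. The operations for starting (stop) and creating (delete) match the log in the example below; using delete for configuring is an assumption.

```python
# Hypothetical sketch; these names are illustrative, not the plugin's API.

# Resolved state that each unresolved state is rolled back to.
ROLLBACK_TARGET = {
    "creating": "uninitialized",
    "configuring": "uninitialized",
    "starting": "configured",
}

# Lifecycle operation executed to get there. "starting" -> stop and
# "creating" -> delete match the example log; "configuring" -> delete
# is an assumption.
ROLLBACK_OPERATION = {
    "creating": "cloudify.interfaces.lifecycle.delete",
    "configuring": "cloudify.interfaces.lifecycle.delete",
    "starting": "cloudify.interfaces.lifecycle.stop",
}


def plan_rollback(instances):
    """Return (instance_id, operation, target_state) for unresolved instances.

    `instances` is an iterable of (instance_id, state) pairs. Instances in
    a resolved state (for example "started") are left out of the plan.
    """
    plan = []
    for instance_id, state in instances:
        if state in ROLLBACK_TARGET:
            plan.append(
                (instance_id, ROLLBACK_OPERATION[state], ROLLBACK_TARGET[state])
            )
    return plan
```

Calling plan_rollback with one instance in the started state and one in the starting state yields a single entry, for the instance in the starting state.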

Parameters:

Notes:

Example

This example demonstrates the rollback of two node instances.

Install the example blueprint

Upload the blueprint, create a deployment, and run the install workflow in one step:

[root@9fbb5f2b0d4b offcial_examples]# cfy install rollback-to-configured-and-uninitialized.yaml  -b rollback-example -d rollback-example
Uploading blueprint rollback-to-configured-and-uninitialized.yaml...
 rollback-to-confi... |################################################| 100.0%
Blueprint uploaded. The blueprint's id is rollback-example
Creating new deployment from blueprint rollback-example...
Deployment created. The deployment's id is rollback-example
Executing workflow `install` on deployment `rollback-example` [timeout=900 seconds]
Deployment environment creation is pending...
2021-01-13 07:59:05.353  CFY <rollback-example> Starting 'create_deployment_environment' workflow execution
2021-01-13 07:59:05.354  LOG <rollback-example> INFO: Creating deployment work directory
2021-01-13 07:59:05.389  CFY <rollback-example> 'create_deployment_environment' workflow execution succeeded
2021-01-13 07:59:08.956  CFY <rollback-example> Starting 'install' workflow execution
2021-01-13 07:59:09.103  CFY <rollback-example> [node_three_j9y2wc] Validating node instance before creation: nothing to do
2021-01-13 07:59:09.104  CFY <rollback-example> [node_three_j9y2wc] Precreating node instance: nothing to do
2021-01-13 07:59:09.105  CFY <rollback-example> [node_three_j9y2wc] Creating node instance: nothing to do
2021-01-13 07:59:09.106  CFY <rollback-example> [node_three_j9y2wc] Configuring node instance: nothing to do
2021-01-13 07:59:09.107  CFY <rollback-example> [node_three_j9y2wc] Starting node instance
2021-01-13 07:59:09.375  CFY <rollback-example> [node_three_j9y2wc.start] Sending task 'script_runner.tasks.run'
2021-01-13 07:59:09.927  LOG <rollback-example> [node_three_j9y2wc.start] INFO: Downloaded resources/install.py to /tmp/M5HA0/install.py
2021-01-13 07:59:09.927  LOG <rollback-example> [node_three_j9y2wc.start] INFO: log without fail during install
2021-01-13 07:59:10.193  CFY <rollback-example> [node_three_j9y2wc.start] Task succeeded 'script_runner.tasks.run'
2021-01-13 07:59:10.194  CFY <rollback-example> [node_three_j9y2wc] Poststarting node instance: nothing to do
2021-01-13 07:59:10.195  CFY <rollback-example> [node_three_j9y2wc] Node instance started
2021-01-13 07:59:10.362  CFY <rollback-example> [node_one_ivbz80] Validating node instance before creation: nothing to do
2021-01-13 07:59:10.363  CFY <rollback-example> [node_one_ivbz80] Precreating node instance: nothing to do
2021-01-13 07:59:10.363  CFY <rollback-example> [node_one_ivbz80] Creating node instance: nothing to do
2021-01-13 07:59:10.365  CFY <rollback-example> [node_one_ivbz80] Configuring node instance: nothing to do
2021-01-13 07:59:10.367  CFY <rollback-example> [node_one_ivbz80] Starting node instance
2021-01-13 07:59:10.410  CFY <rollback-example> [node_two_wrbed2] Validating node instance before creation: nothing to do
2021-01-13 07:59:10.412  CFY <rollback-example> [node_two_wrbed2] Precreating node instance: nothing to do
2021-01-13 07:59:10.419  CFY <rollback-example> [node_two_wrbed2] Creating node instance
2021-01-13 07:59:10.653  CFY <rollback-example> [node_one_ivbz80.start] Sending task 'script_runner.tasks.run'
2021-01-13 07:59:10.963  CFY <rollback-example> [node_two_wrbed2.create] Sending task 'script_runner.tasks.run'
2021-01-13 07:59:11.241  LOG <rollback-example> [node_one_ivbz80.start] INFO: Downloaded resources/install_fail.py to /tmp/UURZS/install_fail.py
2021-01-13 07:59:11.241  LOG <rollback-example> [node_one_ivbz80.start] INFO: log and fail during install!
2021-01-13 07:59:11.440  CFY <rollback-example> [node_one_ivbz80.start] Task failed 'script_runner.tasks.run'
Traceback (most recent call last):
  File "/opt/mgmtworker/env/lib64/python3.6/site-packages/cloudify/dispatch.py", line 793, in main
    payload = handler.handle()
  File "/opt/mgmtworker/env/lib64/python3.6/site-packages/cloudify/dispatch.py", line 456, in handle
    result = self._run_operation_func(ctx, kwargs)
  File "/opt/mgmtworker/env/lib64/python3.6/site-packages/cloudify/dispatch.py", line 509, in _run_operation_func
    return self.func(*self.args, **kwargs)
  File "/opt/mgmtworker/env/lib64/python3.6/site-packages/script_runner/tasks.py", line 80, in run
    script_result = process_execution(script_func, script_path, ctx, process)
  File "/opt/mgmtworker/env/lib64/python3.6/site-packages/script_runner/tasks.py", line 156, in process_execution
    script_func(script_path, ctx, process)
  File "/opt/mgmtworker/env/lib64/python3.6/site-packages/script_runner/tasks.py", line 323, in eval_script
    exec(compile(open(script_path).read(), script_path, 'exec'), eval_globals)
  File "/tmp/UURZS/install_fail.py", line 4, in <module>
    raise Exception
Exception

2021-01-13 07:59:11.498  LOG <rollback-example> [node_two_wrbed2.create] INFO: Downloaded resources/install_fail.py to /tmp/L0VUJ/install_fail.py
2021-01-13 07:59:11.499  LOG <rollback-example> [node_two_wrbed2.create] INFO: log and fail during install!
2021-01-13 07:59:11.753  CFY <rollback-example> [node_two_wrbed2.create] Task failed 'script_runner.tasks.run'
Traceback (most recent call last):
  File "/opt/mgmtworker/env/lib64/python3.6/site-packages/cloudify/dispatch.py", line 793, in main
    payload = handler.handle()
  File "/opt/mgmtworker/env/lib64/python3.6/site-packages/cloudify/dispatch.py", line 456, in handle
    result = self._run_operation_func(ctx, kwargs)
  File "/opt/mgmtworker/env/lib64/python3.6/site-packages/cloudify/dispatch.py", line 509, in _run_operation_func
    return self.func(*self.args, **kwargs)
  File "/opt/mgmtworker/env/lib64/python3.6/site-packages/script_runner/tasks.py", line 80, in run
    script_result = process_execution(script_func, script_path, ctx, process)
  File "/opt/mgmtworker/env/lib64/python3.6/site-packages/script_runner/tasks.py", line 156, in process_execution
    script_func(script_path, ctx, process)
  File "/opt/mgmtworker/env/lib64/python3.6/site-packages/script_runner/tasks.py", line 323, in eval_script
    exec(compile(open(script_path).read(), script_path, 'exec'), eval_globals)
  File "/tmp/L0VUJ/install_fail.py", line 4, in <module>
    raise Exception
Exception

2021-01-13 07:59:26.673  CFY <rollback-example> [node_one_ivbz80.start] Sending task 'script_runner.tasks.run' [retry 1/60]

Cancel the install workflow (or wait until it fails).

Check node instances states:

[root@9fbb5f2b0d4b offcial_examples]# cfy node-instances list
Listing all instances...

Node-instances:
+-------------------+------------------+---------+------------+----------+------------+----------------+------------+
|         id        |  deployment_id   | host_id |  node_id   |  state   | visibility |  tenant_name   | created_by |
+-------------------+------------------+---------+------------+----------+------------+----------------+------------+
|  node_one_ivbz80  | rollback-example |         |  node_one  | starting |   tenant   | default_tenant |   admin    |
| node_three_j9y2wc | rollback-example |         | node_three | started  |   tenant   | default_tenant |   admin    |
|  node_two_wrbed2  | rollback-example |         |  node_two  | creating |   tenant   | default_tenant |   admin    |
+-------------------+------------------+---------+------------+----------+------------+----------------+------------+

Showing 3 of 3 node-instances

Note that the state of node_one_ivbz80 is starting and the state of node_two_wrbed2 is creating.
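When there are many node instances, the same check can be scripted. The following is a rough sketch that assumes the table layout shown above; parse_cfy_table and unresolved_instances are hypothetical helper names, and the sample table is abbreviated to three columns.

```python
# Hypothetical helpers; they only assume the ASCII table layout shown above.
UNRESOLVED_STATES = {"creating", "configuring", "starting"}


def parse_cfy_table(text):
    """Parse an ASCII table as printed by the CLI into a list of dicts."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        # Keep only data rows; skip borders (+---+) and surrounding messages.
        if not (line.startswith("|") and line.endswith("|")):
            continue
        rows.append([cell.strip() for cell in line[1:-1].split("|")])
    if not rows:
        return []
    header, body = rows[0], rows[1:]
    return [dict(zip(header, row)) for row in body]


def unresolved_instances(text):
    """Map instance id to state for every instance in an unresolved state."""
    return {
        row["id"]: row["state"]
        for row in parse_cfy_table(text)
        if row["state"] in UNRESOLVED_STATES
    }


# Abbreviated copy of the listing above (id, node_id, and state columns only).
SAMPLE = """
+-------------------+------------+----------+
|         id        |  node_id   |  state   |
+-------------------+------------+----------+
|  node_one_ivbz80  |  node_one  | starting |
| node_three_j9y2wc | node_three | started  |
|  node_two_wrbed2  |  node_two  | creating |
+-------------------+------------+----------+
"""

print(unresolved_instances(SAMPLE))
# {'node_one_ivbz80': 'starting', 'node_two_wrbed2': 'creating'}
```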

Run rollback workflow

[root@9fbb5f2b0d4b offcial_examples]# cfy executions start rollback -d rollback-example
Executing workflow `rollback` on deployment `rollback-example` [timeout=900 seconds]
2021-01-13 08:02:25.044  CFY <rollback-example> Starting 'rollback' workflow execution
2021-01-13 08:02:25.142  CFY <rollback-example> [node_two_wrbed2] Validating node instance after deletion: nothing to do
2021-01-13 08:02:25.172  CFY <rollback-example> [node_one_ivbz80] Stopping node instance
2021-01-13 08:02:25.173  CFY <rollback-example> [node_two_wrbed2] Rollback Stop: nothing to do, instance state is creating
2021-01-13 08:02:25.176  CFY <rollback-example> [node_two_wrbed2] Deleting node instance
2021-01-13 08:02:25.425  CFY <rollback-example> [node_two_wrbed2.delete] Sending task 'script_runner.tasks.run'
2021-01-13 08:02:25.478  CFY <rollback-example> [node_one_ivbz80] Validating node instance after deletion: nothing to do
2021-01-13 08:02:25.678  CFY <rollback-example> [node_one_ivbz80.stop] Sending task 'script_runner.tasks.run'
2021-01-13 08:02:26.023  LOG <rollback-example> [node_two_wrbed2.delete] INFO: Downloaded resources/uninstall_fail.py to /tmp/HCURA/uninstall_fail.py
2021-01-13 08:02:26.024  LOG <rollback-example> [node_two_wrbed2.delete] INFO: log and fail during uninstall!
2021-01-13 08:02:26.313  CFY <rollback-example> [node_two_wrbed2.delete] Task failed 'script_runner.tasks.run'
Traceback (most recent call last):
  File "/opt/mgmtworker/env/lib64/python3.6/site-packages/cloudify/dispatch.py", line 793, in main
    payload = handler.handle()
  File "/opt/mgmtworker/env/lib64/python3.6/site-packages/cloudify/dispatch.py", line 456, in handle
    result = self._run_operation_func(ctx, kwargs)
  File "/opt/mgmtworker/env/lib64/python3.6/site-packages/cloudify/dispatch.py", line 509, in _run_operation_func
    return self.func(*self.args, **kwargs)
  File "/opt/mgmtworker/env/lib64/python3.6/site-packages/script_runner/tasks.py", line 80, in run
    script_result = process_execution(script_func, script_path, ctx, process)
  File "/opt/mgmtworker/env/lib64/python3.6/site-packages/script_runner/tasks.py", line 156, in process_execution
    script_func(script_path, ctx, process)
  File "/opt/mgmtworker/env/lib64/python3.6/site-packages/script_runner/tasks.py", line 323, in eval_script
    exec(compile(open(script_path).read(), script_path, 'exec'), eval_globals)
  File "/tmp/HCURA/uninstall_fail.py", line 4, in <module>
    raise Exception
Exception

2021-01-13 08:02:26.314  CFY <rollback-example> [node_two_wrbed2] Ignoring task script_runner.tasks.run failure
2021-01-13 08:02:26.374  LOG <rollback-example> [node_one_ivbz80.stop] INFO: Downloaded resources/uninstall_fail.py to /tmp/1CWIY/uninstall_fail.py
2021-01-13 08:02:26.374  LOG <rollback-example> [node_one_ivbz80.stop] INFO: log and fail during uninstall!
2021-01-13 08:02:26.386  CFY <rollback-example> [node_two_wrbed2] Rollbacked node instance
2021-01-13 08:02:26.714  CFY <rollback-example> [node_one_ivbz80.stop] Task failed 'script_runner.tasks.run'
Traceback (most recent call last):
  File "/opt/mgmtworker/env/lib64/python3.6/site-packages/cloudify/dispatch.py", line 793, in main
    payload = handler.handle()
  File "/opt/mgmtworker/env/lib64/python3.6/site-packages/cloudify/dispatch.py", line 456, in handle
    result = self._run_operation_func(ctx, kwargs)
  File "/opt/mgmtworker/env/lib64/python3.6/site-packages/cloudify/dispatch.py", line 509, in _run_operation_func
    return self.func(*self.args, **kwargs)
  File "/opt/mgmtworker/env/lib64/python3.6/site-packages/script_runner/tasks.py", line 80, in run
    script_result = process_execution(script_func, script_path, ctx, process)
  File "/opt/mgmtworker/env/lib64/python3.6/site-packages/script_runner/tasks.py", line 156, in process_execution
    script_func(script_path, ctx, process)
  File "/opt/mgmtworker/env/lib64/python3.6/site-packages/script_runner/tasks.py", line 323, in eval_script
    exec(compile(open(script_path).read(), script_path, 'exec'), eval_globals)
  File "/tmp/1CWIY/uninstall_fail.py", line 4, in <module>
    raise Exception
Exception

2021-01-13 08:02:26.714  CFY <rollback-example> [node_one_ivbz80] Ignoring task script_runner.tasks.run failure
2021-01-13 08:02:26.714  CFY <rollback-example> [node_one_ivbz80] Stopped node instance
2021-01-13 08:02:26.801  CFY <rollback-example> [node_one_ivbz80] Rollback Delete: nothing to do, instance state is starting
2021-01-13 08:02:26.804  CFY <rollback-example> [node_one_ivbz80] Rollbacked node instance
2021-01-13 08:02:26.946  CFY <rollback-example> 'rollback' workflow execution succeeded
Finished executing workflow rollback on deployment rollback-example
* Run 'cfy events list 31c4189b-6404-430f-8839-5c933e058769' to retrieve the execution's events/logs

Note that even though node_one_ivbz80.stop and node_two_wrbed2.delete failed, the rollback still succeeded, because failures during rollback are ignored.

Check node instances states:

[root@9fbb5f2b0d4b offcial_examples]# cfy node-instances list
Listing all instances...

Node-instances:
+-------------------+------------------+---------+------------+---------------+------------+----------------+------------+
|         id        |  deployment_id   | host_id |  node_id   |     state     | visibility |  tenant_name   | created_by |
+-------------------+------------------+---------+------------+---------------+------------+----------------+------------+
|  node_one_ivbz80  | rollback-example |         |  node_one  |   configured  |   tenant   | default_tenant |   admin    |
| node_three_j9y2wc | rollback-example |         | node_three |    started    |   tenant   | default_tenant |   admin    |
|  node_two_wrbed2  | rollback-example |         |  node_two  | uninitialized |   tenant   | default_tenant |   admin    |
+-------------------+------------------+---------+------------+---------------+------------+----------------+------------+

Showing 3 of 3 node-instances

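Note that the rollback resolved both unresolved node instances: node_one_ivbz80 is now configured and node_two_wrbed2 is uninitialized, while node_three_j9y2wc remains started.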

Wrapper workflows

Nine wrapper workflows are introduced:

Each wrapper workflow wraps the execution of the corresponding lifecycle operation and adds an ignore_failure option.

For example, the alt_create workflow executes cloudify.interfaces.lifecycle.create.
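The wrapping idea can be illustrated with a short sketch. It is not the plugin's code: wrapped_operation_name and run_wrapped are hypothetical names, the alt_ naming pattern is inferred from the two workflows named in this document (alt_create and alt_stop), and the default value chosen here for ignore_failure is an assumption.

```python
# Hypothetical sketch of a wrapper workflow's behavior, not the plugin's code.
LIFECYCLE_PREFIX = "cloudify.interfaces.lifecycle."


def wrapped_operation_name(workflow_name):
    """Map a wrapper workflow name such as 'alt_stop' to its operation.

    The 'alt_<operation>' pattern is inferred from alt_create and alt_stop.
    """
    if not workflow_name.startswith("alt_"):
        raise ValueError("not a wrapper workflow: %s" % workflow_name)
    return LIFECYCLE_PREFIX + workflow_name[len("alt_"):]


def run_wrapped(operation, ignore_failure=False):
    """Run `operation` (a callable) and optionally ignore its failure.

    Returns True if the operation succeeded, and False if it failed and
    the failure was ignored. With ignore_failure=False the exception
    propagates and the caller fails. The default is an assumption.
    """
    try:
        operation()
    except Exception as error:
        if not ignore_failure:
            raise
        print("Ignoring failure: %r" % error)
        return False
    return True
```

With ignore_failure set to true, a failing operation is reported but does not fail the caller, which is the behavior the alt_stop example below shows.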

All the wrapper workflows share the same parameters:

Example

This example demonstrates how to call the alt_stop workflow.

[root@9fbb5f2b0d4b offcial_examples]# cfy executions start alt_stop -d rollback-example -p node_instance_ids=[node_one_ivbz80] -p ignore_failure=true
Executing workflow `alt_stop` on deployment `rollback-example` [timeout=900 seconds]
2021-01-13 08:40:47.049  CFY <rollback-example> Starting 'alt_stop' workflow execution
2021-01-13 08:40:47.163  CFY <rollback-example> [node_one_ivbz80] Starting operation cloudify.interfaces.lifecycle.stop
2021-01-13 08:40:47.353  CFY <rollback-example> [node_one_ivbz80.stop] Sending task 'script_runner.tasks.run'
2021-01-13 08:40:48.015  LOG <rollback-example> [node_one_ivbz80.stop] INFO: Downloaded resources/uninstall_fail.py to /tmp/TQ3V7/uninstall_fail.py
2021-01-13 08:40:48.016  LOG <rollback-example> [node_one_ivbz80.stop] INFO: log and fail during uninstall!
2021-01-13 08:40:48.241  CFY <rollback-example> [node_one_ivbz80.stop] Task failed 'script_runner.tasks.run'
Traceback (most recent call last):
  File "/opt/mgmtworker/env/lib64/python3.6/site-packages/cloudify/dispatch.py", line 793, in main
    payload = handler.handle()
  File "/opt/mgmtworker/env/lib64/python3.6/site-packages/cloudify/dispatch.py", line 456, in handle
    result = self._run_operation_func(ctx, kwargs)
  File "/opt/mgmtworker/env/lib64/python3.6/site-packages/cloudify/dispatch.py", line 509, in _run_operation_func
    return self.func(*self.args, **kwargs)
  File "/opt/mgmtworker/env/lib64/python3.6/site-packages/script_runner/tasks.py", line 80, in run
    script_result = process_execution(script_func, script_path, ctx, process)
  File "/opt/mgmtworker/env/lib64/python3.6/site-packages/script_runner/tasks.py", line 156, in process_execution
    script_func(script_path, ctx, process)
  File "/opt/mgmtworker/env/lib64/python3.6/site-packages/script_runner/tasks.py", line 323, in eval_script
    exec(compile(open(script_path).read(), script_path, 'exec'), eval_globals)
  File "/tmp/TQ3V7/uninstall_fail.py", line 4, in <module>
    raise Exception
Exception

2021-01-13 08:40:48.242  CFY <rollback-example> [node_one_ivbz80] Ignoring task script_runner.tasks.run failure
2021-01-13 08:40:48.242  CFY <rollback-example> [node_one_ivbz80] Finished operation cloudify.interfaces.lifecycle.stop
2021-01-13 08:40:48.303  CFY <rollback-example> 'alt_stop' workflow execution succeeded

Note that the cloudify.interfaces.lifecycle.stop operation, executed on the node instance passed as a parameter, failed, but the workflow as a whole did not.