Installing a Fully Distributed Cluster
Note: Make sure that your environment meets the prerequisites before you install Cloudify Manager and that you have read the installation and configuration guide and deployed the manager’s RPM.
Cloudify Cluster Architecture
Cloudify Manager 5.1 clusters are composed of three separate services that construct the entire Cloudify solution:
- Cloudify Management service – The Management service embeds the Cloudify workers framework, the REST API, the User Interface infrastructure and other backend services. The Cloudify Management service is a cluster of at least two Manager nodes running in an active/active mode.
- PostgreSQL database cluster – This service provides a high-availability PostgreSQL cluster based on Patroni. The cluster must consist of at least 3 nodes.
- RabbitMQ cluster – This service provides a high-availability RabbitMQ cluster based on the RabbitMQ best practices. The cluster must consist of 3 nodes.
Each of these services is accompanied by a customized monitoring service, which collects basic node metrics plus service-specific ones: RabbitMQ monitoring on message broker nodes, PostgreSQL monitoring on database nodes, and HTTP checks on manager nodes.
- An optional service is the load-balancer that is used to distribute the load between the different manager nodes.
Before you start the manual process of installing a Cloudify cluster, you might want to consider using the Cluster Manager package that automates it.
This guide describes the process of configuring and installing such a cluster.
If you use an externally hosted PostgreSQL or RabbitMQ (i.e. “bring your own”), make sure you go over all sections and read the information relevant to that case.
Note: Before you proceed, make sure that all the required VMs are running, that each has a public IP allocated, and that they are configured according to the prerequisites guide. If you follow Cloudify best practice, you need 9 VMs plus a load balancer. The VMs are partitioned into 3 PostgreSQL nodes, 3 RabbitMQ nodes, and 3 Cloudify Manager service nodes.
Certificates Setup
Please refer to the Cluster certificates setup guide.
Installing Services
The Cloudify Manager cluster best-practice consists of three main services: PostgreSQL Database, RabbitMQ, and a Cloudify Management Service.
Each of these services is a cluster of three nodes, and each node should be installed separately, in order.
Another optional service of the Cloudify Manager cluster is the Management Service Load Balancer, which should be installed after all the other components.
The following sections describe how to install and configure Cloudify Manager cluster services. The order of installation should be as follows:
- PostgreSQL Database Cluster
- RabbitMQ Cluster
- Cloudify Management Service
- Management Service Load Balancer
Preparation
- Ensure you have nine VMs with cfy_manager available on each (that is, download the manager RPM with curl and run sudo yum install <Cloudify RPM>).
- All VMs should be on the same network. If a firewall or security group is in place, make sure the ports in use are open and that none of the services is blocked. See the prerequisites page for the ports used by PostgreSQL, RabbitMQ, and the manager.
- Copy the Cloudify license to each instance.
- Copy the /home/centos/.cloudify-test-ca directory from the VM where you generated the certificates to the same location on the other VMs.
All of the certificates in this example reside in the .cloudify-test-ca directory because they are test certificates generated with the cfy_manager generate-test-cert command. Generally, each instance needs only its own certificates, not those of all instances, and the certificates can be stored in a different location (just specify the paths in the node's config.yaml).
(In these examples the home directory is /home/centos.)
PostgreSQL Database Cluster
The PostgreSQL database high-availability cluster is comprised of 3 nodes (Cloudify best-practice) or more.
Note: Make sure the following ports are open for each node:
Port | Description |
---|---|
tcp/2379 | etcd port. |
tcp/2380 | etcd port. |
tcp/5432 | PostgreSQL connection port. |
tcp/8008 | Patroni control port. |
tcp/8009 | Monitoring service port. |
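The port requirements above can be spot-checked from another cluster node. The following is a minimal Python sketch, not part of Cloudify: the names POSTGRES_NODE_PORTS, is_port_open, and closed_ports are hypothetical. A port only reports as open once a service is listening on it, so the check is most useful after the node has been installed.

```python
import socket

# Ports listed in the table above for a PostgreSQL cluster node.
POSTGRES_NODE_PORTS = {
    2379: "etcd",
    2380: "etcd",
    5432: "PostgreSQL",
    8008: "Patroni control",
    8009: "monitoring service",
}


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def closed_ports(host, ports=POSTGRES_NODE_PORTS, timeout=2.0):
    """Return the sorted list of ports on host that did not accept a connection."""
    return sorted(p for p in ports if not is_port_open(host, p, timeout))
```

Calling closed_ports with the private IP of a database node returns an empty list when every listed port accepted a connection.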
Locally Hosted PostgreSQL Database Cluster Installation
Configure the following settings in /etc/cloudify/config.yaml for each PostgreSQL node:
manager:
  private_ip: <ip of this host>
  public_ip: <ip of this host>
postgresql_server:
  postgres_password: '<strong password for postgres superuser>'
  cert_path: '<path to certificate for this server>'
  key_path: '<path to key for this server>'
  ca_path: '<path to ca certificate>'
  cluster:
    nodes:
      <first postgresql instance-name>:
        ip: <private ip of postgres server 1>
      <second postgresql instance-name>:
        ip: <private ip of postgres server 2>
      <third postgresql instance-name>:
        ip: <private ip of postgres server 3>
    # Should be the same on all nodes
    etcd:
      cluster_token: '<a strong secret string (password-like)>'
      root_password: '<strong password for etcd root user>'
      patroni_password: '<strong password for patroni to interface with etcd>'
    # Should be the same on all nodes
    patroni:
      rest_password: '<strong password for patroni REST user>'
    # Should be the same on all nodes
    postgres:
      replicator_password: '<strong password for replication user>'
# For monitoring service(status reporter)
prometheus:
  credentials:
    username: <monitoring username>
    password: <strong password for monitoring user>
  cert_path: <certificate for prometheus, cert_path of this postgresql_server can be used>
  key_path: <key for prometheus, key_path of postgresql_server can be used>
  ca_path: <ca for prometheus, ca_path of postgresql_server can be used>
  postgres_exporter:
    # `password` is a placeholder and will be updated during config file rendering, based on postgresql_server.postgres_password
    password: ''
    sslmode: require
services_to_install:
  - database_service
  - monitoring_service
Execute on each node sequentially (i.e. do not start installing the next instance until the previous one has been successfully installed):
cfy_manager install [--private-ip <PRIVATE_IP>] [--public-ip <PUBLIC_IP>] [-v]
After installing all nodes, verify on one node that everything looks healthy with: cfy_manager dbs list
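The sequential rule above can be sketched as a small driver script. This is an illustration under assumptions, not a Cloudify tool: it assumes key-based ssh access to every node and that config.yaml is already in place on each one. The names run_over_ssh and install_sequentially are hypothetical.

```python
import subprocess


def run_over_ssh(host, command):
    """Run command on host through the local ssh client; return its exit code."""
    return subprocess.run(["ssh", host, command]).returncode


def install_sequentially(hosts, run=run_over_ssh, command="cfy_manager install -v"):
    """Install hosts one at a time, stopping at the first failure.

    Returns the list of hosts that were installed successfully, in order.
    """
    installed = []
    for host in hosts:
        if run(host, command) != 0:
            # Do not touch the remaining nodes until this one is fixed.
            break
        installed.append(host)
    return installed
```

If the returned list is shorter than the input list, the first missing host is the one whose installation failed and should be investigated before continuing.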
Example:
manager:
  private_ip: <ip of this host>
  public_ip: <ip of this host>
postgresql_server:
  postgres_password: 'areallystrongandsecretpasswordforpostgres'
  cert_path: '/home/centos/.cloudify-test-ca/<ip of this host>.crt'
  key_path: '/home/centos/.cloudify-test-ca/<ip of this host>.key'
  ca_path: '/home/centos/.cloudify-test-ca/ca.crt'
  cluster:
    nodes:
      <first postgresql instance-name>:
        ip: <private ip of postgres server 1>
      <second postgresql instance-name>:
        ip: <private ip of postgres server 2>
      <third postgresql instance-name>:
        ip: <private ip of postgres server 3>
    etcd:
      cluster_token: 'astrongandsecretpasswordlikestring'
      root_password: 'anotherstrongandsecretbutdifferentpassword'
      patroni_password: 'yetanotherstrongandsecretpassword'
    patroni:
      rest_user: patroni
      rest_password: 'strongandsecretpatronirestpassword'
    postgres:
      replicator_password: 'stillanotherstrongandsecretpassword'
# For monitoring service(status reporter)
prometheus:
  credentials:
    username: 'monitoringusername'
    password: 'longyeteasytorememberstringasapassword'
  cert_path: '/home/centos/.cloudify-test-ca/<ip of this host>.crt'
  key_path: '/home/centos/.cloudify-test-ca/<ip of this host>.key'
  ca_path: '/home/centos/.cloudify-test-ca/ca.crt'
  postgres_exporter:
    # `password` is a placeholder and will be updated during config file rendering, based on postgresql_server.postgres_password
    password: ''
    sslmode: require
services_to_install:
  - database_service
  - monitoring_service
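Several settings in the PostgreSQL configuration are marked "Should be the same on all nodes". A quick consistency check over the per-node configurations can catch a mismatch before installation. The sketch below is a hypothetical helper, not part of cfy_manager. It assumes each config.yaml has already been loaded into a Python dict (for example with a YAML parser) and that the etcd, patroni, and postgres sections are nested under postgresql_server.cluster.

```python
# Key paths below postgresql_server that must match on every node.
# The nesting under "cluster" is an assumption of this sketch.
SHARED_KEYS = [
    ("cluster", "etcd", "cluster_token"),
    ("cluster", "etcd", "root_password"),
    ("cluster", "etcd", "patroni_password"),
    ("cluster", "patroni", "rest_password"),
    ("cluster", "postgres", "replicator_password"),
]


def lookup(config, path):
    """Follow a sequence of keys into a nested dict and return the value."""
    node = config
    for key in path:
        node = node[key]
    return node


def mismatched_shared_keys(node_configs):
    """Return the dotted key paths whose values differ between node configs."""
    mismatched = []
    for path in SHARED_KEYS:
        values = {lookup(cfg["postgresql_server"], path) for cfg in node_configs}
        if len(values) > 1:
            mismatched.append(".".join(path))
    return mismatched
```

An empty result means every shared secret is identical across the supplied node configurations.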
Externally Hosted PostgreSQL Database Installation
Cloudify supports Microsoft’s Azure Database for Postgres as an external database option that replaces Cloudify’s own PostgreSQL deployment. To use Azure Database for Postgres, see the external database installation guide.
RabbitMQ Cluster
The RabbitMQ service is a cluster of any number of nodes; Cloudify best practice is three nodes.
Note: Refer to the Ports section of the RabbitMQ networking guide to verify which ports must be open for a RabbitMQ cluster installation. Port tcp/8009 must also be open to allow access to the monitoring service.
Locally Hosted RabbitMQ Cluster Installation
Configure and install the first RabbitMQ node and then the rest of the nodes.
For the first RabbitMQ node, configure the following settings in /etc/cloudify/config.yaml:
manager:
  private_ip: <ip of this host>
  public_ip: <ip of this host>
rabbitmq:
  username: '<secure username for queue management>'
  password: '<secure password for queue management>'
  cluster_members:
    <host name of rabbit server 1>:
      networks:
        default: <private ip of rabbit server 1>
    <host name of rabbit server 2>:
      networks:
        default: <private ip of rabbit server 2>
    <host name of rabbit server 3>:
      networks:
        default: <private ip of rabbit server 3>
  cert_path: '<path to certificate for this server>'
  key_path: '<path to key for this server>'
  ca_path: '<path to ca certificate>'
  nodename: '<short host name of this rabbit server>'
  # Should be the same on all nodes
  erlang_cookie: '<a strong secret string (password-like)>'
# For monitoring service(status reporter)
prometheus:
  credentials:
    username: <monitoring username>
    password: <strong password for monitoring user>
  cert_path: <certificate for prometheus, cert_path of this rabbitmq can be used>
  key_path: <key for prometheus, key_path of rabbitmq can be used>
  ca_path: <ca for prometheus, ca_path of rabbitmq can be used>
services_to_install:
  - queue_service
  - monitoring_service
For the remaining RabbitMQ nodes, add join_cluster to the rabbitmq section, as in the following config.yaml:
manager:
  private_ip: <ip of this host>
  public_ip: <ip of this host>
rabbitmq:
  username: '<secure username for queue management>'
  password: '<secure password for queue management>'
  cluster_members:
    <host name of rabbit server 1>:
      networks:
        default: <private ip of rabbit server 1>
    <host name of rabbit server 2>:
      networks:
        default: <private ip of rabbit server 2>
    <host name of rabbit server 3>:
      networks:
        default: <private ip of rabbit server 3>
  cert_path: '<path to certificate for this server>'
  key_path: '<path to key for this server>'
  ca_path: '<path to ca certificate>'
  nodename: '<short host name of this rabbit server>'
  # Should be the same on all nodes
  erlang_cookie: '<a strong secret string (password-like)>'
  join_cluster: '<hostname of first rabbit server>'
# For monitoring service(status reporter)
prometheus:
  credentials:
    username: <monitoring username>
    password: <strong password for monitoring user>
  cert_path: <certificate for prometheus, cert_path of this rabbitmq can be used>
  key_path: <key for prometheus, key_path of rabbitmq can be used>
  ca_path: <ca for prometheus, ca_path of rabbitmq can be used>
services_to_install:
  - queue_service
  - monitoring_service
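The only difference between the first node's rabbitmq section and the others is the join_cluster key. The following Python sketch shows how the per-node sections could be derived from one member list. It is a hypothetical helper, not a Cloudify API, and it assumes the key used in cluster_members is also the node's short host name (nodename). Per-node certificate paths are left for the caller to add.

```python
import copy


def rabbitmq_sections(members, common):
    """Build the rabbitmq config section for each node.

    members: mapping of host name -> private IP; the first entry is the
             node that is installed first and that the others join.
    common:  settings shared by all nodes (username, password, erlang_cookie).
    """
    cluster_members = {
        host: {"networks": {"default": ip}} for host, ip in members.items()
    }
    first = next(iter(members))
    sections = {}
    for host in members:
        section = dict(common)
        section["cluster_members"] = copy.deepcopy(cluster_members)
        section["nodename"] = host
        if host != first:
            # Every node except the first joins the first node's cluster.
            section["join_cluster"] = first
        sections[host] = section
    return sections
```

Each resulting dict corresponds to the rabbitmq section of one node's config.yaml; the manager, prometheus, and services_to_install sections are the same for every node apart from the host-specific IPs and certificate paths.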
Execute on each node sequentially (i.e. do not start installing the next node until the previous one has been successfully installed):
cfy_manager install [--private-ip <PRIVATE_IP>] [--public-ip <PUBLIC_IP>] [-v]
After installing all nodes, verify on one node that everything looks healthy with: cfy_manager brokers list
Example:
manager:
  private_ip: <ip of this host>
  public_ip: <ip of this host>
rabbitmq:
  username: cloudify
  password: areallystrongandsecretpasswordforrabbit
  cert_path: '/home/centos/.cloudify-test-ca/<ip of this host>.crt'
  key_path: '/home/centos/.cloudify-test-ca/<ip of this host>.key'
  ca_path: '/home/centos/.cloudify-test-ca/ca.crt'
  cluster_members:
    <host name of rabbit server 1>:
      networks:
        default: <private ip of rabbit server 1>
    <host name of rabbit server 2>:
      networks:
        default: <private ip of rabbit server 2>
    <host name of rabbit server 3>:
      networks:
        default: <private ip of rabbit server 3>
  nodename: 'my_rabbitmq_host_2'
  join_cluster: 'my_rabbitmq_host_1'
  erlang_cookie: anothersecurepasswordlikestring
# For monitoring service(status reporter)
prometheus:
  credentials:
    username: 'monitoringusername'
    password: 'longyeteasytorememberstringasapassword'
  cert_path: '/home/centos/.cloudify-test-ca/<ip of this host>.crt'
  key_path: '/home/centos/.cloudify-test-ca/<ip of this host>.key'
  ca_path: '/home/centos/.cloudify-test-ca/ca.crt'
services_to_install:
  - queue_service
  - monitoring_service
Externally Hosted RabbitMQ Installation
In order to install externally hosted RabbitMQ see external database installation guide.
Cloudify Management Service
The Cloudify Management service is a cluster of two to ten nodes; Cloudify best practice is three nodes.
- Make sure the following ports are open for each node:
Port | Description |
---|---|
tcp/80 | REST API and UI. For improved security we recommend using secure communication (SSL). If your system is configured for SSL, this port should be closed. |
tcp/443 | REST API and UI. |
tcp/22 | For remote access to the manager from the Cloudify CLI. |
tcp/5671 | RabbitMQ. This port must be accessible from agent VMs. |
tcp/53333 | Internal REST communications. This port must be accessible from agent VMs. |
tcp/5432 | PostgreSQL connection port. |
tcp/8008 | Patroni control port. |
tcp/8009 | Monitoring service port. |
tcp/22000 | Filesystem replication port. |
- Please notice the ‘networks’ section in the config.yaml file. In case you use a load-balancer, you would need to specify its private IP in order for the different agents to connect to it. Please see further explanation in the “Accessing the Load Balancer Using Cloudify Agents” section of this guide under “Management Service Load Balancer”.
Configure the following settings in /etc/cloudify/config.yaml for each Manager service cluster node:
Note: If you want to use an externally hosted PostgreSQL database with an internally hosted RabbitMQ, or vice versa, take the relevant sections from the following examples and use them in your configuration.
In case of an internally hosted PostgreSQL database and an internally hosted RabbitMQ:
manager:
  private_ip: <ip of this host>
  public_ip: <ip of this host>
  security:
    ssl_enabled: true
    admin_password: '<strong admin password for Cloudify>'
  cloudify_license_path: '<path to Cloudify license file>'
  monitoring:
    username: <monitoring username>
    password: <strong password for monitoring user>
rabbitmq:
  username: '<username you configured for queue management on rabbit, needs to be the same as in the RabbitMQ nodes config.yaml>'
  password: '<strong password you configured for queue management on rabbit>'
  ca_path: '<path to ca certificate>'
  cluster_members:
    <host name of rabbit server 1>:
      networks:
        default: <private ip of rabbit server 1>
    <host name of rabbit server 2>:
      networks:
        default: <private ip of rabbit server 2>
    <host name of rabbit server 3>:
      networks:
        default: <private ip of rabbit server 3>
  monitoring:
    username: <monitoring username>
    password: <strong password for monitoring user>
postgresql_server:
  ca_path: <path to postgresql ca certificate>
  postgres_password: <the postgresql server password>
  cluster:
    nodes:
      <first postgresql instance-name>:
        ip: <private ip of postgres server 1>
      <second postgresql instance-name>:
        ip: <private ip of postgres server 2>
      <third postgresql instance-name>:
        ip: <private ip of postgres server 3>
postgresql_client:
  ssl_enabled: true
  # Same password as the one of the PostgreSQL server.
  # THE PASSWORD WILL BE REMOVED FROM THE FILE AFTER THE INSTALLATION FINISHES
  server_password: '<the postgresql server password>'
  # If true, client SSL certificates will need to be supplied for database connections
  ssl_client_verification: true
  monitoring:
    username: <monitoring username>
    password: <strong password for monitoring user>
# In case you use a load-balancer, you would need to specify its private IP
# in order for the different agents to connect to it.
networks:
  load-balancer: <load-balancer private IP address>
ssl_inputs:
  internal_cert_path: '<path to this host certificate generated in the first step>'
  internal_key_path: '<path to this host key generated in the first step>'
  external_cert_path: '<can be same as internal_cert_path(for CLI)>'
  external_key_path: '<can be same as internal_key_path(for CLI)>'
  ca_cert_path: '<path to this host ca certificate>'
  external_ca_cert_path: '<path to external ca certificate for this server, can be the same one as ca_cert_path>'
  postgresql_client_cert_path: '<path to cert for this server>'
  postgresql_client_key_path: '<path to key for this server>'
# For monitoring service(status reporter)
prometheus:
  blackbox_exporter:
    ca_cert_path: <ca path for blackbox exporter>
  credentials:
    username: <monitoring username>
    password: <strong password for monitoring user>
  cert_path: <certificate for prometheus, cert_path of this host can be used>
  key_path: <key for prometheus, key_path of this host can be used>
  ca_path: <ca for prometheus, ca_path of this host can be used>
services_to_install:
  - manager_service
  - monitoring_service
Execute on each node sequentially (i.e. do not start installing the next manager until the previous one has been successfully installed):
cfy_manager install [--private-ip <PRIVATE_IP>] [--public-ip <PUBLIC_IP>] [-v]
Example:
manager:
  private_ip: <ip of this host>
  public_ip: <ip of this host>
  security:
    ssl_enabled: true
    admin_password: strongsecretadminpassword
  cloudify_license_path: /home/centos/license.yaml
  monitoring:
    username: monitoringusername
    password: longyeteasytorememberstringasapassword
rabbitmq:
  username: cloudify
  password: areallystrongandsecretpasswordforrabbit
  ca_path: '/home/centos/.cloudify-test-ca/ca.crt'
  cluster_members:
    <host name of rabbit server 1>:
      networks:
        default: <private ip of rabbit server 1>
    <host name of rabbit server 2>:
      networks:
        default: <private ip of rabbit server 2>
    <host name of rabbit server 3>:
      networks:
        default: <private ip of rabbit server 3>
  monitoring:
    username: monitoringusername
    password: longyeteasytorememberstringasapassword
postgresql_server:
  ca_path: '/home/centos/.cloudify-test-ca/ca.crt'
  postgres_password: areallystrongandsecretpasswordforpostgres
  cluster:
    nodes:
      <first postgresql instance-name>:
        ip: <private ip of postgres server 1>
      <second postgresql instance-name>:
        ip: <private ip of postgres server 2>
      <third postgresql instance-name>:
        ip: <private ip of postgres server 3>
postgresql_client:
  ssl_enabled: true
  # Same password as the one of the PostgreSQL server.
  # THE PASSWORD WILL BE REMOVED FROM THE FILE AFTER THE INSTALLATION FINISHES
  server_password: 'areallystrongandsecretpasswordforpostgres'
  # If true, client SSL certificates will need to be supplied for database connections
  ssl_client_verification: true
  monitoring:
    username: monitoringusername
    password: longyeteasytorememberstringasapassword
# In case you use a load-balancer, you would need to specify its private IP
# in order for the different agents to connect to it.
networks:
  load-balancer: <load-balancer private IP address>
ssl_inputs:
  internal_cert_path: '/home/centos/.cloudify-test-ca/<ip of this host>.crt'
  internal_key_path: '/home/centos/.cloudify-test-ca/<ip of this host>.key'
  external_cert_path: '/home/centos/.cloudify-test-ca/<ip of this host>.crt'
  external_key_path: '/home/centos/.cloudify-test-ca/<ip of this host>.key'
  ca_cert_path: '/home/centos/.cloudify-test-ca/ca.crt'
  external_ca_cert_path: '/home/centos/.cloudify-test-ca/ca.crt'
  postgresql_client_cert_path: '/home/centos/.cloudify-test-ca/<ip of this host>.crt'
  postgresql_client_key_path: '/home/centos/.cloudify-test-ca/<ip of this host>.key'
# For monitoring service(status reporter)
prometheus:
  blackbox_exporter:
    ca_cert_path: '/home/centos/.cloudify-test-ca/ca.crt'
  credentials:
    username: 'monitoringusername'
    password: 'longyeteasytorememberstringasapassword'
  cert_path: '/home/centos/.cloudify-test-ca/<ip of this host>.crt'
  key_path: '/home/centos/.cloudify-test-ca/<ip of this host>.key'
  ca_path: '/home/centos/.cloudify-test-ca/ca.crt'
services_to_install:
  - manager_service
  - monitoring_service
Management Service Load Balancer
The Cloudify setup requires a load-balancer to direct the traffic across the Cloudify Management service cluster nodes. Any load-balancer can be used provided that the following are supported:
- The load-balancer directs the traffic over the following ports to the Manager nodes based on round robin or any other load sharing policy:
- Port 443 - REST API & UI.
- Port 8009 - Monitoring Service to Manager communication and between the Monitoring Services.
- Port 53333 - Agents to Manager communication.
- Note: Port 80 is not listed and should not be load balanced, because the recommended approach is to use SSL.
- Session stickiness must be kept.
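The two balancing requirements above, a load-sharing policy such as round robin plus session stickiness, can be illustrated with a short simulation. This Python sketch is a toy model, not how any particular load-balancer is implemented. It keys stickiness on the client source IP, which is one common choice, and the class name StickyRoundRobin is hypothetical.

```python
import itertools


class StickyRoundRobin:
    """Assign new clients round-robin; keep known clients on their manager."""

    def __init__(self, backends):
        self._cycle = itertools.cycle(backends)
        self._assigned = {}  # client IP -> backend chosen on first contact

    def pick(self, client_ip):
        """Return the backend that should serve client_ip."""
        if client_ip not in self._assigned:
            self._assigned[client_ip] = next(self._cycle)
        return self._assigned[client_ip]
```

A client that has been seen before always gets the same manager node back, while first-time clients are spread across the nodes in turn. A real load-balancer would also expire old entries and skip unhealthy nodes.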
Accessing the Load Balancer Using Cloudify Agents
If you use a load-balancer and want Cloudify Agents to communicate with it instead of with a specific Cloudify Manager node, follow the Multi-Network Management guide and specify the load-balancer private IP as the value of the ‘external’ key under ‘networks’. Moreover, if you want all Cloudify Agent communication to go through the load-balancer, specify its private IP as the value of the ‘default’ key under ‘networks’ (as shown in the config.yaml above).
Installing a Load Balancer
Note: Although the load-balancer is not provided by Cloudify, here is a simple example that uses HAProxy as a load-balancer.
To use HAProxy as a load-balancer, first install HAProxy on your machine and set up the relevant certificates.
Then configure HAProxy as the Cloudify Managers’ load-balancer using the following configuration:
global
maxconn 100
tune.ssl.default-dh-param 2048
defaults
log global
retries 2
timeout client 30m
timeout connect 4s
timeout server 30m
timeout check 5s
listen manager
bind *:80
bind *:443 ssl crt /etc/haproxy/cert.pem
redirect scheme https if !{ ssl_fc }
mode http
option forwardfor
stick-table type ip size 1m expire 1h
stick on src
option httpchk GET /api/v3.1/status
http-check expect status 401
default-server inter 3s fall 3 rise 2 on-marked-down shutdown-sessions
server manager_<first manager private-ip> <first manager public-ip> maxconn 100 ssl check check-ssl port 443 ca-file /etc/haproxy/ca.crt
server manager_<second manager private-ip> <second manager public-ip> maxconn 100 ssl check check-ssl port 443 ca-file /etc/haproxy/ca.crt
server manager_<third manager private-ip> <third manager public-ip> maxconn 100 ssl check check-ssl port 443 ca-file /etc/haproxy/ca.crt
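The health-check line in the configuration above (inter 3s fall 3 rise 2, with an expected status of 401) means a manager is taken out of rotation after three consecutive failed checks and returned after two consecutive successful ones. The expected 401 is presumably the REST service's reply to an unauthenticated status request, which still shows that the service is answering. The Python sketch below models only that counting rule. It is a simplified illustration, not HAProxy's implementation, and the class name ServerHealth is hypothetical.

```python
class ServerHealth:
    """Track up/down state from consecutive health-check results."""

    def __init__(self, fall=3, rise=2, up=True):
        self.fall = fall    # consecutive failures needed to mark the server down
        self.rise = rise    # consecutive successes needed to mark it up again
        self.up = up
        self._streak = 0    # consecutive results that contradict the current state

    def record(self, status_code, expected=401):
        """Record one check result and return the resulting up/down state."""
        ok = status_code == expected
        if ok == self.up:
            # The result agrees with the current state, so reset the streak.
            self._streak = 0
        else:
            self._streak += 1
            threshold = self.fall if self.up else self.rise
            if self._streak >= threshold:
                self.up = not self.up
                self._streak = 0
        return self.up
```

With a check every 3 seconds, this rule means a failed manager is detected after roughly three check intervals, and a single failed check on its own does not remove a node.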
Post Installation
Update the CLI
Update all remote CLI instances (not hosted on the manager) to the newly deployed Cloudify version. Please refer to the CLI installation guide for further instructions.
Run the following command from the client in order to connect to the load-balancer:
cfy profiles use <load-balancer host ip> -u <username> -p <password> -t <tenant-name>
If you did not specify the license path in the config.yaml file of the Manager installation, you can upload a valid Cloudify license from the client using the following command:
cfy license upload <path to the license file>
Day 2 cluster operations
Please refer to the Day 2 cluster operations guide for further operations regarding the Cloudify active-active cluster.