Cloudify System Processes and Logging Guide

The purpose of this document is to provide detailed information for:

Cloudify System Processes

In a Cloudify Manager environment, the following system processes exist:

User Command Description
cfyuser nginx: master process /usr/sbin/nginx -c /etc/nginx/nginx.conf Nginx web server (REST API) root process
nginx nginx: worker process Nginx web server (REST API) child process
stage_u+ /opt/nodejs/bin/node /opt/cloudify-stage/backend/server.js React.js web application (Cloudify front-end web UI)
cfyuser /opt/mgmtworker/env/bin/python /opt/mgmtworker/env/bin/celery worker Cloudify Manager management worker Celery processes
amqpinf+ /opt/amqpinflux/env/bin/python /opt/amqpinflux/env/bin/cloudify-amqp-influxdb Cloudify-specific RabbitMQ-to-InfluxDB transport
influxdb /usr/bin/influxdb -config=/opt/influxdb/shared/ config.toml InfluxDB
rabbitmq su rabbitmq -s /bin/sh -c /usr/lib/rabbitmq/bin/rabbitm q-server RabbitMQ service
cfyuser /opt/manager/env/bin/python /opt/manager/env/bin/gunicorn Gunicorn HTTP server
postgres /usr/pgsql-9.5/bin/postgres -D /var/lib/pgsql/9.5/data PostgreSQL database
cfyuser+ /opt/consul/consul agent -config-dir /etc/consul.d Consul
cfyuser /opt/manager/env/bin/python /opt/manager/env/bin/consul_watcher Consul-watcher

Cloudify SysV Init Services

Service Description
cloudify-amqpinflux Cloudify-specific RabbitMQ-to-InfluxDB transport service
cloudify-influxdb InfluxDB service
cloudify-mgmtworker Cloudify Manager management worker Celery service
cloudify-rabbitmq RabbitMQ service
cloudify-restservice Cloudify REST service
cloudify-riemann Cloudify policy manager
cloudify-stage Cloudify UI service
cloudify-check-runner.service check runner
cloudify-consul-watcher.service consul members watcher
cloudify-consul.service consul
cloudify-handler-runner.service handler runner
cloudify-postgresql.service PostgreSQL 9.5 database server
cloudify-syncthing.service Files syncthing

Cloudify Service Configuration Defaults

All Cloudify-specific service configurations can be found in /etc/sysconfig. This area is where default configuration data can be found as well as logging locations for service-specific troubleshooting. These are very useful when trying to understand how a service was instantiated and what logging configuration is being used.

This directory can also be used to derived each core service’s SysV init name. For instance, enumerating /etc/sysconfig will show a file called cloudify-amqpinflux. This is the name of the service, and thus to query the service status can be done using the command service cloudify-amqpinflux status.

Discovering Cloudify Services and Service

Statuses

The sections above describe how to identify a Cloudify service by looking directly at the output
of something like ps or by folder snooping. This is not always practical or desired and there are
other, more developer-friendly, ways of enumerating which Cloudify services are present and
how to harvest information about them.

The best starting point is to utilize the REST API of the manager to get service information.
Simply craft a GET request for the status endpoint: GET /status HTTP/1.1
If cURL and Python are available, it’s very easy to make the request as well as visualize the
returned data.

Code Block 1 REST

curl -X GET http://<manager-ip>/status | python -m json.tool

An example, partial, return is as follows:

Code Block 2 JSON

{
"services": [{
"display_name": "InfluxDB",
"instances": [{
"ActiveState": "active",
"Description": "InfluxDB Service",
"Id": "cloudify-influxdb.service",
"LoadState": "loaded",
"MainPID": 13129,
"SubState": "running",
"state": "running"
}
]
}, {
"display_name": "Celery Management"
"instances": [{
"ActiveState": "active",
"Description": "Cloudify Management Worker Service",
"Id": "cloudify-mgmtworker.service",
"LoadState": "loaded",
"MainPID": 18161,
"SubState": "running",
"state": "running"
}
]
}, {
"display_name": "RabbitMQ",
"instances": [{
"ActiveState": "active",
"Description": "RabbitMQ Service",
"Id": "cloudify-rabbitmq.service",
"LoadState": "loaded",
"MainPID": 12322,
"SubState": "running",
"state": "running"
}
]
},
]
}

With this information, in standard JSON format, it is easy to match a core Cloudify service with a
system-level process ID (MainPID) to begin further troubleshooting.

Cluster status

Cloudify provides API to get information about current cluster state. Using first it is easy to determine if the manager is in a cluster state.

Code Block 3 REST

curl -u user:password "http://<manager-ip>/api/v3.1/cluster"

Code Block 4 JSON

{
"initialized": true,
"consul": {
"leader": "172.20.0.3:8300"
},
"error": null,
"logs": [
{
"message": "HA Cluster configuration complete",
"timestamp": 1485778546965628,
"cursor": "opaque cursor value"
}
]
} 

While the second call response provides information about each node in the cluster and indicates which node is the current master.

Code Block 5 REST

curl --header -u user:password "http://<manager-ip>/api/v3.1/cluster/nodes"

Code Block 6 JSON

{
"items":
[
{
"initialized": true,
"online": true,
"master": true,
"host_ip": "172.20.0.2",
"name": "cloudify_manager_LMJZA2",
"credentials": "<REDACTED>"
}
]
}

Checking Manager Components

RabbitMQ

System Service

To check if the RabbitMQ broker is running (and to see many other details such as which applications are running, memory allocation, and other performance metrics), simply run the following command:

Code Block 7 bash

sudo rabbitmqctl -n cloudify-manager@localhost status 

An error message will be presented if the service has an issue such as a failed broker.

Management Operations

To get started working with the RabbitMQ management interface, the management interface must be enabled via a plugin. Execute the following to enable the management plugin:

sudo rabbitmq-plugins -n cloudify-manager@localhost enable rabbitmq_management

Once this is complete, there will be a management web interface located at http://:15672/

In order to utilize the web interface, you will need to have the RabbitMQ username and password for authentication. This can be found in the /etc/cloudify/config.yaml file used for instantiation a Cloudify manager.

By default, the user created from the manager instantiation process does not have sufficient permissions to be used with the web interface. Use the following command to promote the default user with the “monitoring” permission (or you can alternatively assign the “administrator” tag).

sudo rabbitmqctl set_user_tags <username> monitoring

You can now use the RabbitMQ username and password to log in via the web interface to do actions such as view queues, get messages, monitor performance, and monitor connections.

Celery

**System Service ** The best way to tell if Celery is alive and healthy is to perform a “ping”. Before working with Celery, it is necessary to know the RabbitMQ username and password for the service. Please refer to the RabbitMQ section in this document titled “Management Operations” to find your username and password. Here is how to query Celery for liveness:

  1. Go to the management worker directory: cd /opt/mgmtworker/
  2. Load the Python virtual environment: source env/bin/activate
  3. “Ping” the Celery workers: celery inspect --broker="amqp://<RabbitMQ username>:<RabbitMQ password>@localhost//" ping

A successful response will be similar to this:

_-> celery@cloudify.management: OK_ \
_pong_

InfluxDB

System Service InfluxDB exposes a RESTful API that can be used for status checking, reading/writing data, and executing SQL-like queries. To check if the service is running and is healthy, we can check the “/ping” endpoint using the default InfluxDB credentials of “root”:“root”.

Code Block 8 REST

curl 'http://localhost:8086/ping?u=root&p=root' 

A successful response would be similar to this:

Code Block 9 JSON

{"status":"ok"} 

Additional information can be found in the Influxdata documentation. There’s also a web interface that’s available for use at http://:8083/ from any system that has access to port 8083 and 8086 on the manager. The “Hostname and Port Settings” area must have the hostname set to the externally visible manager IP and the port set to 8086.

PostgreSQL

System Service

To verify if postgres is working correctly a simple select can be executed:

Code Block 10 bash

sudo -u postgres psql --port 15432 -c "select 1"

Consul

Consul status can be checked in the following way:

Code Block 11 REST

sudo curl --cacert /etc/cloudify/ssl/cloudify_internal_ca_cert.pem --cert /etc/cloudify/cluster-ssl/consul_client.crt --key /etc/cloudify/cluster-ssl/consul_client.key https://localhost:8500/v1/status/leader

Syncthing

Checking if syncthing is working correctly will need a curl command to the REST API.

Code Block 12 REST

curl -H "X-Api-Key: <key>" 127.0.0.1:8384/rest/system/status

The key can be gathered from: //configuration/gui/apikey in /opt/syncthing/.config/syncthing/config.xml

Logging

Overview

Log locations vary from service to service, but the majority of logs can be found in /var/log and /var/log/cloudify.

Within these folders are folders for each service with distinguishable names such as “rabbitmq” and “postgres”. If logs for a service aren’t found here, the next place to look would be in the service configuration defaults file for any indication of a log file path (see the section “Cloudify Service Configuration Defaults”).

Cloudify Agent Worker Logs

Cloudify agent worker logs can be found on deployed instances / virtual machines with an installed Cloudify agent. Typically, the logs are stored in the Cloudify agent user’s home directory in a folder named after the node instance ID for the instance / VM.

Cloudify Management Worker Logs

Cloudify REST API Service Logs

PostgreSQL Logs

RabbitMQ Logs

Cluster Logs