Cloudify System Processes and Logging Guide

Get the latest docs.

You are looking at documentation for an older release. Not what you want? Use the version selector on the top banner or click here.

The purpose of this document is to provide detailed information for:

Identifying Cloudify Manager’s processes
Defining how these processes should be tracked for monitoring and alerting
Defining locations of Cloudify Manager log files

Cloudify System Processes

In a Cloudify Manager environment, the following system processes exist:

User	Command	Description
cfyuser	nginx: master process /usr/sbin/nginx -c /etc/nginx/nginx.conf	Nginx web server (REST API) root process
nginx	nginx: worker process	Nginx web server (REST API) child process
stage_u+	/opt/nodejs/bin/node /opt/cloudify-stage/backend/server.js	React.js web application (Cloudify front-end web UI)
cfyuser	/opt/mgmtworker/env/bin/python /opt/mgmtworker/env/bin/celery worker	Cloudify Manager management worker Celery processes
amqpinf+	/opt/amqpinflux/env/bin/python /opt/amqpinflux/env/bin/cloudify-amqp-influxdb	Cloudify-specific RabbitMQ-to-InfluxDB transport
influxdb	/usr/bin/influxdb -config=/opt/influxdb/shared/ config.toml	InfluxDB
rabbitmq	su rabbitmq -s /bin/sh -c /usr/lib/rabbitmq/bin/rabbitm q-server	RabbitMQ service
cfyuser	/opt/manager/env/bin/python /opt/manager/env/bin/gunicorn	Gunicorn HTTP server
postgres	/usr/pgsql-9.5/bin/postgres -D /var/lib/pgsql/9.5/data	PostgreSQL database
cfyuser+	/opt/consul/consul agent -config-dir /etc/consul.d	Consul
cfyuser	/opt/manager/env/bin/python /opt/manager/env/bin/consul_watcher	Consul-watcher

Cloudify SysV Init Services

Service	Description
cloudify-amqpinflux	Cloudify-specific RabbitMQ-to-InfluxDB transport service
cloudify-influxdb	InfluxDB service
cloudify-mgmtworker	Cloudify Manager management worker Celery service
cloudify-rabbitmq	RabbitMQ service
cloudify-restservice	Cloudify REST service
cloudify-riemann	Cloudify policy manager
cloudify-stage	Cloudify UI service
cloudify-check-runner.service	check runner
cloudify-consul-watcher.service	consul members watcher
cloudify-consul.service	consul
cloudify-handler-runner.service	handler runner
cloudify-postgresql.service	PostgreSQL 9.5 database server
cloudify-syncthing.service	Files syncthing

Cloudify Service Configuration Defaults

All Cloudify-specific service configurations can be found in /etc/sysconfig. This area is where default configuration data can be found as well as logging locations for service-specific troubleshooting. These are very useful when trying to understand how a service was instantiated and what logging configuration is being used.

This directory can also be used to derived each core service’s SysV init name. For instance, enumerating /etc/sysconfig will show a file called cloudify-amqpinflux. This is the name of the service, and thus to query the service status can be done using the command service cloudify-amqpinflux status.

Discovering Cloudify Services and Service

Statuses

The sections above describe how to identify a Cloudify service by looking directly at the output
of something like ps or by folder snooping. This is not always practical or desired and there are
other, more developer-friendly, ways of enumerating which Cloudify services are present and
how to harvest information about them.

The best starting point is to utilize the REST API of the manager to get service information.
Simply craft a GET request for the status endpoint: GET /status HTTP/1.1
If cURL and Python are available, it’s very easy to make the request as well as visualize the
returned data.

Code Block 1 REST

curl -X GET http://<manager-ip>/status | python -m json.tool

An example, partial, return is as follows:

Code Block 2 JSON

{
"services": [{
"display_name": "InfluxDB",
"instances": [{
"ActiveState": "active",
"Description": "InfluxDB Service",
"Id": "cloudify-influxdb.service",
"LoadState": "loaded",
"MainPID": 13129,
"SubState": "running",
"state": "running"
}
]
}, {
"display_name": "Celery Management"
"instances": [{
"ActiveState": "active",
"Description": "Cloudify Management Worker Service",
"Id": "cloudify-mgmtworker.service",
"LoadState": "loaded",
"MainPID": 18161,
"SubState": "running",
"state": "running"
}
]
}, {
"display_name": "RabbitMQ",
"instances": [{
"ActiveState": "active",
"Description": "RabbitMQ Service",
"Id": "cloudify-rabbitmq.service",
"LoadState": "loaded",
"MainPID": 12322,
"SubState": "running",
"state": "running"
}
]
},
]
}

With this information, in standard JSON format, it is easy to match a core Cloudify service with a
system-level process ID (MainPID) to begin further troubleshooting.

Cluster status

Cloudify provides API to get information about current cluster state. Using first it is easy to determine if the manager is in a cluster state.

Code Block 3 REST

curl -u user:password "http://<manager-ip>/api/v3.1/cluster"

Code Block 4 JSON

{
"initialized": true,
"consul": {
"leader": "172.20.0.3:8300"
},
"error": null,
"logs": [
{
"message": "HA Cluster configuration complete",
"timestamp": 1485778546965628,
"cursor": "opaque cursor value"
}
]
}

While the second call response provides information about each node in the cluster and indicates which node is the current master.

Code Block 5 REST

curl --header -u user:password "http://<manager-ip>/api/v3.1/cluster/nodes"

Code Block 6 JSON

{
"items":
[
{
"initialized": true,
"online": true,
"master": true,
"host_ip": "172.20.0.2",
"name": "cloudify_manager_LMJZA2",
"credentials": "<REDACTED>"
}
]
}

Checking Manager Components

RabbitMQ

System Service

To check if the RabbitMQ broker is running (and to see many other details such as which applications are running, memory allocation, and other performance metrics), simply run the following command:

Code Block 7 bash

sudo rabbitmqctl -n cloudify-manager@localhost status

An error message will be presented if the service has an issue such as a failed broker.

Management Operations

To get started working with the RabbitMQ management interface, the management interface must be enabled via a plugin. Execute the following to enable the management plugin:

sudo rabbitmq-plugins -n cloudify-manager@localhost enable rabbitmq_management

Once this is complete, there will be a management web interface located at http://:15672/

In order to utilize the web interface, you will need to have the RabbitMQ username and password for authentication. This can be found in the /etc/cloudify/config.yaml file used for instantiation a Cloudify manager.

By default, the user created from the manager instantiation process does not have sufficient permissions to be used with the web interface. Use the following command to promote the default user with the “monitoring” permission (or you can alternatively assign the “administrator” tag).

sudo rabbitmqctl set_user_tags <username> monitoring

You can now use the RabbitMQ username and password to log in via the web interface to do actions such as view queues, get messages, monitor performance, and monitor connections.

Celery

**System Service ** The best way to tell if Celery is alive and healthy is to perform a “ping”. Before working with Celery, it is necessary to know the RabbitMQ username and password for the service. Please refer to the RabbitMQ section in this document titled “Management Operations” to find your username and password. Here is how to query Celery for liveness:

Go to the management worker directory: cd /opt/mgmtworker/
Load the Python virtual environment: source env/bin/activate
“Ping” the Celery workers: celery inspect --broker="amqp://<RabbitMQ username>:<RabbitMQ password>@localhost//" ping

A successful response will be similar to this:

_-> celery@cloudify.management: OK_ \
_pong_

InfluxDB

System Service InfluxDB exposes a RESTful API that can be used for status checking, reading/writing data, and executing SQL-like queries. To check if the service is running and is healthy, we can check the “/ping” endpoint using the default InfluxDB credentials of “root”:“root”.

Code Block 8 REST

curl 'http://localhost:8086/ping?u=root&p=root'

A successful response would be similar to this:

Code Block 9 JSON

{"status":"ok"}

Additional information can be found in the Influxdata documentation. There’s also a web interface that’s available for use at http://:8083/ from any system that has access to port 8083 and 8086 on the manager. The “Hostname and Port Settings” area must have the hostname set to the externally visible manager IP and the port set to 8086.

PostgreSQL

System Service

To verify if postgres is working correctly a simple select can be executed:

Code Block 10 bash

sudo -u postgres psql --port 15432 -c "select 1"

Consul

Consul status can be checked in the following way:

Code Block 11 REST

sudo curl --cacert /etc/cloudify/ssl/cloudify_internal_ca_cert.pem --cert /etc/cloudify/cluster-ssl/consul_client.crt --key /etc/cloudify/cluster-ssl/consul_client.key https://localhost:8500/v1/status/leader

Syncthing

Checking if syncthing is working correctly will need a curl command to the REST API.

Code Block 12 REST

curl -H "X-Api-Key: <key>" 127.0.0.1:8384/rest/system/status

The key can be gathered from: //configuration/gui/apikey in /opt/syncthing/.config/syncthing/config.xml

Logging

Overview

Log locations vary from service to service, but the majority of logs can be found in /var/log and /var/log/cloudify.

Within these folders are folders for each service with distinguishable names such as “rabbitmq” and “postgres”. If logs for a service aren’t found here, the next place to look would be in the service configuration defaults file for any indication of a log file path (see the section “Cloudify Service Configuration Defaults”).

Cloudify Agent Worker Logs

Cloudify agent worker logs can be found on deployed instances / virtual machines with an installed Cloudify agent. Typically, the logs are stored in the Cloudify agent user’s home directory in a folder named after the node instance ID for the instance / VM.

The Celery service SysV Init file is /etc/init.d/celeryd-.
The Celery service config file is /etc/default/celeryd-.
Cloudify agent worker log. ~//work/.log
This is the agent counterpart to the Cloudify Management Worker logs. ~//work/-.log
Worker-specific log.
- Each Celery worker gets its own numbered log file. ~//work/%I.log
Celery daemon / service logs

Cloudify Management Worker Logs

/var/log/cloudify/mgmtworker/cloudify.management_worker.log
- Cloudify management worker log.
- Useful for troubleshooting management worker issues such as Cloudify agent deployment, blueprint deployment creation, and heartbeat errors.
- Contains information about deployment executions from the perspective of the management worker.
- Shows worker tracebacks.
- Task execution logs are followed by noting the task dispatch ID (a UUID). Task IDs can also be found in execution logs and used to search this worker log for further details. Specific task logs will have prefixes of “Received task”, “Task accepted”, and “Task [succeeded | failed]“. Here’s an example:
Code Block 13 LOG
```
Received task: cloudify.dispatch.dispatch[b164cf2c-d601-4484-bbce-927e1106de27]
Task accepted: cloudify.dispatch.dispatch[b164cf2c-d601-4484-bbce-927e1106de27] pid:5683
Task cloudify.dispatch.dispatch[b164cf2c-d601-4484-bbce-927e1106de27] succeeded in 1.015225859s
```
/var/log/cloudify/mgmtworker/logs/.log
- Cloudify deployment worker log.
- Useful for troubleshooting deployment executions of all types. Low-level logging of worker tasks and is generally used as an additional source of information if the execution logs themselves aren’t sufficient.
- Shows worker tracebacks.

Cloudify REST API Service Logs

/var/log/cloudify/rest/cloudify-rest-service.log
- Serves as a central log file for all incoming and outgoing REST API requests and responses. Log entries are in a well-defined, human-readable format.
- Provides a host of useful information such as request details (HTTP method, headers, query string details, JSON data, endpoint path, etc…) and response details (HTTP status, headers).
- Can be monitored, on-demand, for bad HTTP response codes, blueprint file names, endpoint security checks, etc
/var/log/cloudify/rest/gunicorn-access.log
- Verbose access logs directly from the HTTP server itself.
- Well-structured, dense logging format.
- Useful for monitoring REST API interaction, user fingerprinting, and this log file includes maintenance endpoint calls and other “internal” endpoints that Cloudify uses.
/var/log/cloudify/rest/gunicorn.log
- Gunicorn HTTP server system service log.
- Useful for troubleshooting SysV init service failures as well as enumerating the
- HTTP server worker process IDs and HTTP server listening endpoint.

PostgreSQL Logs

/var/log/cloudify/postgresql
- PostgreSQL system service log.
- Useful for gathering information about the PostgreSQL service such as version, process ID, build, and cluster information.
- Useful for monitoring cluster state and indexing tasks.
- Useful for PostreSQL service troubleshooting.

### RabbitMQ Logs

/var/log/cloudify/rabbitmq/.log
- RabbitMQ system service log.
- Useful for gathering information about the RabbitMQ service such as node name, config file locations, database directory, and running reporting info.
- Useful for RabbitMQ service troubleshooting.

### Cluster Logs

/var/log/cloudify/cloudify-cluster.log
- Cluster services log.
- All cluster services log to this file and journald.
- Useful for gathering information about Cluster operations.