Metrics

Warning:

Clever Cloud Metrics is still in beta.

In addition to logs, you can access metrics to know how your application behaves. By default, system metrics like CPU and RAM usage are available, as well as application-level metrics when available (Apache or Nginx status, for instance).

Publish your own metrics

We currently support two ways to push / collect your metrics: the statsd protocol and Prometheus.

The statsd server listens on port 8125. You can send metrics using the regular statsd protocol or an advanced one as described here.
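
For reference, a counter increment and a gauge value in the plain statsd line format look like this (the metric names are arbitrary examples):

  my_counter:1|c
  my_gauge:123.45|g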

We also support Prometheus metrics collection. By default, our agent collects the metrics exposed on localhost:9100/metrics.

If needed, you can override these settings with the following two environment variables:

  • CC_METRICS_PROMETHEUS_PORT: Define the port on which the Prometheus endpoint is available
  • CC_METRICS_PROMETHEUS_PATH: Define the path on which the Prometheus endpoint is available
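
For instance, if your application exposed its Prometheus metrics on port 9464 under /prometheus (both values are purely illustrative), you would set:

  CC_METRICS_PROMETHEUS_PORT=9464
  CC_METRICS_PROMETHEUS_PATH=/prometheus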

Display metrics

For each application, there is a Metrics tab in the console.

Overview pane

To get a quick overview of the current state of your scalers, the overview pane displays the current CPU, RAM, disk and network activity. On supported platforms, you also have access to requests per second and GC statistics.

Advanced pane

Advanced metrics allow you to access all gathered metrics, over a specified time range.

Custom queries

All metrics are stored in Warp10, so you can explore the data directly with the Quantum interface, using WarpScript. For instance, you can derive metrics over time, do custom aggregations or combine metrics, as in the sketch below.
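
A minimal sketch of such a query, assuming placeholder read token and application ID: it fetches one hour of user CPU usage, then derives it over time with mapper.rate (rate of change computed over the previous and current datapoints):

  // fetch one hour of user CPU usage for one application
  [ '<READ TOKEN>' 'cpu.usage_user' { 'app_id' '<APPLICATION ID>' } NOW 1 h ] FETCH
  // derive the series: rate of change between successive datapoints
  [ SWAP mapper.rate 1 0 0 ] MAP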

Access Logs metrics

All your applications' access logs are pushed to Warp10. You can process them directly in the console, in the Metrics tab of your application.

Access Log data model

Access logs are defined in the 'accessLogs' Warp10 class, and three Warp10 labels are available:

  • owner_id: Organisation ID
  • app_id or addon_id: Application ID or Addon ID
  • adc or sdc
    • adc (Application Delivery Controller) is used for HTTP connections
    • sdc (Service Delivery Controller) is used for TCP connections

The addon_id field is available for the mysql, redis, mongodb and postgresql add-ons.
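
As a quick sketch (the read token and application ID below are placeholders), raw access log entries can be fetched from this class like any other Warp10 data:

  [ '<READ TOKEN>' 'accessLogs' { 'app_id' '<APPLICATION ID>' } NOW 1 h ] FETCH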

Warnings:

  • Add-ons on shared plans (usually DEV plans) do not provide access logs
  • No access logs are recorded in case of direct access to an add-on

To reduce the space used to store access logs, we defined the following key-value models.

Key-Value model for applications

AccessLogs data model for applications, using the HTTP protocol.

t -> timestamp
a -> appId or addonId
o -> ownerId
i -> instanceId
ipS -> ipSource
pS -> portSource # 0 if undefined
s -> source
  lt -> latitude
  lg -> longitude
  ct -> city
  co -> country
ipD -> ipDestination
pD -> portDestination # 0 if undefined
d -> destination
  lt -> latitude
  lg -> longitude
  ct -> city
  co -> country
vb -> verb
path -> path
bIn -> bytesIn
bOut -> bytesOut
h -> hostname
rTime -> responseTime
sTime -> serviceTime
scheme -> scheme
sC -> statusCode
sT -> statusText
tS -> Haproxy termination_state
adc -> Reverse proxy hostname
w -> workerId (Sozu)
r -> requestId (Sozu)
tlsV -> tlsVersion (Sozu)

Key-Value model for add-ons

AccessLogs data model for add-ons, using the TCP protocol.

t -> timestamp
a -> appId or addonId
o -> ownerId
i -> instanceId
ipS -> ipSource
pS -> portSource # 0 if undefined
s -> source
  lt -> latitude
  lg -> longitude
  ct -> city
  co -> country
ipD -> ipDestination
pD -> portDestination # 0 if undefined
d -> destination
  lt -> latitude
  lg -> longitude
  ct -> city
  co -> country
tS -> Haproxy termination_state
sdc -> Reverse proxy hostname
sDuration -> total session duration time in millis

Query examples

The main way to use accessLogs data is to FETCH over it and extract interesting values with JSON processing.

Note:

Look at the fetch_accessLogs_key_v0 macro for a convenient way to explore access log data; see its documentation for details.

A convenient way to integrate the intercepted data in a workflow is to use WarpScript. It is a good idea to use the GTS format so you can apply all the GTS transformations to the output.

In the following example, we extract the accessLogs status codes and create a GTS as output, so that FILTER or any other transformation can be applied to it in a second step.
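
Here is a sketch of one way to do it, reusing the fetch_accessLogs_key_v0 macro mentioned above to extract the sC (status code) field as a GTS; the read token and application ID are placeholders, and the 400 threshold is only an illustration:

  // extract the status codes of the access logs as a GTS
  '<READ TOKEN>' { 'app_id'  '<APPLICATION ID>' } 'sC' NOW 1 h  @clevercloud/fetch_accessLogs_key_v0
  // the output is a list of GTS, so the usual transformations apply,
  // e.g. keep only the series whose latest status code is an error
  [ SWAP [] 400 filter.last.ge ] FILTER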

An example using the provided Clever Cloud macro for straightforward access to the access logs' input bytes:

  '<READ TOKEN>' { 'app_id'  'id' } 'bIn' NOW 1 h  @clevercloud/fetch_accessLogs_key_v0

or to get the latitude of the destination, which is nested data:

  '<READ TOKEN>' { 'app_id'  'id' } 'd.lt' NOW 1 h  @clevercloud/fetch_accessLogs_key_v0

Monitoring metrics

All applications and the VM instances behind them are monitored. Data is sent to Warp10, a Geo Time Series database. All metrics can be processed directly in the console, in the Metrics tab of your application, or through the Clever Cloud Warp10 endpoint.

Monitoring data model

All metrics data follow the same schema in Warp10. Each class represents a specific metric, and the context is provided by the Warp10 labels.

Class values and Labels

Overview

A Telegraf daemon supplies most metrics.

Each metric is recorded as a Warp10 class. Labels provide additional information about the VMs, like the instance ID, the organisation ID, or the reverse proxy used.

Labels

In metrics data, the main labels are:

  • owner_id : unique ID of the organisation
  • app_id : unique ID of the application
  • host : ID of the hypervisor (HV) hosting the VM instance
  • adc : reverse proxy ID for HTTP connections (i.e. applications)
  • sdc : reverse proxy ID for TCP connections (i.e. add-ons)
  • vm_type : volatile or persistent, i.e. a stateless application or a stateful add-on
  • deployment_id : ID of the deployment
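
These labels can be combined in a FETCH selector to scope a query, for instance to every application of an organisation (a sketch; the read token and organisation ID are placeholders):

  [ '<READ TOKEN>' 'cpu.usage_user' { 'owner_id' '<ORGANISATION ID>' } NOW 30 m ] FETCH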

Note:

For some specific metrics, some labels may be missing.

Classes

Telegraf provides a lot of metrics, described in its documentation.

Below is the list of all Warp10 classes representing Telegraf metrics:

conntrack.ip_conntrack_count, conntrack.ip_conntrack_max
cpu.usage_guest, cpu.usage_guest_nice, cpu.usage_idle, cpu.usage_iowait, cpu.usage_irq, cpu.usage_nice, cpu.usage_softirq, cpu.usage_steal, cpu.usage_system, cpu.usage_user
disk.free, disk.inodes_free, disk.inodes_total, disk.inodes_used, disk.total, disk.used, disk.used_percent
http_response.http_response_code, http_response.response_time, http_response.result_code, http_response.result_type
kernel.boot_time, kernel.context_switches, kernel.entropy_avail, kernel.interrupts, kernel.processes_forked
mem.active, mem.available, mem.available_percent, mem.buffered, mem.cached, mem.commit_limit, mem.committed_as, mem.dirty, mem.free, mem.high_free, mem.high_total, mem.huge_page_size, mem.huge_pages_free, mem.huge_pages_total, mem.inactive, mem.low_free, mem.low_total, mem.mapped, mem.page_tables, mem.shared, mem.slab, mem.swap_cached, mem.swap_free, mem.swap_total, mem.total, mem.used, mem.used_percent, mem.vmalloc_chunk, mem.vmalloc_total, mem.vmalloc_used, mem.wired, mem.write_back, mem.write_back_tmp
net.bytes_recv, net.bytes_sent, net.drop_in, net.drop_out, net.err_in, net.err_out, net.packets_recv, net.packets_sent
net_response.response_time, net_response.result_code, net_response.result_type
netstat.tcp_close, netstat.tcp_close_wait, netstat.tcp_closing, netstat.tcp_established, netstat.tcp_fin_wait1, netstat.tcp_fin_wait2, netstat.tcp_last_ack, netstat.tcp_listen, netstat.tcp_none, netstat.tcp_syn_recv, netstat.tcp_syn_sent, netstat.tcp_time_wait, netstat.udp_socket
processes.blocked, processes.dead, processes.idle, processes.paging, processes.running, processes.sleeping, processes.stopped, processes.total, processes.total_threads, processes.unknown, processes.zombies
procstat_lookup.pid_count
system.load1, system.load1_per_cpu

Examples and usage

From the Metrics tab of the console, you can either open a Quantum console (an online WarpScript editor) or send your WarpScript yourself to the Warp10 endpoint (provided by Quantum).

More information about Quantum and Warp10 can be found in our documentation.

For example, you could fetch the memory usage of an application over the last hour, smoothed by averaging the data per minute.

Warning:

Computation can be time intensive.

// Fix the NOW timestamp so it is the same over the whole script
NOW 'NOW' STORE
// Fetch data over 1 hour
[ '<READ TOKEN>' 'mem.available' { 'app_id' '<APPLICATION ID>' } $NOW 1 h ] FETCH
// Average the data in buckets of 1 min, aligned on the last point timestamped at NOW
[ SWAP bucketizer.mean $NOW 1 m 0 ] BUCKETIZE
// Merge instance-level series into one application-level series, timestamp by timestamp
[ SWAP [ 'app_id' ] reducer.mean ] REDUCE

Consumption metric

Consumption can also be inferred from our metrics. We provide some helper macros in the Warp10 documentation.

The consumption unit is the second.

The following script provides the overall consumption between the start and end timestamps for all applications of an organisation.

'<READ TOKEN>' '<ORGANISATION ID>' <START TIMESTAMP> <END TIMESTAMP> @clevercloud/app_consumption

Custom metrics

You can expose custom metrics via statsd. These metrics are gathered and displayed in the Advanced view as well. On some platforms, standard metrics published over statsd are even integrated into the overview pane.

Metrics published over statsd are prefixed with statsd.
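
For instance, a gauge published as my_gauge should then be queryable from Warp10 under the statsd.my_gauge class (a sketch; the metric name, read token and application ID are placeholders):

  [ '<READ TOKEN>' 'statsd.my_gauge' { 'app_id' '<APPLICATION ID>' } NOW 1 h ] FETCH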

statsd socket

To publish custom metrics, configure your client to push to localhost:8125 (this is the default host and port, so default client settings should work as well).

NodeJS example

You can use node-statsd to publish metrics:

// npm install node-statsd

const StatsD = require('node-statsd'),
      client = new StatsD();

// Increment: Increments a stat by a value (default is 1)
client.increment('my_counter');

// Gauge: set a gauge to a specified value
client.gauge('my_gauge', 123.45);

Haskell example

In Haskell, metrics are usually gathered with EKG. The ekg-statsd package allows you to push EKG metrics over statsd.

If you're using warp, you can use wai-middleware-metrics to report request distributions (request count, response count aggregated by status code, response latency distribution).

EKG gives you access to GC metrics; make sure you compile your application with "-with-rtsopts=-T -N" so the runtime collects GC statistics.
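
With Cabal, that could look like the following ghc-options line (a sketch; -threaded is assumed because the -N RTS option requires the threaded runtime):

  ghc-options: -threaded "-with-rtsopts=-N -T"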

{-# LANGUAGE OverloadedStrings #-}

-- you need the following packages
-- ekg-core
-- ekg-statsd
-- scotty
-- wai-middleware-metrics

import           Control.Monad                   (when)
import           Network.Wai.Metrics             (WaiMetrics, metrics,
                                                  registerWaiMetrics)
import           System.Environment              (lookupEnv)
import           System.Metrics                  (newStore, registerGcMetrics)
import           System.Remote.Monitoring.Statsd (defaultStatsdOptions,
                                                  forkStatsd)

handleMetrics :: IO WaiMetrics
handleMetrics = do
  store <- newStore
  registerGcMetrics store
  waiMetrics <- registerWaiMetrics store
  sendMetrics <- maybe False (== "true") <$> lookupEnv "ENABLE_METRICS"
  when sendMetrics $ do
    putStrLn "statsd reporting enabled"
    -- fork the statsd reporter, discarding its handle
    _ <- forkStatsd defaultStatsdOptions store
    return ()
  return waiMetrics

main = do
  waiMetrics <- handleMetrics
  scotty 8080 $ do
     middleware $ metrics waiMetrics
     get "/" $
       html $ "Hello world"