Real-Time Application Metrics Now Available

Glorious day, we are bringing real-time application metrics to all our users.

Some of you were already beta testers of our new Metrics solution, so a big thank you is in order. Thanks to you and our amazing team, we are officially rolling out Metrics for all our users today.

Current State

Right now, if you want to get an idea of the metrics of your application, you have limited choices. You can use a third-party tool and manage it yourself; we have several users happily doing that with New Relic. Or you can SSH into your application instance and use local tools like htop. It's not a satisfying situation and we knew we needed to fix it.

Our internal monitoring system (you know, that thing that allows your application to restart automatically, among other things) was also slowly reaching its limits. So we had to find a new solution.

Enter Clever Cloud Metrics

An original name for an original use case.

Clever Cloud Metrics is the name of our new monitoring solution. It is based on Warp10, a geospatial time series database that is itself built on Hadoop and Kafka. This is now what we use on a global scale to monitor all our infrastructure and your applications, and to store all the hardware metrics we can get from your VMs. If you are wondering why we made this choice, it mostly comes down to two things.

We had scale issues with most known solutions. Without calling them out, all the new ones try to reimplement their own clustering logic. Warp10 clustering is managed by ZooKeeper (remember it's Hadoop and Kafka), a proven solution that has been in production for years. Granted, it's not easy to manage. Good news is it's our job and that's why you chose us.

The other reason that appealed to us was the time series specialization. It's perfect for monitoring. When you use a regular datastore, with its regular query capabilities, it can be hard to do proper time series analysis. To put it simply, SQL does not cut it. WarpScript does:

WarpScript is an extensible stack oriented programming language which offers more than 600 functions and several high level frameworks to ease and speed your data analysis. Simply create scripts containing your data analysis code and submit them to the platform, they will execute close to where the data resides and you will get the result of that analysis as a JSON object that you can integrate into your application.

This happens to be exactly what we need. And they made it super easy to use thanks to Quantum, their query interface based on Polymer (which means it's also dead easy to integrate into existing web apps). You will be able to use something similar from the Clever Cloud console to query the metrics of your applications. If you are interested, know that the console's dashboard is made with MetricsGraphics.js.

How it Works

From now on you have a new tab in the console called Metrics. It shows you real-time graphs of your scalers' hardware resources.

If you don't see anything, it's probably because you have not enabled Metrics for your application. To do so, add the environment variable ENABLE_METRICS=true to your application.

So what exactly do you have access to?

Hardware Metrics

Our default configuration gives you access to a variety of hardware metrics like CPU, RAM, disk IO, network IO. This is of course available for all the scalers of your application.

Additional Metrics

Depending on the language and runtime you are using, you also get access to specific metrics. For instance, if you are running on top of the JVM and have activated JMX (the classic protocol for everything metrics in the JVM), you will see all of the metrics exposed by JMX. This also works for Haskell's EKG. If your favorite is missing, please let us know so we can add support for it.

For example, a PHP application gives you by default the number of idle and active workers as well as the number of requests per second.

All these metrics are collected with StatsD, which means you can also send your own metrics. This is particularly useful for measuring business events. A simple example is a counter incremented each time a specific API is called.

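// Assumes the league/statsd package is installed and Composer's autoloader is loaded.
$statsd = new League\StatsD\Client();
// Point the client at the local StatsD agent.
$statsd->configure(array(
    'host' => '127.0.0.1',
    'port' => 8125,
));
// Increment a counter each time the API is called.
$statsd->increment('myApi.pageview');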
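
If your runtime has no StatsD client at hand, the same counter can be sent by hand, since a StatsD metric is a single text line of the form name:value|type sent over UDP. The following is a minimal Python sketch using only the standard library; the helper names format_metric and send_metric are made up for this illustration and are not part of any Clever Cloud or StatsD library.

```python
import socket

STATSD_HOST = "127.0.0.1"  # local StatsD host, as given in this post
STATSD_PORT = 8125         # local StatsD port, as given in this post


def format_metric(name: str, value: int, metric_type: str = "c") -> str:
    """Build one StatsD line; 'c' is a counter, 'g' a gauge, 'ms' a timer."""
    return f"{name}:{value}|{metric_type}"


def send_metric(name: str, value: int = 1, metric_type: str = "c",
                host: str = STATSD_HOST, port: int = STATSD_PORT) -> None:
    """Send one metric over UDP. Fire-and-forget: no reply is expected."""
    payload = format_metric(name, value, metric_type).encode("ascii")
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, port))
```

Calling send_metric("myApi.pageview") does the same job as the PHP increment above. Because UDP is connectionless, a missing or slow collector should not block the request that records the metric.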

To send StatsD data, you need a StatsD client configured with '127.0.0.1' as host and '8125' as port. Here I am using one from the PHP League that integrates nicely with Lumen. Every metric you record yourself is accessible with the statsd prefix. To see it, simply click on Advanced in the top-right corner of the screen and you will be presented with a dropdown.

Advanced Usage

Sometimes you want to go the extra step. If you select Custom View in the dropdown, you will be redirected to a dedicated screen.

Click on the Quantum permalink and you will be redirected to our Quantum instance, ready to execute custom queries. From there you can do pretty much whatever you want. Before trying anything, if you are not familiar with stack-based languages and WarpScript, please visit their documentation.

What's to Come

We plan to give you more metrics, of course, whether they come from our reverse proxy, to give you insights into all the requests that go through it, or from other runtime frameworks like JMX and EKG. For this we welcome suggestions. If there is anything you think should be part of a default monitoring dashboard for a given language or framework, please let us know.

We plan to support the '/_metrics' endpoint used by Prometheus. The idea is for our monitoring to regularly poll this endpoint, which should return JSON containing metrics.
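
To make the polling idea concrete, here is a hedged Python sketch of an application exposing such an endpoint. Nothing here is a published contract: the payload shape (a flat JSON object of counter names to values), the helper names, and the default port are all assumptions made for illustration, and the final format may well differ.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-process counters; the flat name -> value shape is an assumption.
COUNTERS: dict[str, int] = {}
_LOCK = threading.Lock()


def increment(name: str, by: int = 1) -> None:
    """Record a business event in memory."""
    with _LOCK:
        COUNTERS[name] = COUNTERS.get(name, 0) + by


def render_metrics() -> bytes:
    """Serialize the current counters as a JSON document."""
    with _LOCK:
        return json.dumps(COUNTERS, sort_keys=True).encode("utf-8")


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        # Only the polled path is served; everything else is a 404.
        if self.path != "/_metrics":
            self.send_error(404)
            return
        body = render_metrics()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(host: str = "127.0.0.1", port: int = 8080) -> None:
    """Blocking helper; the host and port defaults are placeholders."""
    HTTPServer((host, port), MetricsHandler).serve_forever()
```

In a pull model like this, the application only keeps counters in memory and the monitoring side decides how often to read them, which is the opposite of the push model used with StatsD.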

While you can already access the metrics API, in the future you will be able to directly write your own metrics as well.

Please keep in mind that nothing on the roadmap is set in stone and things might change 🙂

We really hope you like this new feature and would love to get your feedback. Please tell us what you think 🙂

Some resources to get you started

We plan to add more of these, but here are sample projects that write to a local StatsD instance. There are many others available on GitHub:
