In a distributed microservices architecture, it is important to have an overview of your systems in terms of CPU usage, memory consumption, and other important metrics.
This is called observability: measuring the internal state of a system, in this case the microservice instances.
These metrics are gathered centrally so that you can instantly get an overview of all the values, and they allow the health of the instances to be determined, mostly automated based on some rules.
MicroProfile Metrics allows you to expose custom-defined metrics, but it also exposes system metrics related to CPU and memory, for example.
The goal of MicroProfile Metrics is to expose monitoring data from the implementation in a unified way. It also defines a Java API so that developers can define and supply their own values.
Exposing a value is very easy: the only thing that needs to be done is to annotate a field or method (returning a primitive value) with @Gauge.
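As a sketch, a CDI bean could expose the current number of items in an internal queue this way (the bean, field, and metric names here are made up for illustration):

```java
import javax.enterprise.context.ApplicationScoped;

import org.eclipse.microprofile.metrics.MetricUnits;
import org.eclipse.microprofile.metrics.annotation.Gauge;

@ApplicationScoped
public class QueueStatus {

    private int pendingJobs; // hypothetical application state

    // The runtime reads this method whenever the metric is requested;
    // it appears under the application scope in the metrics output.
    @Gauge(name = "pendingJobs", unit = MetricUnits.NONE,
           description = "Number of jobs waiting to be processed")
    public int getPendingJobs() {
        return pendingJobs;
    }
}
```

Note that @Gauge requires the unit attribute; use MetricUnits.NONE for plain counts.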
MicroProfile Metrics also has other annotations to add timing information to JAX-RS endpoints, for example:
- @Metered: A meter measures the rate at which the endpoint is called.
- @Timed: It is a timer that tracks the duration of the request.
- @Counted: A counter is a simple incrementing value.
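A sketch of how these annotations could look on a JAX-RS endpoint (the resource class and metric names are illustrative; monotonic = true applies to the Metrics 1.x API used by these MicroProfile versions):

```java
import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.Produces;
import javax.ws.rs.core.MediaType;

import org.eclipse.microprofile.metrics.annotation.Counted;
import org.eclipse.microprofile.metrics.annotation.Timed;

@Path("/hello")
public class HelloResource {

    @GET
    @Produces(MediaType.TEXT_PLAIN)
    // Tracks the duration (and rate) of calls to this endpoint
    @Timed(name = "helloTimer", description = "Duration of the hello endpoint")
    // Simple incrementing counter of how often the endpoint was called
    @Counted(name = "helloCount", monotonic = true,
             description = "Number of calls to the hello endpoint")
    public String sayHello() {
        return "Hello from MicroProfile Metrics";
    }
}
```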
By default, it also exposes some system values related to
- CPU usage
- Heap memory
- Garbage collection
- Java Threads
- etc …
MicroProfile Fault Tolerance
MicroProfile Fault Tolerance has a lot of nice features to improve the fault tolerance and resilience of applications. Concepts included are:
- TimeOut: Define a duration for the timeout
- RetryPolicy: Define criteria on when to retry
- Fallback: Provide an alternative solution for a failed execution.
- Bulkhead: Isolate failures in one part of the system so that the rest of the system can still function.
- CircuitBreaker: Offer a way to fail fast by automatically failing execution, preventing system overload and indefinite waits or timeouts by the clients.
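Several of these annotations can be combined on a single method. The following is a minimal sketch; the service class, the remote call, and the fallback method name are hypothetical:

```java
import javax.enterprise.context.ApplicationScoped;

import org.eclipse.microprofile.faulttolerance.CircuitBreaker;
import org.eclipse.microprofile.faulttolerance.Fallback;
import org.eclipse.microprofile.faulttolerance.Retry;
import org.eclipse.microprofile.faulttolerance.Timeout;

@ApplicationScoped
public class RemoteService {

    @Timeout(500)                       // fail if the call takes longer than 500 ms
    @Retry(maxRetries = 2, delay = 200) // retry up to 2 times, 200 ms apart
    @CircuitBreaker(requestVolumeThreshold = 4, failureRatio = 0.5)
    @Fallback(fallbackMethod = "fallbackAnswer")
    public String callRemote() {
        // hypothetical call to a remote system that may fail
        throw new RuntimeException("remote system unavailable");
    }

    // Invoked when callRemote() ultimately fails; must have the same signature
    private String fallbackAnswer() {
        return "cached answer";
    }
}
```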
Since version 1.1 (contained in MicroProfile 1.4 and 2.0), it also integrates with MicroProfile Metrics.
Interesting statistics, like the number of retries, the number of times the circuit breaker was open, and the number of calls to the fallback method, are gathered and exposed.
Prometheus is the most popular open-source product for gathering metrics. It was started in 2012 at SoundCloud (an online audio distribution and music sharing platform) and graduated from the Cloud Native Computing Foundation in 2018.
You can see it as a database for storing time series, but it has many more features:
- Multi-dimensional data model with time series
- Query Language
- Pull data from metric sources
- Alert Manager
MicroProfile Metrics therefore exposes the values in the Prometheus format by default, so they can easily be consumed by scrapers.
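For illustration, the text on the metrics endpoint looks roughly like this (the metric names follow the MicroProfile Metrics naming for the base scope; the values are made-up examples):

```
# TYPE base:cpu_system_load_average gauge
base:cpu_system_load_average 0.32
# TYPE base:memory_used_heap_bytes gauge
base:memory_used_heap_bytes 52428800
```

Each metric is prefixed with its scope (base, vendor, or application), which is also how you will find them in the Prometheus UI.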
The following steps describe how you can deploy the Payara Server, which contains all the MicroProfile implementations, including those for Metrics and Fault Tolerance, into a Docker environment together with a Prometheus instance to gather the metrics.
As an example application, you can use the demo code from the presentation “Deploy, monitor, and take control of your Micro-Services with MicroProfile”, which explores the same topic. You can find the source in this GitHub repository.
To make it easier to connect the different Docker containers, create a specific network in Docker with the following command:
docker network create demo-net
Create Application Image
For our tests, we are creating a specific image based on the official Payara Docker image (for example, payara/server-full, which defines the $DEPLOY_DIR deployment directory), to which we add the WAR file with our application:
FROM payara/server-full
COPY ./target/monitoring.war $DEPLOY_DIR
You can execute the following commands in a terminal from the root of the Maven project (which also contains the Dockerfile):
mvn clean package
And then create the image with:
docker build -t demo/service .
Startup the Image with the Application
Now that we have the Docker image, let's start a container from it:
docker run -d -p 8080:8080 --name service --net demo-net demo/service
The name service is important here, as it is used in the Prometheus configuration file: we have defined that Prometheus looks up the application through DNS using the name service.
You can verify that everything is OK by calling the following URL in your browser: http://localhost:8080/monitoring/rest/hello
Create Prometheus Image
We are creating a special Docker image containing the Prometheus server (based on Adam Bien's Dockland) that also contains our Prometheus configuration.
The Dockerfile content:
COPY prometheus.yml .
The configuration file contains a scrape job pointing at our application container (MicroProfile Metrics exposes its data on the /metrics path, which is also the Prometheus default):
scrape_configs:
  - job_name: 'payara5'
    static_configs:
      - targets: ['service:8080']
Run the image build command from the directory containing both files:
docker build -t demo/prometheus .
Startup the Image with Prometheus
Now that we have the Prometheus Docker image, let's start a container from it:
docker run -d -p 9090:9090 --name prometheus --net demo-net demo/prometheus
You can verify if the connection with the application works with the following steps:
1. Open the browser with the URL http://localhost:9090.
2. Select the metric vendor:system_cpu_load in the drop-down.
3. Put some load on your machine with the (purposely) badly written multi-threaded prime-number check at http://localhost:8080/monitoring/rest/prime.
4. You should see the spike in CPU usage after you have pressed the execute button.
Gain Better Insight Into Application Resilience
The metrics exposed by MicroProfile Metrics are in the Prometheus format by default. This makes it very easy to gather all your custom and system values within Prometheus; you just need to point a scraper at the running Payara instance.
MicroProfile Fault Tolerance also integrates with Metrics, so we can gain better insight into the resilience of our applications.