Using Flight Recorder with Payara Server 5

By Rudy De Busscher

There are a lot of monitoring and alert mechanisms available within Payara Server. For example, it is possible to report user requests or database calls that take too long, or to report when high CPU or high memory usage occurs. But it is sometimes not easy to identify 'why' a request takes such a long time. 

Java Flight Recorder works like the flight recorder found in airplanes: it keeps track of many values while Payara Server is up and running and processing requests. When there is an incident, you immediately have all the information at hand to diagnose the problem.

In this blog, we describe some use cases for the Java Flight Recorder and how it can be used to profile your application.

History of Java Flight Recorder

Java Flight Recorder originated in the JRockit VM, a proprietary JVM implementation that Oracle obtained through its acquisition of BEA Systems. Oracle no longer maintains JRockit, but it ported the idea of the airplane flight recorder to the commercial Oracle JVM. The initial goal was to have a means of collecting data about areas within the JVM that could be improved.

The technology was open-sourced in 2018 and integrated into OpenJDK 11. So every JVM built on the OpenJDK 11 codebase has the Flight Recorder included. But if you are using the Zulu JDK (included with Payara Enterprise), you can also use the Flight Recorder and Mission Control program from the 1.8u202 based version onwards (Zulu 8.35).

Due to this long history, Java Flight Recorder is deemed a reliable technology that has proven itself and experienced further optimization over the years. This means you can have it activated all the time and experience minimal impact on performance. It's estimated that the flight recorder has about a 1% performance impact.

Flight Recorder Basics

The Flight Recorder works based on events. These can be produced by the JVM itself or can be created by the application code. Further on we see an example of how you can code your custom event.

There are three types of events:

  • A duration event takes some time to occur, and is logged when it completes. You can set a threshold for duration events, so that only events lasting longer than the specified period are recorded.
  • An instant event occurs instantly, and is logged right away.
  • A sample event (also called requestable event) is logged at a regular interval. You can configure how often sampling occurs.
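The three event types can be sketched with the `jdk.jfr` API as follows. This is a minimal illustration, not code from Payara Server: the class name, the event names, and the `QUEUE_DEPTH` counter are all hypothetical, and the 20 ms threshold and 1 s period are arbitrary values chosen for the example.

```java
import java.util.concurrent.atomic.AtomicInteger;
import jdk.jfr.Event;
import jdk.jfr.FlightRecorder;
import jdk.jfr.Label;
import jdk.jfr.Name;
import jdk.jfr.Period;
import jdk.jfr.Threshold;

// Hypothetical names throughout; none of these events exist in Payara Server.
class EventTypesSketch {

    // Duration event: dropped at commit time unless it lasted longer than 20 ms.
    @Name("sketch.SlowTask")
    @Label("Slow Task")
    @Threshold("20 ms")
    static class SlowTaskEvent extends Event {
        @Label("Task")
        String task;
    }

    // Instant event: committed without begin()/end(), so it carries no duration.
    @Name("sketch.CacheMiss")
    @Label("Cache Miss")
    static class CacheMissEvent extends Event {
        @Label("Key")
        String key;
    }

    // Sample event: JFR invokes a registered hook once per period.
    @Name("sketch.QueueDepth")
    @Label("Queue Depth")
    @Period("1 s")
    static class QueueDepthEvent extends Event {
        @Label("Depth")
        int depth;
    }

    // Stand-in for some application value worth sampling (assumption).
    static final AtomicInteger QUEUE_DEPTH = new AtomicInteger();

    static void runTask(String name, Runnable work) {
        SlowTaskEvent event = new SlowTaskEvent();
        event.begin();
        try {
            work.run();
        } finally {
            event.task = name;
            event.commit(); // ends the event and applies the threshold
        }
    }

    static void cacheMiss(String key) {
        CacheMissEvent event = new CacheMissEvent();
        event.key = key;
        event.commit();
    }

    static void registerQueueDepthSampler() {
        FlightRecorder.addPeriodicEvent(QueueDepthEvent.class, () -> {
            QueueDepthEvent event = new QueueDepthEvent();
            event.depth = QUEUE_DEPTH.get();
            event.commit();
        });
    }

    public static void main(String[] args) {
        registerQueueDepthSampler();
        runTask("warm-up", () -> QUEUE_DEPTH.incrementAndGet());
        cacheMiss("user:42");
    }
}
```

Nothing is written anywhere unless a recording is active, so running this sketch is only useful on a JVM that has a recording running.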

These events are generated all the time, but they are not recorded until you instruct the JVM to do so. Recorded events are buffered at several levels to make sure that the overhead for the JVM is minimal and the impact on your application is negligible.

First, they are stored at the thread level, for each thread that generates them. In the next step, they are assembled at a global level. This happens when the storage at the thread level is full or at specific intervals. The contents of this Global Buffer can be written to disk. Writing to disk is the most expensive operation, so it should be avoided where possible.

You can also choose to keep all information in the Global Buffer. It acts as a circular buffer: it keeps the most recent information and never grows larger than the specified size. When you want to consult the event information, you can retrieve it by dumping it to a file, for example.

This is the recommended approach: gather all Flight Recorder events in the Global Buffer and, when an incident happens, dump the contents for analysis.
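The same "record continuously, dump on incident" pattern can also be driven from inside the JVM with the `jdk.jfr.Recording` API. The sketch below is one possible shape, under stated assumptions: the class and method names, the recording name `continuous`, and the temp-file location are hypothetical, and what counts as an "incident" is left to the caller.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.text.ParseException;
import jdk.jfr.Configuration;
import jdk.jfr.Recording;

class IncidentDumpSketch {

    // Starts a recording with the JDK's built-in "default" settings and a size cap.
    static Recording startBoundedRecording(long maxBytes) {
        try {
            Recording recording = new Recording(Configuration.getConfiguration("default"));
            recording.setName("continuous"); // hypothetical name
            recording.setMaxSize(maxBytes);  // oldest data is discarded beyond this size
            recording.start();
            return recording;
        } catch (IOException | ParseException e) {
            throw new IllegalStateException("Could not load the default JFR settings", e);
        }
    }

    // Call this from whatever detects the incident (hypothetical trigger).
    static Path dumpOnIncident(Recording recording) {
        try {
            Path target = Files.createTempFile("incident-", ".jfr");
            recording.dump(target); // the recording keeps running afterwards
            return target;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        try (Recording recording = startBoundedRecording(64L * 1024 * 1024)) {
            System.out.println("Dumped to " + dumpOnIncident(recording));
        }
    }
}
```

The resulting file can be opened in Java Mission Control like any other recording.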

The stored data is in a self-contained binary format: all the information needed to interpret the events is contained in the file itself. This allows any software, like the Java Mission Control tool, to load the Flight Recorder file and present it.
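Because the file describes itself, it can also be read without Mission Control. As a rough sketch, the `jdk.jfr.consumer.RecordingFile` class from the JDK can iterate over the events in a recording; the class name `RecordingSummarySketch` and the choice to count events per type are just illustrative.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Map;
import java.util.TreeMap;
import jdk.jfr.consumer.RecordedEvent;
import jdk.jfr.consumer.RecordingFile;

class RecordingSummarySketch {

    // Counts how many events of each type a recording file contains.
    static Map<String, Long> countByEventType(Path file) {
        Map<String, Long> counts = new TreeMap<>();
        try (RecordingFile recordingFile = new RecordingFile(file)) {
            while (recordingFile.hasMoreEvents()) {
                RecordedEvent event = recordingFile.readEvent();
                // The event type name comes from metadata stored in the file itself.
                counts.merge(event.getEventType().getName(), 1L, Long::sum);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return counts;
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("Usage: RecordingSummarySketch <file.jfr>");
            return;
        }
        countByEventType(Paths.get(args[0]))
                .forEach((type, count) -> System.out.println(type + ": " + count));
    }
}
```

Custom events show up in this summary under the name given in their `@Name` annotation, next to the JVM's own events.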

How to Assemble Recordings

The events are recorded when the JVM is instructed to do so. Let us explore how you can assemble them:

A recording with the default options can be started by adding the following JVM option to Payara Server.

-XX:StartFlightRecording

Once the JVM has launched, you can check the options of all active recordings (there can be more than one) with the following command, shown here for Payara Server:

jcmd ASMain JFR.check

21205:
Recording 1: name=1 maxsize=250,0MB (running)

You can see that the default size of the circular buffer is 250 MB. You can change that by adding options to the JVM flag. The following example enlarges the buffer to 1 GB:

-XX:StartFlightRecording=maxsize=1024m,dumponexit=true,filename=dump.jfr

It also instructs the JVM to write the recording to disk when it exits (dumponexit=true) and sets the file name to use in that case. After an unexpected shutdown, this file can give you valuable information on the cause.

If we want to access the event information while the system is running, we can execute the following command to dump the Global Buffer to disk:

jcmd ASMain JFR.dump name=1 filename=payara.jfr

Event collection does not have to run all the time; it can be started or stopped at any moment. But do remember that Java Flight Recorder is designed to be active all the time. If it is stopped, you might not have the information you need to investigate a problem that just occurred.

jcmd ASMain JFR.start
jcmd ASMain JFR.stop name=1

The value of the name option in the stop command needs to match an active recording. If you are not sure what the current name of the active recording is, you can list it through:

jcmd ASMain JFR.check
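If running jcmd against the server is not convenient, the recordings can also be listed from code running inside the same JVM. The sketch below uses `FlightRecorder.getFlightRecorder().getRecordings()` from the JDK; the class name and the output format (loosely modelled on the JFR.check output) are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;
import jdk.jfr.FlightRecorder;
import jdk.jfr.Recording;

class RecordingListSketch {

    // Builds one descriptive line per recording known to this JVM.
    static List<String> describeRecordings() {
        List<String> lines = new ArrayList<>();
        for (Recording recording : FlightRecorder.getFlightRecorder().getRecordings()) {
            lines.add("Recording " + recording.getId()
                    + ": name=" + recording.getName()
                    + " maxsize=" + recording.getMaxSize() + " bytes"
                    + " (" + recording.getState() + ")");
        }
        return lines;
    }

    public static void main(String[] args) {
        describeRecordings().forEach(System.out::println);
    }
}
```

The name printed here is the value to pass to the name option of JFR.stop or JFR.dump.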

Custom Events

By default, many events related to the JVM are available within the Flight Recorder. With the creation of custom events, you can make it useful for your application.

If we create a duration event for each JAX-RS request that the server processes, we can investigate what causes a slow response.

Custom events can be created by extending the abstract class jdk.jfr.Event.

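import jdk.jfr.Category;
import jdk.jfr.Description;
import jdk.jfr.Event;
import jdk.jfr.Label;
import jdk.jfr.Name;
import jdk.jfr.StackTrace;

@Name(JaxRsInvocationEvent.NAME)
@Label("JAX-RS Invocation")
@Category("JAX-RS")
@Description("Invocation of a JAX-RS resource method")
@StackTrace(false) // no stack trace is captured for this event
class JaxRsInvocationEvent extends Event {

    @Label("Resource Method")
    public String method;

    @Label("Media Type")
    public String mediaType;

    @Label("Java Method")
    public String javaMethod;

    ...

}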

The class and fields should have a set of annotations that describe the event. That way, the recording file is self-describing and you see the proper labels for all fields, for example when a recording is analysed in the Mission Control tool.

Creating our JAX-RS event is also quite straightforward using a JAX-RS filter:

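import java.io.IOException;
import javax.ws.rs.container.ContainerRequestContext;
import javax.ws.rs.container.ContainerRequestFilter;
import javax.ws.rs.container.ContainerResponseContext;
import javax.ws.rs.container.ContainerResponseFilter;
import javax.ws.rs.ext.Provider;

@Provider
public class JFRFilter implements ContainerResponseFilter, ContainerRequestFilter {

    // Request side: start timing and hand the event over to the response filter.
    @Override
    public void filter(ContainerRequestContext requestContext) throws IOException {
        JaxRsInvocationEvent event = new JaxRsInvocationEvent();
        event.begin();

        // Skip the bookkeeping when the event is not being recorded.
        if (!event.isEnabled()) {
            return;
        }
        requestContext.setProperty(JaxRsInvocationEvent.NAME, event);
    }

    // Response side: stop timing, fill in the fields and commit the event.
    @Override
    public void filter(ContainerRequestContext requestContext, ContainerResponseContext responseContext) throws IOException {
        JaxRsInvocationEvent event = (JaxRsInvocationEvent) requestContext.getProperty(JaxRsInvocationEvent.NAME);

        if (event == null || !event.isEnabled()) {
            return;
        }

        event.end();

        event.method = requestContext.getMethod();
        // Other event values

        event.commit();
    }

}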

The recording will now also contain our custom event. If we create similar events for calls to external endpoints and for database access, for example, we have all that information available when we analyse why certain requests take more time, together with information about the JVM such as CPU usage and garbage collection. Having all that information available in the same format and tool makes it easier to make connections and uncover the real problem.
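An event for external calls or database access could follow the same begin, end, commit pattern as the JAX-RS filter. The sketch below is one way to wrap such a call; the event name, its fields, and the `traced` helper are hypothetical and not part of Payara Server or the original example.

```java
import java.util.function.Supplier;
import jdk.jfr.Category;
import jdk.jfr.Event;
import jdk.jfr.Label;
import jdk.jfr.Name;

class OutboundCallSketch {

    // Hypothetical duration event for a call that leaves the application.
    @Name("sketch.OutboundCall")
    @Label("Outbound Call")
    @Category("Application")
    static class OutboundCallEvent extends Event {
        @Label("Target")
        String target;

        @Label("Succeeded")
        boolean succeeded;
    }

    // Runs the call and records how long it took, whether or not it throws.
    static <T> T traced(String target, Supplier<T> call) {
        OutboundCallEvent event = new OutboundCallEvent();
        event.begin();
        try {
            T result = call.get();
            event.succeeded = true;
            return result;
        } finally {
            event.end();
            // Only fill in the fields when the event will actually be written.
            if (event.shouldCommit()) {
                event.target = target;
                event.commit();
            }
        }
    }

    public static void main(String[] args) {
        String row = traced("db:orders", () -> "order-1");
        System.out.println(row);
    }
}
```

Wrapping a JDBC query or an HTTP client call in `traced` would place its duration on the same timeline as the JAX-RS invocation event.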

Conclusion

The Java Flight Recorder is designed to run continuously to gather the values of critical parts of the JVM. Due to its long history, the implementation is optimized so that gathering the events adds very little overhead.

These events, in combination with your custom events, can be used to analyse the circumstances in which an issue occurred on your production server. They can give you valuable information on JVM internals to find out why some of your user requests take more time or why the server crashed with an OutOfMemoryError.
