订阅内容

This article was originally published on Ales Nosek - The Software Practitioner.

In the previous entry to this series, we reviewed several techniques that help you to prevent event loop delays. However, even the best programmer makes mistakes. What should you do when your Vert.x application doesn’t perform as expected? How to find out what part of your code is blocking the event loop threads? In the final part of the series, we are going to focus on troubleshooting event loop delays.

The event loop thread model is vastly different from the thread-per-request model employed by standard JEE or Spring frameworks. From my experience I can report that it takes developers some time to wrap their heads around it and that at the beginning they tend to make the mistake of introducing blocking calls into the event loop’s code path. In the following sections, we will discuss several techniques of how to troubleshoot such situations.

Blocked thread checker

Vert.x comes with a built-in mechanism to detect delays on event loop and worker threads by checking the execution time of handlers that you registered with the Vert.x APIs. This mechanism operates in two steps. In the first step, Vert.x saves the timestamp of the moment when a handler starts executing. This start timestamp is saved to a storage attached to the thread that is executing the handler. Whenever the execution of the handler has completed the timestamp is reset. In the second step, Vert.x periodically checks the timestamps using a dedicated thread called vertx-blocked-thread-checker. This thread is spawned by Vert.x during the creation of the Vert.x instance for example when you call Vertx.vertx(). The vertx-blocked-thread-checker thread can be seen in VisualVM:

image

The blocked thread checker serves as a watchdog that periodically checks the Vert.x threads. It iterates over all Vert.x threads and for each thread it subtracts the threads start timestamp from the current time to compute how long the thread has already been executing the handler code. If the execution time exceeds the specified threshold a warning message is dropped into the logs:

 

1
2
3
4
5
6
7
8
9
10
11
12
13
14
Mar 01, 2019 11:53:24 AM io.vertx.core.impl.BlockedThreadChecker
WARNING: Thread Thread[vert.x-eventloop-thread-5,5,main] has been blocked for 39 ms, time limit is 10 ms
Mar 01, 2019 11:53:24 AM io.vertx.core.impl.BlockedThreadChecker
WARNING: Thread Thread[vert.x-eventloop-thread-6,5,main] has been blocked for 26 ms, time limit is 10 ms
Mar 01, 2019 11:53:24 AM io.vertx.core.impl.BlockedThreadChecker
WARNING: Thread Thread[vert.x-eventloop-thread-1,5,main] has been blocked for 31 ms, time limit is 10 ms
Mar 01, 2019 11:53:24 AM io.vertx.core.impl.BlockedThreadChecker
WARNING: Thread Thread[vert.x-eventloop-thread-3,5,main] has been blocked for 42 ms, time limit is 10 ms
Mar 01, 2019 11:53:24 AM io.vertx.core.impl.BlockedThreadChecker
WARNING: Thread Thread[vert.x-eventloop-thread-2,5,main] has been blocked for 20 ms, time limit is 10 ms
Mar 01, 2019 11:53:24 AM io.vertx.core.impl.BlockedThreadChecker
WARNING: Thread Thread[vert.x-eventloop-thread-4,5,main] has been blocked for 21 ms, time limit is 10 ms
Mar 01, 2019 11:53:24 AM io.vertx.core.impl.BlockedThreadChecker
WARNING: Thread Thread[vert.x-eventloop-thread-7,5,main] has been blocked for 19 ms, time limit is 10 ms

 


You can use grep to routinely search through your application logs for this message. Vert.x can also log the entire stack trace to help you pinpoint the location in your code where your handler is blocking the thread:

 

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
Mar 24, 2019 9:34:23 AM io.vertx.core.impl.BlockedThreadChecker
WARNING: Thread Thread[vert.x-eventloop-thread-6,5,main] has been blocked for 24915 ms, time limit is 10 ms
io.vertx.core.VertxException: Thread blocked
        at java.lang.Thread.sleep(Native Method)
        at MyComputingVerticle.start(HelloServer.java:72)
        at io.vertx.core.impl.DeploymentManager.lambda$doDeploy$8(DeploymentManager.java:494)
        at io.vertx.core.impl.DeploymentManager$$Lambda$8/644460953.handle(Unknown Source)
        at io.vertx.core.impl.ContextImpl.executeTask(ContextImpl.java:320)
        at io.vertx.core.impl.EventLoopContext.lambda$executeAsync$0(EventLoopContext.java:38)
        at io.vertx.core.impl.EventLoopContext$$Lambda$9/1778535015.run(Unknown Source)
        at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163)
        at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:404)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:462)
        at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:897)
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
        at java.lang.Thread.run(Thread.java:748)

 


Note that the stack trace is generated at the moment when Vert.x detects that the threshold has been exceeded which is not necessarily the moment when the thread was actually blocking. In other words, it is probable but it is not guaranteed that the stack trace is showing the actual location where your event loop thread is blocking. You may need to examine multiple stack traces to pinpoint the right location.

 

You can tweak the watchdog check period and the warning thresholds. Here is an example:

 

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
VertxOptions options = new VertxOptions();

// check for blocked threads every 5s
options.setBlockedThreadCheckInterval(5);
options.setBlockedThreadCheckIntervalUnit(TimeUnit.SECONDS);

// warn if an event loop thread handler took more than 100ms to execute
options.setMaxEventLoopExecuteTime(100);
options.setMaxEventLoopExecuteTimeUnit(TimeUnit.MILLISECONDS);

// warn if an worker thread handler took more than 10s to execute
options.setMaxWorkerExecuteTime(10);
options.setMaxWorkerExecuteTimeUnit(TimeUnit.SECONDS);

 // log the stack trace if an event loop or worker handler took more than 20s to execute
options.setWarningExceptionTime(20);
options.setWarningExceptionTimeUnit(TimeUnit.SECONDS);

Vertx vertx = Vertx.vertx(options);

 


Note that the first check is not executed right at the application start but is delayed by one check period. In our example the first check is executed 5 seconds after the application start followed by checks executed every 5 seconds. The concrete thresholds shown in the example worked well for one of my projects, however, your mileage may vary. Also, the very first execution of the handlers can be rather slow due to JVM class loading. Performance further improves when the JVM moves from interpreting the byte code to compiling it into the native code and running it directly on the CPU. Hence, you are more likely to hit the warning thresholds shortly after the application start than later on during the application run. It would be great if the threshold values could be dynamically adjusted to avoid the warnings before the JVM warms up. Unfortunately, there’s no way how to adjust the thresholds in runtime.

 

It goes without saying that Vert.x only checks the threads that were created as a result of calling Vert.x APIs. If you instantiate your own thread pool outside of Vert.x those threads won’t be checked. If you want Vert.x to check the threads in your custom thread pool, you can ask Vert.x to instantiate a checked thread pool for you like this:

 

1
2
// create a thread pool with 20 threads, set blocked thread warning threshold to 10 seconds
WorkerExecutor executor = vertx.createSharedWorkerExecutor("mypool", 20, 10, TimeUnit.SECONDS);

 


The good thing about the blocked thread checker is that it is able to detect thread delays regardless of whether they were caused by a call to a blocking API or by executing a compute intensive task. As such it can serve as a good indicator that there is something seriously wrong with your application.

 

Inspecting stack traces

Some event loop delays can be so subtle that they can go unnoticed by the blocked thread checker. Imagine a situation where you have a handler that causes a very short delay. The blocked thread checker won’t catch this short delay because it is not long enough to reach the threshold. However, if this handler is called very frequently, the aggregate delay caused by this handler can have a great impact on the performance of your application. How to uncover this kind of issue?

A good option is to analyze Java thread dumps by hand. You can refer to this article if you want to learn how to do it. Alternatively, you can use a Java profiler like VisualVM to find out in what parts of your code the most processing time is spent. Instead of writing a long prose about how to use VisualVM to troubleshoot a Vert.x application, I created a short video for you. You can watch this demo using JMeter and VisualVM to figure out the cause of delays of the Vert.x event loop:

 

Conclusion

In this article, we talked about the blocked thread checker as a first indicator of the event loop delays. Next, I showed you in the video how to troubleshoot event loop delays in practice using VisualVM.

I hope that I didn’t scare you throughout this series by analyzing all the things that can go wrong when working with the thread model Vert.x is based on. In reality it’s not so bad. One just has to pay attention to the event loop model while coding. The awesome performance that Vert.x applications can achieve is definitely a sufficient reward for the extra effort.

If you got some battle scars while working with the event loop thread model in Vert.x, I would be interested in hearing your stories. Also, let me know if you found the video demonstration helpful or if you have suggestions for future videos. If you have any further questions or comments, feel free to add them to the comment section below.

 

Read more about optimizing performance for the open-hybrid enterprise.

 

Connect with Red Hat Services
Learn more about Red Hat Consulting
Learn more about Red Hat Training
Join the Red Hat Learning Community
Learn more about Red Hat Certification
Subscribe to the Training Newsletter
Follow Red Hat Services on Twitter
Follow Red Hat Open Innovation Labs on Twitter
Like Red Hat Services on Facebook
Watch Red Hat Training videos on YouTube
Follow Red Hat Certified Professionals on LinkedIn


关于作者

UI_Icon-Red_Hat-Close-A-Black-RGB

按频道浏览

automation icon

自动化

有关技术、团队和环境 IT 自动化的最新信息

AI icon

人工智能

平台更新使客户可以在任何地方运行人工智能工作负载

open hybrid cloud icon

开放混合云

了解我们如何利用混合云构建更灵活的未来

security icon

安全防护

有关我们如何跨环境和技术减少风险的最新信息

edge icon

边缘计算

简化边缘运维的平台更新

Infrastructure icon

基础架构

全球领先企业 Linux 平台的最新动态

application development icon

应用领域

我们针对最严峻的应用挑战的解决方案

Original series icon

原创节目

关于企业技术领域的创客和领导者们有趣的故事