There is still a common belief about Java being slow (especially during the startup) and memory-consuming, making it not being the first option for ephemeral service instances, like containers. This blog post pretends to put some light on those assertions, quantifying the impact of a last-generation JVM in a simple, single-threaded, application. We compare the impact of the JVM measuring the execution time and memory spent of a Java QuickSort implementation, comparing it with the execution of a native image generated with the modern GraalVM Ahead-Of-Time (AOT) compiler and the same QuickSort implementation in Go.
Base setup
- Mac OS X
- 4-Core Intel Core i7-7820HQ CPU at 2.90GHz
- 16GB RAM
- Java 11, GraalVM 19.03
- QuickSort implementation, for both Just-In-Time (JIT) and AOT compilation.
- Go 1.13.4
Tests setup
Each test consists of a program that:
- Allocates an N-sized array and populates it with random data.
- Sorts the array using a basic QuickSort algorithm.
- Checks that the array is correctly sorted.
The tests are repeated for array sizes from 10 to 1_000_000_000 32-bit integers.
The QuickSort program has three implementations:
- JIT: Java bytecode running into a GraalVM implementation.
- AOT: Native executable, compiled with the GraalVM
native-image
tool. - Go: Native executable, compiled with Go 1.13.4, to get a reference from another compiled, memory-managed language.
Test results
Execution time
The image below shows, in logarithmic scale, the execution times (in milliseconds) of the different implementations for the different array sizes:
Remarks:
-
The scenarios where the array is really small (less than 10_000 elements) allow us to quantify the impact of the JVM spinup time: between 80 and 90 milliseconds.
-
For scenarios that require more than 10 seconds, the JIT compiler is able to optimize the executable and finishes faster than any other implementation: around 9% faster than AOT, around 2% faster than Go.
Resident Set Size (RSS) memory
The maximum RSS has been measured for the executions of the previous section:
Remarks:
- The impact of the JVM in terms of memory is important in very small workloads: from 25MB to almost 70MB in the scenario where the array size is >= 100_000_000 integers.
- For memory-intensive scenarios (100_000_000 integers == 400MB memory), the impact in memory is negligible, being the Java AOT implementation slightly less memory-consuming even than Go.
Other aspects
Compile time:
- Graal Bytecode: 1.4 seconds
- Graal Native: 39 seconds (yes, 39 seconds)
- Golang: 0.6 seconds
Executable size:
- Graal Bytecode: 1.5KB (but we should add the weight of a complete JVM).
- Graal Native: 6.5MB
- Golang: 2.1MB
Conclusions
This post shows a dummy experiment with a single-thread, memory-intensive application. We put light on some aspects, and the conclusions here may be an aspect to consider in future decisions (Java or not Java? JIT or AOT?) but definitely, you should do your own benchmarks for your workloads: CPU or I/O intensive? Short-lived or long-lived? Few or many instances of your workload?
The above conclusions are open for discussion.
Java JIT vs AOT
-
Startup time should not be the driver for choosing one or another configuration.
- Against the common belief, the JVM startup time should be negligible: 80-90 milliseconds, which could be probably improved through tuning or proper usage of the Java Modules system.
-
Execution speed should not be the driver for choosing one or another configuration.
- The optimization capabilities of a JIT compiler makes the executable running faster than in any other implementation.
- The optimization time and improvement would vary in larger code-bases.
-
Memory usage and packaging size MAY BE a driver for choosing one or another configuration.
- I would say those aspects are negligible in traditional scenarios, where each host or VM runs a few service instances for a long time.
- Java JIT implementation may not be the best option for scenarios with hundreds of containers running small workloads or simple processes, as the extra memory and disk size would be multiplied per each instance.
-
Java AOT comes at a price.
- Huge compilation times (may slow-down your CI/CD pipeline).
- Very limited reflection API, limiting the usage of several frameworks or requiring of extra, complex, configuration: JPA, Spring...
- This aspect will be treated in a future blog post.
GraalVM AOT vs Go
Go is a very mature language and comes with a mature compiler and toolset. It beats GraalVM AOT compiler in terms of compilation time, execution time and executable size.
The GraalVM native-image generator has still room for improvement, but their current results look promising. In future evaluations, we should see how GraalVM executables work with debuggers, profilers, and other tools.