Java on Docker will no longer suck: improvements coming in Java 10
Docker has been a really popular technology over the last few years and it’s easy to see why. Containerising JVM-based applications offers consistent environments for development and deployment, and proper isolation between applications when deployed. Unfortunately, the JVM doesn’t currently make running inside a Linux container the easiest of affairs. Java 9 and 10 finally bring a host of much-needed improvements; here are our top three.
Heap sizing
By default, on 64-bit servers, the JVM sets a maximum heap size of up to a quarter of physical memory. This really isn’t helpful in a containerised environment, as you often have a host with significant amounts of memory that could run many JVMs. If you run 10 JVMs in different containers and each ends up using a quarter of the RAM, you overcommit the machine’s RAM and potentially end up hitting swap, causing sad times for your users.
It also nullifies one of the advantages of containerisation, that the container image built and tested will perform the same in production. An image could easily work fine in a staging environment on a smaller physical host but then on a larger host in production could exceed any memory limit for the container and get killed by the kernel.
There are various workarounds for this, such as including a JAVA_OPTIONS environment variable that allows the heap size (or -XX:MaxRAM) to be set from outside the container. This gets messy because it requires duplicating information about container limits: once for the container and once for the JVM. You could also script the JVM startup, extracting the correct memory limits from the container’s cgroup files.
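As a rough illustration of the scripted workaround, the sketch below reads the container's memory limit and turns it into a heap flag. It is only a sketch under stated assumptions: the class name `HeapFromCgroup`, the helper `heapFlag` and the 50% fraction are hypothetical, and the path assumes a cgroup v1 layout as typically mounted inside a Docker container.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper: derives an -Xmx flag from the container's memory limit.
final class HeapFromCgroup {
    // Assumption: cgroup v1 layout. Other layouts expose the limit elsewhere.
    static final Path LIMIT_FILE =
            Paths.get("/sys/fs/cgroup/memory/memory.limit_in_bytes");

    // Builds an -Xmx flag that uses the given fraction of the limit, in MiB.
    static String heapFlag(long limitBytes, double fraction) {
        long heapMb = (long) (limitBytes * fraction) / (1024 * 1024);
        return "-Xmx" + heapMb + "m";
    }

    public static void main(String[] args) throws IOException {
        if (!Files.isReadable(LIMIT_FILE)) {
            System.err.println("No cgroup v1 memory limit file found");
            return;
        }
        String raw = new String(Files.readAllBytes(LIMIT_FILE)).trim();
        long limit = Long.parseLong(raw);
        // The 0.5 fraction is an arbitrary choice for illustration.
        System.out.println(heapFlag(limit, 0.5));
    }
}
```

A startup script could pass the printed flag to the real application's `java` command. Note that a container without a memory limit reports a very large number here, so a real script would need to cap the result.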
The primary mechanism for isolating containers on Linux is Control Groups (cgroups), which allow for (amongst other things) limiting the resources available to a group of processes. With Java 10, the JVM will read memory limits and usage from the container’s cgroup and use them to initialise maximum memory, removing the need for any of these workarounds.
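A quick way to see which maximum heap the JVM has settled on is to ask the runtime directly. The snippet below is a minimal sketch; the class name `JvmMemoryView` and the helper `maxHeapMb` are made up for illustration, and it only uses the standard `Runtime.maxMemory()` call.

```java
// Hypothetical check: prints the maximum heap the running JVM will use.
final class JvmMemoryView {
    // Maximum heap in MiB as reported by the runtime.
    static long maxHeapMb() {
        return Runtime.getRuntime().maxMemory() / (1024 * 1024);
    }

    public static void main(String[] args) {
        System.out.println("Max heap: " + maxHeapMb() + " MiB");
    }
}
```

Run inside a container started with a memory limit, an older JVM should report a figure derived from the host's physical memory, while a container-aware JVM should report one derived from the container's limit.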
Available CPUs
By default, Docker containers have unlimited access to all CPUs on the system. It is possible, and common, to restrict utilisation to a certain percentage of CPU time (using CPU shares) or to individual ranges of CPUs (using cpusets).
Unfortunately, as with heap sizing, the JVM in Java 8 was mostly unaware of the various mechanisms used to restrict CPU utilisation inside containers. This could cause problems on large physical hosts with many cores, as all JVMs running inside containers would assume they had access to far more CPUs than they actually did. As a consequence, many parts of the JVM that size themselves adaptively based on available processors, such as the parallel and concurrent GCs, the JIT compiler threads and the ForkJoin pools, would be incorrectly sized. They would spin up more threads than they were supposed to, which could lead to too much context switching and poor performance in production. Many third-party utilities, libraries and applications also use the Runtime.availableProcessors() method to size their own thread pools and exhibit similar behaviour.
As of Java 8u131 and Java 9, the JVM could understand and utilise cpusets for sizing available processors while Java 10 brings support for CPU shares.
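To make the sizing problem concrete, here is a minimal sketch of the pattern many libraries follow: ask the runtime for the processor count and build a thread pool of that size. The class name `PoolSizing` and the helper `workerCount` are hypothetical; the calls themselves are standard JDK APIs.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ForkJoinPool;

// Hypothetical example of sizing a pool from the reported processor count.
final class PoolSizing {
    // Number of workers a library might choose: one per reported processor.
    static int workerCount() {
        return Math.max(1, Runtime.getRuntime().availableProcessors());
    }

    public static void main(String[] args) {
        int workers = workerCount();
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        System.out.println("availableProcessors = " + workers);
        // The common ForkJoin pool is also sized from the processor count.
        System.out.println("common pool parallelism = "
                + ForkJoinPool.commonPool().getParallelism());
        pool.shutdown();
    }
}
```

If the JVM reports the host's core count rather than the container's allowance, both numbers will be larger than the CPU actually available to the container, which is the oversizing described above.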
Attach from host
The Attach API allows programmatic access to a JVM from another JVM. It’s useful for reading the environment state of a target JVM and, crucially, for dynamically loading JVM agents that can perform additional monitoring, profiling or diagnostic tasks. It is not currently possible to attach from a JVM on the host machine to a JVM running inside a Docker container because of how the attach mechanism interacts with process namespaces.
All processes on mainstream operating systems have a unique identifier, the PID. Linux also has the concept of PID namespaces, where two processes in different namespaces can share the same PID. Namespaces can also be nested, and this functionality is used to isolate processes inside a container.
The complication for the attach mechanism is that the JVM inside the container currently has no concept of its PID outside the container. Java 10 fixes this: the JVM inside the container finds its PID in the root namespace and uses this to watch for a JVM attachment.
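To illustrate how one process can have different PIDs in nested namespaces, the sketch below parses the NSpid field that recent Linux kernels (4.1 and later) expose in /proc/<pid>/status. This is not presented as the JVM's actual implementation; the class name `NamespacePids` and the helper `parseNsPid` are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: lists a process's PID in each nested PID namespace.
final class NamespacePids {
    // Parses an "NSpid:" line. The leftmost value is the PID in the
    // outermost visible namespace, the rightmost the innermost one.
    static List<Integer> parseNsPid(String statusLine) {
        List<Integer> pids = new ArrayList<>();
        if (!statusLine.startsWith("NSpid:")) {
            return pids;
        }
        String values = statusLine.substring("NSpid:".length()).trim();
        for (String field : values.split("\\s+")) {
            if (!field.isEmpty()) {
                pids.add(Integer.parseInt(field));
            }
        }
        return pids;
    }

    public static void main(String[] args) throws IOException {
        Path status = Paths.get("/proc/self/status");
        if (!Files.isReadable(status)) {
            System.err.println("No /proc/self/status; is this Linux?");
            return;
        }
        for (String line : Files.readAllLines(status)) {
            if (line.startsWith("NSpid:")) {
                System.out.println("PIDs by namespace: " + parseNsPid(line));
            }
        }
    }
}
```

For a containerised process inspected from the host, the line might list two values: the PID the host sees, followed by the PID the process has inside the container.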
In summary, if you are running the JVM under Docker you really should be looking forward to the release of Java 10 at the end of this month and trying to upgrade to it as soon as possible.