The Top 3 Java Performance Improvements we're looking forward to

Wed 07 February 2018 Richard Warburton & Sadiq Jaffer

Introduction

In this blog post you’ll get a sneak peek at some of the upcoming changes in future Java SE releases that we’re most excited about here at Opsian. One of the great things about running code on the JVM is that taking the same code and running it on the latest JVM will often give you a free performance boost. If not, we’ve got some tooling that will help.

Value Types

Modern CPUs execute and retire instructions incredibly fast, so CPU-bound programs are often bottlenecked on retrieving data from main memory rather than on algorithmic issues. For some perspective, retrieving data from main memory takes 200-400 times longer than adding two integers. To avoid spending all their time waiting on main memory, modern CPUs rely heavily on multi-tiered caches.

In order to speed things up, CPU caches eagerly prefetch data from main memory. They observe access patterns where you’re striding linearly through memory addresses, for example looping over an int[], and start loading the next address to be accessed before your CPU instruction needs access to it. If you’re interested in learning more about CPU caches, one of the Opsian team members has even given a talk about them.
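To make the access-pattern point concrete, here is a minimal sketch that sums the same int[] twice: once in address order and once in a shuffled order. The class name, array size and seed are our own illustrative choices. The timings it prints are only a rough indication, since a proper comparison would need a benchmark harness with warm-up.

```java
import java.util.Random;

// Hypothetical demo class: names, sizes and the seed are illustrative assumptions.
final class StrideDemo {

    // Walks the array in address order, the kind of pattern a prefetcher can detect.
    static long sumLinear(int[] data) {
        long sum = 0;
        for (int i = 0; i < data.length; i++) {
            sum += data[i];
        }
        return sum;
    }

    // Visits the same elements, but in the order given by an index array.
    static long sumInOrder(int[] data, int[] order) {
        long sum = 0;
        for (int i = 0; i < order.length; i++) {
            sum += data[order[i]];
        }
        return sum;
    }

    // Builds a shuffled permutation of 0..n-1 (Fisher-Yates) from a fixed seed.
    static int[] shuffledIndices(int n, long seed) {
        int[] order = new int[n];
        for (int i = 0; i < n; i++) {
            order[i] = i;
        }
        Random random = new Random(seed);
        for (int i = n - 1; i > 0; i--) {
            int j = random.nextInt(i + 1);
            int tmp = order[i];
            order[i] = order[j];
            order[j] = tmp;
        }
        return order;
    }

    public static void main(String[] args) {
        int n = 1 << 22; // about 4 million ints, chosen to exceed typical cache sizes
        int[] data = new int[n];
        for (int i = 0; i < n; i++) {
            data[i] = i & 0xFF;
        }
        int[] order = shuffledIndices(n, 42L);

        long t0 = System.nanoTime();
        long linear = sumLinear(data);
        long t1 = System.nanoTime();
        long shuffled = sumInOrder(data, order);
        long t2 = System.nanoTime();

        // Both sums are equal; only the memory access order differs.
        System.out.println("linear:   sum=" + linear + " in " + (t1 - t0) / 1_000_000 + " ms");
        System.out.println("shuffled: sum=" + shuffled + " in " + (t2 - t1) / 1_000_000 + " ms");
    }
}
```

Both loops do the same arithmetic, so any difference you see comes from how predictable the memory accesses are rather than from the algorithm.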

Unfortunately, this prefetching interacts poorly with the way Java works. One of the biggest performance limitations of the Java platform at the moment is the lack of control over the memory layout of objects. In Java, if you have an object whose fields are other objects, then in practice each field is represented as a reference to the other object: a bit like a pointer, but you can’t perform any arithmetic on it. The problem with this is the performance impact of pointer chasing. If you’re constantly chasing pointers around your JVM heap, your CPU cache will have little chance of predicting access patterns and even less chance of using prefetching to hide the cost of reading data from main memory.

Value Types are a proposed solution to this problem. They are aggregate types, like a class, but don’t have a reference identity. As the project mantra says: "Codes like a class, works like an int". By not having a reference identity they can be inlined into objects and arrays, reducing pointer indirection. This project is also interesting because of the impact that value types have on generics: it contains a proposal to enable primitive specialisation in generics, allowing a List<int> rather than a List<Integer>.
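Since the Value Types syntax is still a proposal, the sketch below deliberately uses only today's Java. It contrasts an array of ordinary objects, which is an array of references, with a manually flattened double[] that stores the same fields inline. This is roughly the layout that value types aim to give you without the manual work. The Point class and all method names are hypothetical, made up for this illustration.

```java
// Hypothetical types for illustration only; this is NOT the proposed Value Types syntax.
final class LayoutDemo {

    // A regular class: every instance is a separate heap object with its own identity.
    static final class Point {
        final double x;
        final double y;

        Point(double x, double y) {
            this.x = x;
            this.y = y;
        }
    }

    // Array of references: each element load may jump to a different heap location.
    static double sumX(Point[] points) {
        double sum = 0;
        for (Point p : points) {
            sum += p.x;
        }
        return sum;
    }

    // Manual flattening: fields stored inline as [x0, y0, x1, y1, ...].
    static double[] flatten(Point[] points) {
        double[] flat = new double[points.length * 2];
        for (int i = 0; i < points.length; i++) {
            flat[2 * i] = points[i].x;
            flat[2 * i + 1] = points[i].y;
        }
        return flat;
    }

    // Same computation over the flat layout: a linear stride with no indirection.
    static double sumXFlat(double[] flat) {
        double sum = 0;
        for (int i = 0; i < flat.length; i += 2) {
            sum += flat[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        Point a = new Point(1.0, 2.0);
        Point b = new Point(1.0, 2.0);
        // Same field values, but two objects and therefore two identities.
        System.out.println("a == b: " + (a == b));

        Point[] points = { a, b, new Point(3.0, 4.0) };
        System.out.println("sumX (references): " + sumX(points));
        System.out.println("sumX (flattened):  " + sumXFlat(flatten(points)));
    }
}
```

The flattened version gives up the convenience of the Point class in exchange for a cache-friendly layout, which is exactly the trade-off that value types are meant to remove.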

Low Pause Garbage Collectors (Shenandoah and ZGC)

Garbage Collectors have undoubtedly brought great advancements in programming productivity over the years, but even today some applications still have performance trouble caused by GC, to the extent that some JVM vendors even focus their products around low pause Garbage Collectors.

Unfortunately, many of us still use the Oracle or OpenJDK based JVMs. Despite recent advances in the performance of G1 - the default collector since JDK 9 - the collectors in these JVMs still have many scenarios where they introduce long pauses. Two new GCs - Shenandoah and ZGC - both aim to fix this problem. Both are under active development at the moment in the open source OpenJDK project.

Shenandoah is an effort being led by Red Hat. Whilst both the existing CMS and G1 collectors detect which objects are live concurrently with a running application, neither compacts the objects within the heap concurrently. The result is that eventually they need to pause the running application in order to perform compaction. Shenandoah is targeted at achieving low pause times (< 10 ms) on very large heaps.

ZGC is an effort being developed internally at Oracle. It has very similar goals in terms of achieving low pauses on large heaps.
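Whichever collector you run, it helps to know how much collection work your application actually triggers before deciding whether a low pause GC would matter. The sketch below uses the standard java.lang.management API to print each collector's name, collection count and approximate accumulated collection time. The GcStats class and its method names are our own. Note that the reported time is an accumulated approximation per collector, not a measurement of individual pause lengths.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper class; only the java.lang.management calls are standard API.
final class GcStats {

    // Formats one collector's name, collection count and accumulated time.
    static String describe(GarbageCollectorMXBean gc) {
        return gc.getName()
                + ": collections=" + gc.getCollectionCount()
                + ", approxTimeMs=" + gc.getCollectionTime();
    }

    // Sums the accumulated collection time over all collectors in this JVM.
    static long totalCollectionTimeMs() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long time = gc.getCollectionTime(); // -1 when the collector does not report it
            if (time > 0) {
                total += time;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // Allocate some short-lived garbage so there is usually something to report.
        for (int i = 0; i < 200_000; i++) {
            byte[] junk = new byte[1024];
            junk[0] = (byte) i;
        }
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(describe(gc));
        }
        System.out.println("total approx GC time: " + totalCollectionTimeMs() + " ms");
    }
}
```

The collector names printed depend on the JVM and on which GC is selected, so they are a quick way to confirm which collector a given process is really using.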

NIO Improvements

One of the most common things that production server-side Java code does is perform network IO, and it’s often an application performance bottleneck. Nearly every network service uses Java’s built-in NIO libraries, even though very few developers use them directly, and they are frankly not up to the standard that you can achieve with native code. Some libraries like Netty have already started to bind to native libraries where available in order to improve performance and security.

A while back some community members, myself included, proposed a set of improvements to the NIO implementation. Thankfully the JDK team have taken up this feedback and started to investigate it. The interesting thing about NIO improvements is their reach - very few people use NIO directly but the overwhelming majority of Java applications use it indirectly. For example they may use a framework like Spring or a Servlet container in production that relies on NIO for its networking capabilities. So whilst it may not seem like a hugely important area of improvement it actually affects many software developers.
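For readers who have only ever met NIO through a framework, here is a minimal sketch of the kind of code that sits underneath: a single-threaded round trip over the loopback interface using ServerSocketChannel, SocketChannel and ByteBuffer. The NioEcho class and its helper names are ours, and we assume a small message that fits in the socket buffers, since one thread plays both client and server. Real servers would typically use non-blocking channels with a Selector instead.

```java
import java.io.EOFException;
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; assumes small messages that fit in the socket buffers.
final class NioEcho {

    // Sends a message client -> server, echoes it back, and returns what the client read.
    static String roundTrip(String message) throws IOException {
        byte[] payload = message.getBytes(StandardCharsets.UTF_8);
        try (ServerSocketChannel server = ServerSocketChannel.open()) {
            // Port 0 asks the OS for any free port on the loopback interface.
            server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
            try (SocketChannel client = SocketChannel.open(server.getLocalAddress());
                 SocketChannel accepted = server.accept()) {
                writeFully(client, ByteBuffer.wrap(payload));
                ByteBuffer received = readFully(accepted, payload.length);
                writeFully(accepted, received); // echo the bytes back
                ByteBuffer echoed = readFully(client, payload.length);
                return StandardCharsets.UTF_8.decode(echoed).toString();
            }
        }
    }

    // Writes until the buffer is drained; a single write() may be partial.
    static void writeFully(SocketChannel channel, ByteBuffer buffer) throws IOException {
        while (buffer.hasRemaining()) {
            channel.write(buffer);
        }
    }

    // Reads exactly 'length' bytes and returns a buffer flipped for reading.
    static ByteBuffer readFully(SocketChannel channel, int length) throws IOException {
        ByteBuffer buffer = ByteBuffer.allocate(length);
        while (buffer.hasRemaining()) {
            if (channel.read(buffer) < 0) {
                throw new EOFException("channel closed before " + length + " bytes arrived");
            }
        }
        buffer.flip();
        return buffer;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(roundTrip("hello NIO"));
    }
}
```

Every byte here passes through the JDK's channel and buffer implementation, which is why improvements at this layer would reach applications that never import java.nio themselves.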

In some ways NIO isn't as exciting as the changes around Value Types or GC. In practical software development, however, IO bottlenecks come up as a performance problem pretty regularly, so it's actually a very important area for the JDK to improve. My only caveat to these optimistic words is whether this area of the JDK will ever be resourced as much as it needs to be, compared to big and exciting projects like Value Types or writing a new GC.

Conclusions

There are lots of great changes coming up in future Java and JVM versions. These should reduce the overhead of garbage collection, improve memory layout related performance problems and help IO. The fact that these improvements are coming all over the Java stack is a positive sign and means that a fairly wide range of applications and developers should benefit from these different changes.

