About OpenDJ and Hotspot JVM G1

[Image: Duke on a bike, courtesy of Charlie Hunt]

Understanding and tuning the JVM is quite important to get the best performance out of OpenDJ. We provide some high-level guidance in our documentation, and I’ve been talking about Java performance over the last few years at various Java User Groups in France and Switzerland (you can find presentations in French here or here) as well as at a major conference in Brazil: FISL in 2009. On that latter occasion, I was asked to stand in for two prestigious names from the Sun HotSpot JVM team, Charlie Hunt and Tony Printezis, and deliver their presentation. To prepare, I spent a few hours with them and learnt a great deal about the internals of the HotSpot JVM, memory management, and all the magic parameters. At that time, our directory team was interacting a lot with the HotSpot team as we were testing a new and promising garbage collector: Garbage First, aka G1. OpenDS was even wrapped and used in one of the largest collections of tests for the Sun JVM.

During the acquisition of Sun by Oracle, the future of G1 and the HotSpot JVM was uncertain, and our interactions with the HotSpot team dropped off sharply.

At ForgeRock, we continued to pay attention to Garbage First, and for a long time we noticed that it wasn’t moving along. Most of the issues that were raised after tests with OpenDS, and that were addressed in some development versions of the JVM, were not integrated into official JVM releases. It was only with the Oracle JVM 1.7 update 2 that we noticed a large list of G1 issues had been fixed. We then resumed testing OpenDJ with G1 and saw that, while the promise of no full GC seems to be met, the performance impact of G1 is still significant. In our limited tests with JVM heap sizes under 4GB, we noticed a 10% performance degradation compared to CMS, along with an approximate 10% increase in CPU load (on a quad-core machine with hyperthreading on), but better overall response times for OpenDJ, since the maximum response time decreased from 200ms to 80ms, as illustrated below.

LDAP Modrate with Garbage First
-------------------------------------------------------------------------------
   Throughput                 Response Time
  (ops/second)               (milliseconds)
 recent  average  recent average   99.9% 99.99% 99.999% err/sec Entries/Srch
-------------------------------------------------------------------------------
16196.7  16374.1   1.972   1.951  18.886 28.129  66.933     0.0
16468.8  16374.9   1.941   1.951  18.883 28.087  66.521     0.0

LDAP Modrate with CMS
-------------------------------------------------------------------------------
   Throughput                 Response Time
  (ops/second)               (milliseconds)
 recent  average  recent average   99.9% 99.99% 99.999% err/sec Entries/Srch
-------------------------------------------------------------------------------
17937.1  17487.7   1.780   1.827  18.175 30.521 116.990     0.0
17783.7  17494.3   1.796   1.826  18.145 30.320 117.017     0.0
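
To make the comparison concrete, here is a small sketch that works out the relative differences from the last sample line of each run above. The class and method names (ModrateCompare, percentChange) are made up for this illustration, and the figures are simply copied from the two tables. They are a short excerpt of each run, so they need not match the overall numbers quoted earlier.

```java
public class ModrateCompare {

    // Percentage change going from a baseline value to a new value.
    // A negative result means the new value is lower than the baseline.
    public static double percentChange(double baseline, double value) {
        return (value - baseline) / baseline * 100.0;
    }

    public static void main(String[] args) {
        // Figures copied from the last sample line of each modrate run above.
        double cmsThroughput = 17494.3; // average ops/second with CMS
        double g1Throughput = 16374.9;  // average ops/second with G1
        double cmsTail = 117.017;       // 99.999% response time (ms) with CMS
        double g1Tail = 66.521;         // 99.999% response time (ms) with G1

        System.out.printf("Throughput change with G1: %.1f%%%n",
                percentChange(cmsThroughput, g1Throughput));
        System.out.printf("99.999%% response time change with G1: %.1f%%%n",
                percentChange(cmsTail, g1Tail));
    }
}
```

On these two lines, the average throughput with G1 is about 6% lower than with CMS, while the 99.999% response time is roughly 43% lower.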

We need to run more tests with OpenDJ and G1, especially with very large heaps (from 4 to 32GB), but we’re not sure whether G1 will be able to deliver the performance it promised.
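
When comparing collectors across runs like these, it helps to confirm which collector the JVM actually selected and how much cumulative time it has spent collecting. The following is a minimal sketch using the standard java.lang.management API. The class name GcReport is hypothetical, and the exact command-line options used for the tests above are not stated here; on HotSpot JVMs of that era, G1 and CMS were typically selected with -XX:+UseG1GC and -XX:+UseConcMarkSweepGC.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

public class GcReport {

    // Builds one summary line per collector registered in the running JVM.
    public static List<String> summaries() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Both counters return -1 if the collector does not report them.
            lines.add(String.format("%s: %d collections, %d ms total",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime()));
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : summaries()) {
            System.out.println(line);
        }
    }
}
```

The collector names are JVM-specific, so the output mainly serves as a sanity check that a test run really used the collector it was meant to measure.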

And today I noticed on LinkedIn that both Charlie Hunt and Tony Printezis, the two main engineers behind the HotSpot JVM and Garbage First, have left Oracle for new adventures. Charlie has gone to Salesforce and Tony to Adobe. This is certainly a good move for both of them, but it leaves me worried about the future of the HotSpot JVM and its ability to deliver innovation in GCs.

[Update on May 6th]

It appears that more engineers from the Sun JVM team have actually left in the last couple of months: John Pampuch, Igor Veresov, Paul Hohensee...


  1. #1 by arnaud on 02 May 2012 - 16:05

    Another nice post and quite a vast topic!
    Yes, G1 has a significant impact. The question you want to answer is whether a customer can tolerate one instance being stopped for a minute. If the answer is yes, then you’re fine with CMS; paying the price only when a full GC happens is OK. If, on the other hand, the answer is no, you will need to find a solution that avoids stop-the-world pauses and dimension the deployment based on that. G1 might be an answer. Azul Zing 5 is another (you might want to give it a whirl, it’s not bad). At UnboundID, we have our own built-in solution to address the issue.
    The important thing is addressing customers’ needs, and in some cases you will have to hedge risks and purchase insurance in one form or another. That is the price to pay.
    Just like you purchase insurance for any other activity (travel, investments, industrial assets, etc.), you will have to pay G1 a 10% fee on performance. Is that too high? Maybe, maybe not. It all depends on what’s at stake.

    In many cases though, G1 probably isn’t mature enough to provide enough of a guarantee that the price is worth it but maybe I’m wrong on this and it has come of age.

    • #2 by Ludo on 02 May 2012 - 16:08

      I agree with you that G1 is not mature enough. And it worries me that the people that worked on it are gone.

      • #3 by Nicholas Sushkin (@nsushkin) on 02 May 2012 - 18:30

        We know that Oracle is merging JRockit into the standard JVM as a commercial tier. They probably want to give all the new GC work to the JRockit team.

      • #4 by arnaud on 03 May 2012 - 00:04

        same here, dude, same here…

  2. #5 by Matthew Swift on 18 May 2012 - 10:06

    I don’t know the true motives for Tony et al moving from Oracle. I’m sure that they would not have left if they were still under Sun. I can’t help but wonder if the continual overly aggressive and hostile attitude to customers and competition has a hugely negative impact on many of its engineers. The current spat with Google will only serve to lose more respect from ex-Sun engineers (I think we speak from personal experience here). Does Larry Ellison realize that by “protecting” Sun’s jewels he is in fact destroying them?

    While the rest of us benefit when Oracle scares away its own customers, no one benefits from this. We all depend on Java to be a solid ecosystem with a bright future. What a shame.

