Sun Directory compresses data for better performance!

Sun Directory Server Enterprise Edition 7.0 was released last November. In December, Brad Diggs and Wajih Ahmed, both Principal Field Technologists and experts in Directory Services, put the product on the test bench, backed by engineers from the Directory engineering team and Mr Benchmark. Their goal was to evaluate its performance and scalability on Sun’s new hardware, especially the new F-20 PCIe flash drives (see also what Mr Benchmark says about the F-20).

Brad’s first article describes how much Directory Server 7 entry compression rocks, "extending search performance by more than 50% through increased caching potential". Brad provides details of his findings and gives the commands to run to get the benefits of DSEE 7 in your deployment.

The entry compression feature is also available in the technology that will power future versions of Sun Directory Enterprise Edition: the OpenDS project. In OpenDS, there are two options to reduce the size of entries stored in the database. The first one is called entry compaction, and it’s enabled by default. Entry compaction removes all references to attribute names and replaces them with small identifiers. The second option is actual entry compression, which uses the popular zlib algorithm. This option is not activated by default, but it’s just a command away:

<OPENDS_HOME>/bin/dsconfig -X -p 4444 -h localhost -D cn=Directory\ manager \
 -w password -n set-backend-prop \
 --backend-name userRoot --set entries-compressed:true
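
To make the difference between the two options concrete, here is a toy Python sketch. It is not the actual OpenDS on-disk encoding: the entry, the attribute values, the `name: value` text format and the helper names are all made up for illustration, and compressing the compacted form (rather than the plain one) is an assumption. It only shows the two ideas: replacing attribute names with small identifiers, then running zlib over the result.

```python
import zlib

# Hypothetical entry: attribute names and values are made up for illustration.
ENTRY = {
    "objectClass": ["top", "person", "organizationalPerson", "inetOrgPerson"],
    "uid": ["user.42"],
    "givenName": ["Aaren"],
    "sn": ["Atp"],
    "cn": ["Aaren Atp"],
    "mail": ["user.42@example.com"],
    "telephoneNumber": ["+1 408 555 0142"],
    "postalAddress": ["Aaren Atp$01251 Chestnut Street$Panama City, DE  50369"],
    "description": ["This is the description for Aaren Atp."],
}


def build_name_ids(entry):
    """Assign a small integer identifier to each attribute name."""
    return {name: index for index, name in enumerate(entry)}


def encode_plain(entry):
    """Encode the entry with full attribute names, one line per value."""
    lines = [f"{name}: {value}" for name, values in entry.items() for value in values]
    return "\n".join(lines).encode("utf-8")


def encode_compacted(entry, name_ids):
    """Encode the entry with each attribute name replaced by its identifier."""
    lines = [f"{name_ids[name]}:{value}" for name, values in entry.items() for value in values]
    return "\n".join(lines).encode("utf-8")


plain = encode_plain(ENTRY)
compacted = encode_compacted(ENTRY, build_name_ids(ENTRY))
# Assumption: compression is applied on top of the compacted form.
compressed = zlib.compress(compacted)

print(f"plain: {len(plain)} B, compacted: {len(compacted)} B, compressed: {len(compressed)} B")
```

On a single small entry like this one the zlib gain may be modest; how much compression helps depends on how repetitive the entry content is.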

Below is the dsconfig usage for disabling entry compaction with OpenDS:

<OPENDS_HOME>/bin/dsconfig -X -p 4444 -h localhost -D cn=directory\ manager \
 -w password -n set-backend-prop \
 --backend-name backend --set compact-encoding:false

Here’s a table that compares the size of the databases of OpenDS 2.2.0 without compact encoding, with compact encoding (the default setting), and with compression enabled. The table compares the size of the entry record within the database as well as the overall size of the database, which also includes indexes (default OpenDS settings).

Entry Count | LDIF Entry Size | Uncompacted Entry Size | Compacted Entry Size | Compressed Entry Size | Uncompacted DB Size | Compacted DB Size | Compressed DB Size
100K | 599 B | 645 B (-34%) | 481 B | 361 B (25%) | 178.8 MB (-9.6%) | 163.20 MB | 151.65 MB (7.1%)
1M | 603 B | 649 B (-34%) | 485 B | 364 B (25%) | 1,515 MB (-11.5%) | 1,358 MB | 1,243 MB (8.5%)
10M | 607 B | 653 B (-33%) | 490 B | 363 B (26%) | 13,973 MB (-12.5%) | 12,416 MB | 11,188 MB (9.9%)


The percentages are computed relative to the reference value, which is the default setting, i.e. compacted. A negative value means an increased size; a positive one means a reduced size.
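
As a worked example of that calculation, here is a small Python sketch that recomputes the percentages for the 100K-entry row from the sizes in the table. The function name `reduction_pct` is mine, not something from OpenDS.

```python
def reduction_pct(reference, value):
    """Size reduction relative to the reference, in percent.

    Positive means smaller than the reference, negative means larger.
    """
    return (reference - value) / reference * 100.0


# Entry sizes in bytes for the 100K-entry sample (compacted is the reference).
compacted_entry, uncompacted_entry, compressed_entry = 481, 645, 361
print(f"uncompacted entry: {reduction_pct(compacted_entry, uncompacted_entry):.1f}%")
print(f"compressed entry:  {reduction_pct(compacted_entry, compressed_entry):.1f}%")

# Database sizes in MB for the same sample.
compacted_db, uncompacted_db, compressed_db = 163.20, 178.8, 151.65
print(f"uncompacted DB: {reduction_pct(compacted_db, uncompacted_db):.1f}%")
print(f"compressed DB:  {reduction_pct(compacted_db, compressed_db):.1f}%")
```

The results round to the values shown in the table for that row.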

The second table compares the import times for the 3 different modes for storing entries, for the 3 sample data files.

Entry Count | Uncompacted | Compacted | Compressed
100K | 21 s (1.1%) | 21 s | 22 s (-3.5%)
1M | 106 s (0.5%) | 107 s | 112 s (-4.9%)
10M | 1006 s (0.2%) | 1009 s | 1101 s (-8.9%)

Note: in this table, negative numbers represent an increase in the time required to import compared to the default settings.

Enabling compression does result in smaller disk use with this sample data (fully random values), but it comes with a performance penalty, at least at import time. The penalty is less than 10%, but it grows with the number of entries.
If you’ve read Brad’s article on DSEE entry compression, you understand that the smaller the entries in the database, the more of them can potentially be cached in the database cache, and the better the overall performance. So if your entries are quite large and contain string values, you should consider enabling entry compression with OpenDS.
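
To get a rough feel for the caching argument, here is a back-of-the-envelope Python sketch. The 1 GiB cache size is a hypothetical figure, the entry sizes are those of the 1M-entry sample above, and the estimate ignores index pages and any per-record overhead in the database cache, so treat it as an upper bound rather than a prediction.

```python
def entries_that_fit(cache_bytes, entry_bytes):
    """Upper bound on the number of entry records that fit in the cache."""
    return cache_bytes // entry_bytes


CACHE_BYTES = 1 * 1024**3  # hypothetical 1 GiB database cache

# Entry record sizes in bytes, taken from the 1M-entry sample.
for label, size in (("uncompacted", 649), ("compacted", 485), ("compressed", 364)):
    print(f"{label}: ~{entries_that_fit(CACHE_BYTES, size):,} entries")
```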

Changing from the default mode (compacted) to uncompacted mode does not give any real advantage in performance, but does increase the disk space usage, so I do not see the value of changing these settings in OpenDS.

Anyway, the benefits of having compact entries in the database are available today with Sun Directory Server Enterprise Edition 7 and Sun OpenDS Standard Edition 2.2, and are helping customers to reduce the overall cost of ownership of the directory services.


  1. #1 by anton on 25 January 2010 - 03:22

    note that entry compaction in OpenDS was specifically targeted
    at improving performance as the main goal of that feature,
    disk space optimization is sort of collateral of that.

  2. #2 by Ludovic Poitou on 25 January 2010 - 04:18

    You cannot decorrelate disk space optimization from performance, since reducing disk space usage also reduces the cost of I/O, both writing and reading. But I do agree with you that the main goal of OpenDS engineering is to improve the performance of LDAP services.

  3. #3 by anton on 25 January 2010 - 05:44

    the primary problem we were trying to solve with compaction were
    entry encoding/decoding [ pure processing, not i/o ] costs. i/o
    and disk space were not of primary concern tho its sure nice to
    see it helps with that too but i digress. the reason i’m posting
    these comments is to help your readers understand better how
    compaction feature came about and its difference to compression feature because unlike compression compaction actually improves
    entry processing performance so you do not end up sacrificing
    performance for gaining smaller disk footprint in this case. the
    gain would vary depending on the entry content/structure as well.
