Avoiding Java StringBuffer Pools

Often, when tuning performance in many applications, pools of objects offer significant speed gains, particularly when the objects are small, used often and have short lifetimes, or when the cost of setting up the objects is high. For example, in a Macintosh application that makes significant use of graphic regions, a pool of preallocated region handles will speed your code up tremendously.

With that in mind, DivNull Software investigated the possibility of pooling a specific type of small, often used objects with short lifetimes in a Java application: StringBuffer.

Contrary to expectations, this experiment determined that pooling StringBuffers in Java application actually reduces performance to a significant degree. What follows describes the experiment and its findings.

Approach

There are no doubt many ways to conceive of a StringBuffer pool, but the most logical way seemed to be having a number of "buckets" of StringBuffers of particular, standard sizes. Generally, a programmer will ask for a buffer of a specific size, and the buffer will stay that size for the duration of its use. The StringBuffer has the ability to grow as it is being used (though smart programmers avoid this, as it drags down performance), so it is possible that a StringBuffer taken from one of these buckets might be returned to a different bucket if it had grown in the interim.

The smallest bucket size was arbitrarily chosen to be 32 characters (which is either 32 or 64 bytes, depending on if your character encoding). Additional buckets would be the successive powers of 2 from the initial size (i.e. 64, 128, etc.) until reaching a bucket size of 4096 characters. Any buffers of a size larger than this would be placed in their own buckets.

As time of development was an issue during the experiment, open source code was used to provide the mechanics of the pool rather than develop it from scratch. In particular, the commons-pool code from Jakarta was used.

None of the existing pool types supplied by commons-pool exactly matched what was needed, but the StackKeyedObjectPool came fairly close, so was chosen as a base for the pool, using the size of the bucket as the key.

As a secondary objective, the comparitive speeds of various methods of concatinating strings were also investigated.

Implementation

Four Java classes were created for the experiment. Implementation was fairly straightforward, but one area caused more concern than expected. All of Java's collection classes store only objects (e.g. Integer) and cannot store basic types (e.g. int), including keys. Care needed to be taken, therefore, to avoid having to allocate Integer objects to represent the keys, since the whole point of the pool was to avoid such small allocations. The "bucket" metaphor allowed this to be resolved, but getting the pool to work while avoiding extra Integer allocations proved slightly trickier than expected. Remember, these classes are basically not usefull, since they slow things down rather than speed them up:

StringBufferPool

A subclass of StackKeyedObjectPool, instantiating this class provides an instance of the string pool. This class also defines some public static members to act as the predefined key sizes as well as methods to get StringBuffers of those specific sizes from the pool. These methods (rather than borrowObject) are intended to be the main interface to the pool. This object also contains a method to return a borrowed buffer that is typed specifically for StringBuffer, confining the need for type coercion to inside the class.

The borrowing process will return a string from the smallest bucket that is equal to or larger than the buffer size requested. For example, the pool includes 32 and 64 buckets, so if you asked for a 40 character buffer, you'd get one from the size 64 bucket. Thus, you get a buffer at least as big as the one you asked for, and possibly bigger. If a request is made for a string larger than the largest standard bucket (4k), a new bucket is created just for that size.

Most of the code in the class deals with returning a StringBuffer to the pool, since it can be returned larger than it was when borrowed. In the most common case (where the size remains exactly equal to one of the bucket sizes), this process is very quick (a binary search, effectively). If the buffer is larger than the standard set, the pool is searched for keys of that size and the key is used if found. If not, a new key is allocated.

StringBufferPoolableObjectFactory

A subclass of BaseKeyedPoolableObjectFactory , this class takes care of allocating the StringBuffers and clearing their contents (without altering their size) when they are returned.

SBPool

Following the singleton pattern, this is a convenience class that is intended to be the primary access to the string pool in most applications (as most apps will only need one string pool). In addition to acting as a single access point, it also handles exceptions.

StringBufferSpeedTest

A JUnit test case that runs a speed test on various string code. Six techniques for building a string are used in the test. A timer is started for each technique, which is repeated 10,000 times and the combined time reported. In addition to testing the pool speed, the effective performance of each technique was also studied.

"String Addition": Uses only String objects, where a String is built from smaller strings, using a pattern like str = a+b+c+d.
"String Naïve": Uses only String objects, where a String is built from smaller strings, using repeated calls that look like str = str + a.
"String.concat": Uses only String objects, where a String is built from smaller strings, using repeated calls that look like str.concat(a).
"Unsized Buffer": Allocates a StringBuffer using new without passing a size parameter. Builds string using repeated calls to buf.append() followed by buf.toString(). Since this will resize the buffer on each append call, this should be significantly slower than the others.
"StringBuffer": Allocates a StringBuffer using new passing a size large enough for the buffer to avoid growth. String built with repeated calls to buf.append() followed by buf.toString().
"Pooled Buffer": Borrows a StringBuffer from the pool, requesting a size large enough for the buffer to avoid growth. String built with repeated calls to buf.append() followed by buf.toString(), and then returning the string to the pool. With the pool in place, only one StringBuffer object is ever allocated.

Results

When StringBufferSpeedTest is run on a 1GHz Dell OptiPlex GX200 under Win2000 and Sun's JDK 1.4.1, the results are as follows:

Test	Execution Times (ms)
Test	Test 1	Test 2	Test 3
String Addition	80	80	70
String Naïve	220	220	221
String.concat	61	50	60
Unsized Buffer	80	71	70
StringBuffer	60	60	60
Pooled Buffer	170	180	180

When StringBufferSpeedTest is run on a 400MHz G3 under Mac OS X.2 and Apple's JDK 1.3.1, the results are as follows:

Test	Execution Times (ms)
Test	Test 1	Test 2	Test 3
String Addition	272	464	573
String Naïve	1000	989	1182
String.concat	458	382	357
Unsized Buffer	285	414	390
StringBuffer	290	197	166
Pooled Buffer	1711	1096	1305

Conclusions

The primary conclusion is that memory allocation is now so fast that pooling objects is not necessary. In fact, the overhead of the pool actually takes longer than allocating memory. This is not typical to most languages.

It is probable that the pool code could be made faster (perhaps using a mechanism other than buckets). Even so, it would need to be around three to eight times faster to even equal the performance of the code it is supposed to be speeding up.

Another surprise is the speed of String.concat in the JDK 1.4.1 case. If you Google for tips on improving Java performance, nearly all of them suggest that the technique used in the "StringBuffer" test should be the fastest and that using only String objects is a performance no-no. If this used to be true, it clearly isn't any more, as String.concat seems to be highly optimized.

This does not seem to be the case with the "String Naïve" technique, which is significantly slower than expected. Clearly this is technique to be completely avoided. (Though, it seems like a compiler could be optimized to detect the pattern used in this technique and replace it with a concat call.) Conversely, the "Unsized Buffer" technique is faster than expected. Again, based on performance tips on the internet, this technique has the reputation of being quite slow.

This test also illustrates that Apple still has pretty far to go with the performance levels of their Java virtual machine.

Back to Wordman's Writing page.