This page details a number of ways you can improve your Java code, particularly its performance. You can find a bunch of these tips on the web as well.
A lot of Java tip books and web pages make a big deal out of using StringBuffer and append() instead of String and + or +=. For example, this code is considered bad:
String s1 = "An initial string";
String s2 = "with a concatenation";
String s3 = s1 + " " + s2;
Most of these tip articles say that you will improve performance by changing the code above to this:
StringBuffer s = new StringBuffer();
s.append("An initial string");
s.append(" ");
s.append("with a concatenation");
String s3 = s.toString();
The problem is that the tips that tell you this are out of date. These days, the compiler will turn the first code snippet into the second for you. That is, when using a modern compiler, the performance of the above snippets is identical. That is not to say, however, that you shouldn't use StringBuffer. You should use it for any string of characters that changes after it is constructed.
The main performance problem in the second snippet has little to do with append(), and much to do with StringBuffer's internal memory management. By default, a StringBuffer is allocated with room for 16 characters. When a string is appended and there is not enough space, the buffer size is doubled. This causes a reallocation, copy and de-allocation, which can be very costly.
In most cases, however, you are likely to know ahead of time how big a buffer you will need. You can get huge performance gains by simply asking for a buffer that large when constructing the StringBuffer. For example, the first line of the second code snippet above should be:
StringBuffer s = new StringBuffer(40);
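If the pieces being joined are already in hand, the capacity does not need to be a guessed constant. Here is a minimal sketch, reusing s1 and s2 from the first snippet, that sizes the buffer from the strings themselves:
StringBuffer s = new StringBuffer(s1.length() + s2.length() + 1);  // room for both strings plus the space
s.append(s1);
s.append(" ");
s.append(s2);
String s3 = s.toString();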
Reducing extra copies and allocations is why most of the tip books tell you not to use + or += with Strings. In the first code snippet above there is really no problem with using the + operator, but let's look at an example where it does matter. The following code is bad:
// Build a string of nCount '*' characters
// Don't do this
String starString = "";
for (int i = 0; i < nCount; i++)
    starString = starString + "*";
The problem with this code is that the compiler cannot optimize it, so each iteration of the for loop allocates, copies and destroys both a String and StringBuffer object. This code is much better:
StringBuffer buffer = new StringBuffer( nCount + 1 );
for (int i = 0; i < nCount; i++)
    buffer.append("*");
String starString = buffer.toString();
This time, no allocation is done at all inside the for loop, and the whole buffer is copied only once at the end. This last code snippet runs about 40 times faster than the previous one.
Unlike many string implementations (say, the std::string object in C++), the == operator of Java's String object does not actually compare the contents of the strings. Instead it checks to see if the two strings have the same reference. To compare the contents of the strings, you use the equals() method.
If you are in a situation where it is likely that you might be comparing identical strings, this code will be a bit faster, as the equals call will be skipped if the references are identical:
if (s1 == s2 || s1.equals(s2))
...
This technique illustrates a more general principle of performance: always perform a cheap test before an expensive one, if you possibly can. In this example, == is much less expensive to perform than equals().
Sun suggests a technique for "interning" strings, to make it more likely that strings with identical content share the same reference.
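Here is a minimal sketch of interning using the standard String.intern() method; both interned copies resolve to the same canonical reference, so the cheap == test succeeds on its own:
String a = new String("some text").intern();
String b = new String("some text").intern();
// a and b now refer to the same canonical String object
if (a == b || a.equals(b)) {
    // contents match; == was already true, so equals() was never called
}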
Java provides a number of basic classes that do a lot of work for you. Often, these classes are used without considering how they might be implemented, and this can cause performance problems. By finding out a small amount about a class, you can code against it in a way that builds much faster code.
This section will use the Vector object to illustrate the point, but many classes have exactly the same issues. The Vector is like an array, but it can grow and shrink with use, similar to a StringBuffer. Like the string buffer, its default allocation is small and is doubled when it needs to grow. Each doubling causes a copy of the existing items. This copying is actually a lot worse than that of the StringBuffer, because generally objects are being copied in this case, not just characters.
The fix is the same: pre-allocate the size of the Vector with your best guess on how many items it might eventually hold. Note that it is better to overshoot on this guess, getting more memory than you need, especially if this is a local vector with a short life span.
Vector aList = new Vector(10000);
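If the Vector already exists by the time you know how many items are coming, the standard ensureCapacity() method does the same job after construction. A rough sketch (the item count is just illustrative):
Vector aList = new Vector();
// ... later, once the number of incoming items is known ...
aList.ensureCapacity(10000);        // one reallocation now, instead of repeated doublings
for (int i = 0; i < 10000; i++)
    aList.addElement(new Integer(i));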
More subtle are some of the other methods of the Vector. For example, looking at the documentation for the remove() method, it does not seem like anything insidious is happening there, but knowing a bit about the implementation of a Vector uncovers potential performance lags. One of the implementation choices of the Vector is that it can never contain any "gaps". This means that if an item is removed from the Vector, all of the elements with higher index values are shifted down, generating a large number of copy operations. As a result, while removing the last item from a Vector is very fast, removing the first item from an n-item Vector generates n-1 copies. So:
Bad:
    int v_nSize = a_List.size();
    for (int i = 0; i < v_nSize; i++)
        a_List.remove(0);

Better:
    int nSize = aList.size();
    for (int i = nSize - 1; i >= 0; i--)
        aList.remove(i);

Best:
    aList.removeAllElements();
Another problem with objects like Vector is that they can hide O(n) operations within methods. Consider this code, for example, noting the argument passed to the remove() method:
String s = "Hello";
int v_nIndex = v_List.indexOf(s);
if(v_nIndex != -1)
v_List.remove(s);
Because remove() is passed the string itself, it must find the string in order to remove it. You can bet that it does this by calling indexOf(), so the code above performs a search through the Vector twice, though it does not look like that is what is happening. This is better:
String s = "Hello";
int v_nIndex = v_List.indexOf(s);
if(v_nIndex != -1)
v_List.remove(v_nIndex);
Note that there are a number of classes similar to Vector (in that they collect a bunch of items together) that have different performance tradeoffs. It is important to understand the differences between these classes, such as ArrayList and LinkedList, so that you can choose the one that is right for your particular purpose. Generally the differences deal with the cost of insertion, the cost of retrieval, and the cost of removal. For example, inserts into a Vector can be costly (up to O(n), depending on where the item is inserted) but retrieval is constant time if done by index. A LinkedList, on the other hand, has constant time inserts at either end, but retrieving an item requires a linear (that is, O(n)) search through the list. More information along these lines is given in the documentation for Sun's Collection Framework.
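As a rough sketch of how that choice plays out (assuming java.util is imported; the names and sizes are just illustrative):
List byIndex = new ArrayList(10000);    // get(i) is constant time; inserts in the middle cost O(n)
LinkedList byEnds = new LinkedList();   // addFirst()/removeFirst() are constant time; get(i) costs O(n)

byIndex.add("alpha");                   // cheap append at the end
byEnds.addFirst("omega");               // cheap insert at the front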
Avoid using explicit object types if there is an interface available instead. For example, in the previous section, much of the code used a Vector explicitly. It may turn out later that a different collection type, like LinkedList, would be more appropriate for the storage in that code. If that happens, changing the code could be cumbersome, because you might have to change the code every place the Vector is used.
A better approach is to use the interfaces provided in the Collection Framework. The Vector is one of a few classes that implement the List interface. So, rather than using these kinds of statements:
void foo( Vector aList );       // Don't do this
Vector list = new Vector(100);  // Don't do this
…use the List interface, like so:
void foo( List aList );
List list = new Vector(100);
Since list is then declared using an interface and all of the code using list works against that interface, all you would need to do to change the underlying data store is replace new Vector with new LinkedList or some other class that implements the List interface. You could even write your own.
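As a sketch of how small the change becomes (foo() and process() here are illustrative placeholders, not library calls), only the construction line needs to be touched:
void foo( List aList )
{
    int nSize = aList.size();
    for (int i = 0; i < nSize; i++)
        process( aList.get(i) );    // process() is a hypothetical per-item operation
}

List list = new LinkedList();       // was: new Vector(100); foo() is unchanged
foo( list );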
Any time you write a loop, you have a chance of dragging down performance. Usually you do not need to be hyper-vigilant for little performance tweaks, but in loops you should be careful. Consider this code:
Vector v_List = new Vector(10000);
fillVector( v_List );
for (int i = 0; i < v_List.size(); i++)
    doSomething( v_List.get(i) );
This code is sub-optimal because it calls v_List.size() on every iteration through the loop. As a rule, you want to move processing out of the loop if possible, like so:
Vector v_List = new Vector(10000);
fillVector( v_List );
int v_nSize = v_List.size();
for (int i = 0; i < v_nSize; i++)
    doSomething( v_List.get(i) );
Note that this is not the best example, because the v_List.size() operation is likely to be nearly instantaneous anyway. Still, it illustrates the general principle.
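Another way to avoid the repeated size() call in this kind of loop is to walk the collection with an Iterator, which every Collection Framework class can supply. A minimal sketch, assuming java.util is imported and the same v_List and doSomething() as above:
Iterator it = v_List.iterator();
while (it.hasNext())
    doSomething( it.next() );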
As a special case of the previous section, you can gain a surprising amount of performance by eliminating conditional statements of any kind from code, especially those in tight loops. The reason for this is modern processor architecture. Most modern CPUs use the concept of pipelining. Though explained better elsewhere, the basic idea is that you fill a "pipe" up with instructions, allowing the CPU to partially process all of the instructions in the pipe before the first calculation in the pipe is completed. This works great for serial tasks, but often the result of the first calculation affects the other instructions in the pipe, so sometimes work in the pipe is wasted. You can sometimes get "islands" in the pipe that flow through the process, doomed to get dumped out at the end. Even though this happens, CPU designers have found that pipelining works more often than it fails, so on balance it gives more speed.
One huge problem pipelines have is conditionals (if statements). To fill the pipe, the CPU essentially has to make a guess as to which branch the conditional will take, long before it can evaluate the condition itself. This is called "branch prediction", and if the CPU guesses incorrectly, the whole pipe may be filled with useless instructions that have to be cleared out before the correct branch can be followed. As a result, if statements can be very bad for performance, especially in tight loops.
Take, for example, a comparison function used in a sort routine, comparing two items, a and b. Usually such functions return zero if a and b are the same, a negative number if a is less than b and a positive number if a is greater than b. You might assume that code to compare a and b if they were integers would look like this:
int compareInt( int a, int b )  // Don't do this
{
    if ( a < b )
        return -1;
    if ( a > b )
        return 1;
    return 0;
}
This turns out not to be the case. If you look at most ANSI C integer (or character) comparison routines, the code looks more like this:
int compareInt( int a, int b)
{
return (a - b);
}
This code does much the same thing, but does not use branching at all. The reason for this is that comparison functions get called within tight loops, and if the CPU's branch prediction fails, the cost is quite high. Eliminating the branches eliminates the need for branch prediction, eliminating stale pipes. (Be aware, though, that the subtraction trick can overflow if a and b are far apart, so it is only safe when the inputs are known to stay within range.)
Often, a way can be found to avoid branching, even when comparing things that look like they require branches. For example, the following function compares two floating point numbers using minimal branching. (This code comes from Chip Coldwell, who claims it sped up his sorts by 20%.) Note that code in this snippet that appears to be conditional (e.g. r != 0) does not actually create branches but calculates a value.
static inline int func1(double x, double y)
{
double r = x - y;
return (((r > 0) << 1) - 1) & -(r != 0);
}
Unless you tell Java otherwise, all methods you define in a class are virtual methods. That is, they can be overridden by a subclass. (Note that C++ does just the opposite, only making methods virtual if a special keyword is used.) If you never override a particular method, having that method be a virtual function is inefficient in two ways. First, it takes extra memory, because the class will have a (hidden) list of all of its virtual functions. Second, it slows down performance, because each virtual function in the table must be initialized and a lookup must be done on the table when the method is called.
Most classes contain a number of methods that don’t need to be virtual methods. For example, data accessor methods rarely need to be overridden.
Fortunately, Java has a special keyword, final, that can be used to make methods non-virtual, eliminating the overhead:
public final void f() {...}
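For instance, a simple accessor like the one in this sketch (the class and field names are just illustrative) is a natural candidate for final:
public class Point
{
    private int m_x;

    // No subclass can override this accessor.
    public final int getX()
    {
        return m_x;
    }
}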
It now appears, however, that neither of these problems is that significant a performance factor in Java. So, while you might hear people say that using final as much as possible is a good idea, there is not much evidence that doing so is worth the effort.
You can also make an entire class incapable of being subclassed:
public final class A {...}