Concatenating many strings

[Graph: bars for new StringBuffer().append(String) and new StringBuffer(12888897).append(char)]

In these tests, we continuously append an int to a string to build a very long string. The code below shows the implementation with a StringBuffer that is initialized with a String.

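int i = 2000000; // the loop appends 1,999,999 down to 1
StringBuffer sentence = new StringBuffer("Sentence");
while (--i != 0) {
    sentence.append(i); // append(int) modifies the buffer in place
}
String result = sentence.toString();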

This code concatenates numbers to a StringBuffer to form a very long string. In our case, we are concatenating 2 million numbers to form a string of approximately 13 million characters.
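The figure of roughly 13 million characters can be checked without building the string at all. The sketch below is only an illustration: it assumes the counter starts at 2,000,000 and that the prefix is "Sentence", and the names LengthEstimate, digits and finalLength are made up for this example.

```java
class LengthEstimate {
    /** Number of decimal digits in a positive int. */
    static int digits(int n) {
        int d = 1;
        while (n >= 10) {
            n /= 10;
            d++;
        }
        return d;
    }

    /**
     * Length of the string produced by the loop above, assuming i starts at
     * start and the loop appends start-1, start-2, ..., 1 to prefix.
     */
    static long finalLength(String prefix, int start) {
        long length = prefix.length();
        int i = start;
        while (--i != 0) {
            length += digits(i); // only count the characters, do not build them
        }
        return length;
    }

    public static void main(String[] args) {
        System.out.println(finalLength("Sentence", 2000000)); // prints 12888897
    }
}
```

Under these assumptions the result matches the 12,888,897 used as the buffer size in this test.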

The red bar in the graph shows the speedup when the StringBuffer is initialized with the total length of the final String. By default, a StringBuffer constructed from a String gets an initial capacity equal to the length of that String plus 16. In our case, "Sentence" has length 8, so the initial capacity is 8 + 16 = 24. As soon as an append would make the contents longer than 24 characters, the StringBuffer allocates a bigger array. The new capacity is the old capacity plus one, doubled: 2 × (24 + 1) = 50. After allocating the new array, it copies the existing characters from the old array into it. When the contents grow beyond 50 characters, the same thing happens again, and so on. By the time the buffer can hold the full 12,888,897 characters, it has been expanded 19 times; the last expansion alone copies about 6,815,742 characters, and all 19 together copy roughly 13.6 million. If we instead tell the StringBuffer upon construction that we are going to build a string of 12,888,897 characters, it allocates a buffer of that capacity right away. It then never has to be expanded, and no copying is done.
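The expansion count can be reproduced with a small simulation. This is a sketch under the growth rule described above (new capacity = (old capacity + 1) × 2), not a copy of the library code, and it assumes the buffer is completely full each time it grows, so the copy counts are upper bounds. The names GrowthSimulation, simulate and buildPresized are hypothetical; buildPresized shows one possible way to write the presized variant of the test.

```java
class GrowthSimulation {
    /**
     * Simulates newCapacity = (oldCapacity + 1) * 2 until targetLength fits.
     * Returns { expansions, finalCapacity, lastCopy, totalCopied }.
     */
    static long[] simulate(long initialCapacity, long targetLength) {
        long capacity = initialCapacity;
        long expansions = 0, lastCopy = 0, totalCopied = 0;
        while (capacity < targetLength) {
            lastCopy = capacity;           // characters moved by this expansion
            totalCopied += capacity;       // running total over all expansions
            capacity = (capacity + 1) * 2;
            expansions++;
        }
        return new long[] { expansions, capacity, lastCopy, totalCopied };
    }

    /** Builds the test string in a buffer whose capacity is set up front. */
    static StringBuffer buildPresized(int start, int capacity) {
        StringBuffer sentence = new StringBuffer(capacity);
        sentence.append("Sentence");
        int i = start;
        while (--i != 0) {
            sentence.append(i);
        }
        return sentence;
    }

    public static void main(String[] args) {
        long[] r = simulate(24, 12888897L);
        System.out.println("expansions:     " + r[0]); // 19
        System.out.println("final capacity: " + r[1]); // 13631486
        System.out.println("last copy:      " + r[2]); // 6815742
        System.out.println("total copied:   " + r[3]); // 13631424

        StringBuffer sb = buildPresized(2000000, 12888897);
        // Length and capacity are equal, so the buffer never had to grow.
        System.out.println(sb.length() + " / " + sb.capacity());
    }
}
```

In this model the 19th expansion moves up to 6,815,742 characters, and the presized buffer ends with its capacity unchanged.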

Of course, this test is also possible using a String. However, it is rather slow, so slow that it would not fit in the graph. String objects are immutable, which means that appending a String to a String cannot happen in place. Instead, when string one is appended to string two, a new String object is created and the contents of both string one and string two are copied into it.
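The difference between the two classes can be made visible in a few lines. The following is a minimal sketch; the class and method names are invented for illustration.

```java
class ImmutabilityDemo {
    /** True if concatenation left the original String untouched and produced a new object. */
    static boolean stringConcatCreatesNewObject() {
        String one = "Sentence";
        String two = one + 1; // a new String "Sentence1"; one is not modified
        return one.equals("Sentence") && two.equals("Sentence1") && one != two;
    }

    /** True if append changed the buffer in place and returned that same buffer. */
    static boolean bufferAppendsInPlace() {
        StringBuffer buffer = new StringBuffer("Sentence");
        StringBuffer returned = buffer.append(1);
        return returned == buffer && buffer.toString().equals("Sentence1");
    }

    public static void main(String[] args) {
        System.out.println(stringConcatCreatesNewObject()); // true
        System.out.println(bufferAppendsInPlace());         // true
    }
}
```

Because append returns the buffer it was called on, assigning its result back to the same variable has no effect.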

So let's see what this means for the test described above. The code would look like this:

int i = 2000000;
String sentence = "Sentence";
while (--i != 0) {
    sentence += i; // creates a new String on every pass
}

In the statement sentence += i, the old value of sentence with i appended is copied to a new String, and sentence then refers to that new String. Since we are building a String of approximately 13 million characters, that is a lot of copying. Let's assume that each appended number has an average length of 6 characters and that the string starts with length 0, to make things easy. On the first pass, nothing is copied; on the second pass, 6 characters; on the third, 12; and so on. So how many characters are copied in total?

0×6 + 1×6 + 2×6 + 3×6 + ... + 1,999,998×6 + 1,999,999×6
= 6 × (0 + 1 + 2 + 3 + ... + 1,999,998 + 1,999,999)
≈ 6 × (2,000,000 × 1,000,000)
≈ 12 trillion

No wonder it takes a long time.
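The estimate can be checked by counting instead of copying. The sketch below is an illustration only: simplified follows the model used in the calculation (every number is 6 characters long, the string starts empty), while withRealLengths assumes the counter starts at 2,000,000 and uses the actual digit counts and the "Sentence" prefix. Both count only the old contents that are copied on each pass, and all names are hypothetical.

```java
class CopyCount {
    /** Number of decimal digits in a positive int. */
    static int digits(int n) {
        int d = 1;
        while (n >= 10) {
            n /= 10;
            d++;
        }
        return d;
    }

    /** Simplified model: every appended number is avgLength characters long. */
    static long simplified(int passes, int avgLength) {
        long copied = 0;
        for (long pass = 0; pass < passes; pass++) {
            copied += pass * avgLength; // old contents copied on this pass
        }
        return copied;
    }

    /** Same count, using the real digit lengths and the initial prefix. */
    static long withRealLengths(String prefix, int start) {
        long length = prefix.length();
        long copied = 0;
        int i = start;
        while (--i != 0) {
            copied += length;    // old contents are copied into the new String
            length += digits(i); // the new String is longer by the digits of i
        }
        return copied;
    }

    public static void main(String[] args) {
        System.out.println(simplified(2000000, 6)); // prints 11999994000000
        System.out.println(withRealLengths("Sentence", 2000000));
    }
}
```

The simplified model gives just under 12 trillion, in line with the estimate above. The second figure comes out somewhat higher, because the seven-digit numbers are appended first and the string therefore grows faster in the early passes.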