Thursday, August 19, 2010

Performance of Software

I was reading a little about scala these days and think that it's a cool language that would probably make my life a lot easier. For example, the CardCompiler is hard to use; you have to go through a few hoops to implement new effect parsers. I think that I am on the right way, but there is much room for improvement in terms of usability.

But today I don't want to talk about Scala, but software performance. One point with Scala is that it largely uses immutable objects, putting copying over mutating existing objects. The first thought when you hear this is probably how much memory this would waste and that it will also need more processing power all the data has to be copied.

However, performance doesn't mean to use as little resources as possible. It means to use the resources you have the best possible. And nowadays, the resource we have but use the least are multiple processors.

Modern processors work on the edge of physical possibilities. Circuits are not thicker than a few atoms, and currencies are so small that you can almost measure the number of electrons flowing. At this scale, quantum effects matter, and this means randomness. That's about the reason why (semiconductor) processors don't get faster any more and increasing speed does only occur by parralelism.

Writing software that works parallel, however, is hard, because multiple threads can access the same resource at the same time and thus wreak havoc. Which leads back to immutable objects: Data corruption can only occur when writing. when every access is read-only or creates a copy, nothing bad can happen.

This leaves one question: How bad is copying really? Well, every copy you don't need any more can be garbage collected, so memory should not really be a concern. Memory size is also not as restricted as processor speed, so there should be enough potential to support memory-heavy rather than processor-heavy software.
Garbage collection itself, however, uses the processor. It finds objects that are no longer accessed by any thread; that means that it is not constrained to work in the program's normal flow and can work in its own thread. As the garbage collector is not really interested in the objects, it can even be spread to multiple threads if necessary.

Copying itself uses processing power, and can take some time for large objects. Usually, the only type of large objects are arrays. There are really not many possibilities: boolean, byte, short, char, int, float, long, double, Object. All these take up to 8 bytes to store. All that stays are the indexes of arrays, which can go into several millions. If we stay with immutable objects, this is really the only bottleneck we have to deal with, because you don't need to copy the array's content objects, as they can't change.

I guess this wasn't all that interesting to read, but I really wanted to speak this out: parallelism is the future of software, parallelism is hard, so any attempt to make it easier, for example by using immutable objects, is a step into the right direction

No comments: