Saturday, April 27, 2013

Moore's Law & Parallelization

Moore's Law is the observation that the number of transistors on a chip doubles roughly every two years. Unfortunately, Moore's Law is getting closer and closer to a physical limit as transistors approach the size of molecules. My education in computer engineering has made a lot about computers less mysterious, but building transistors the size of molecules, let alone fitting billions of them on a chip the size of a quarter, still blows my mind.

To combat this inevitability, chip makers have turned to processors with multiple cores, as you may have noticed. This is the heart of parallelization. The problem with parallelization is reworking software to take full advantage of the new hardware. It seems simple enough to figure out which processes can run simultaneously, but the trouble arises when those processes need to access the same resources. There are different design possibilities, but I'll just give a simple example.

Imagine a bank account that you share with someone you trust. There is currently $100 in the account; you need $50, and your friend needs $60. Suppose you each visit an ATM at roughly the same time. You both see the account balance at $100, so you both decide to withdraw the amounts you need. Unfortunately, your friend is just a bit faster, and now you've overdrawn the account and are charged a fee as a result. If you had known your friend was about to withdraw $60, you wouldn't have tried to withdraw $50, and there would have been no overdraft fee. This example is not contrived; it was a real concern when self-service banking became widespread. With the advent of parallel processors, change the bank account to a block of memory, imagine millions of processes like these two, and you quickly see why this is a problem.
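Here's a minimal sketch of that race in code. I'll use Go purely as an illustration language; nothing about the problem is specific to it. Two goroutines each check the shared balance and then withdraw, and because the check and the deduction are separate steps, both can see $100 and both can proceed:

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

var balance = 100 // shared account balance, in dollars

// withdraw checks the balance, then deducts the amount.
// The gap between the check and the deduction is where the
// race lives: both callers can pass the check before either
// one has updated the balance.
func withdraw(name string, amount int, wg *sync.WaitGroup) {
	defer wg.Done()
	if balance >= amount { // both goroutines see $100 here
		time.Sleep(10 * time.Millisecond) // widen the race window for the demo
		balance -= amount                 // the slower writer overdraws
		fmt.Printf("%s withdrew $%d, balance is now $%d\n", name, amount, balance)
	} else {
		fmt.Printf("%s: insufficient funds for $%d\n", name, amount)
	}
}

func main() {
	var wg sync.WaitGroup
	wg.Add(2)
	go withdraw("you", 50, &wg)
	go withdraw("your friend", 60, &wg)
	wg.Wait()
	fmt.Println("final balance:", balance) // typically -10: overdrawn
}
```

Running this with Go's race detector (`go run -race`) will flag the unsynchronized access to balance, which is exactly the point.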

There are a variety of design choices to consider in solving this problem, but the most common one I've seen is the concept of a lock and key. I have actually seen this terminology with shared drives on Microsoft systems: a shared file is considered locked, and changes can only be made by the user who opened it first.

In code, a lock is a construct that reserves a block of memory (or any shared resource) for one parallel process at a time. This solves the problem of sharing, but the challenge to programmers is to use locks sparingly while keeping their programs running in parallel as much as possible. Typically, you want to keep a block of memory open as much as possible and lock it only when it is being written to. Thus, if processes are only reading the memory, everyone can read it at once.
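As a sketch of how a lock fixes the ATM example (again in Go; sync.RWMutex is Go's standard read/write lock and implements exactly the split described above), readers share the lock while a writer holds it exclusively, so the balance check and the deduction become one atomic step:

```go
package main

import (
	"fmt"
	"sync"
)

// Account guards its balance with a read/write lock: many
// goroutines may read the balance concurrently, but a
// withdrawal takes the lock exclusively.
type Account struct {
	mu      sync.RWMutex
	balance int
}

// Balance takes a read lock; concurrent readers don't block each other.
func (a *Account) Balance() int {
	a.mu.RLock()
	defer a.mu.RUnlock()
	return a.balance
}

// Withdraw takes the write lock, so the check and the deduction
// happen as one atomic step; no one else can slip in between
// them and overdraw the account.
func (a *Account) Withdraw(amount int) bool {
	a.mu.Lock()
	defer a.mu.Unlock()
	if a.balance < amount {
		return false
	}
	a.balance -= amount
	return true
}

func main() {
	acct := &Account{balance: 100}
	var wg sync.WaitGroup
	for _, amount := range []int{50, 60} {
		wg.Add(1)
		go func(amt int) {
			defer wg.Done()
			if acct.Withdraw(amt) {
				fmt.Printf("withdrew $%d\n", amt)
			} else {
				fmt.Printf("declined $%d: insufficient funds\n", amt)
			}
		}(amount)
	}
	wg.Wait()
	fmt.Println("final balance:", acct.Balance()) // never negative
}
```

The design point is that Withdraw does the check and the deduction while holding the lock, so the window that caused the overdraft in the first example no longer exists.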

Parallelization may not help a computer execute a single line of code any faster, but if that work can be split across multiple cores, then Moore's Law, in terms of raw processing power, can be maintained for the foreseeable future even as transistors approach the size of atoms. If you're serious about programming, this is a skill you'll have to learn.