We Are Bending the Heck Out of the Von Neumann Bottleneck
December 1, 2012. Posted by Peter Varhol in Architectures, Software platforms.
When I was taking graduate computer science classes, back in the late 1980s, we spent some time talking about SIMD and MIMD (Single Instruction Multiple Data and Multiple Instruction Multiple Data) computer architectures, with the inevitable caveat that all of this was largely theoretical, because of the Von Neumann Bottleneck. John Von Neumann, as students of computer science know, was a renowned mathematician who made contributions across a wide range of fields. In a nutshell, the Von Neumann Bottleneck is a consequence of the stored-program architecture: the bandwidth between the CPU and memory is very small in comparison with the amount of memory and storage available and ready for CPU use, so the processor spends much of its time waiting on that single channel.
I’ve recently returned from Supercomputing 2012, and I’m pleased to say that while we are not breaking the Von Neumann Bottleneck, new computing architectures are bending the heck out of it. You can argue that the principle of parallel processing addresses the bottleneck, and parallel processing is so mainstream in the world of supercomputing that it barely rates a mention.
Programmers are well aware that writing parallel code is difficult and error-prone; we simply don't naturally think of problems in terms of parallel solutions. But with multiple processors and cores, we end up with more busses between memory and processor (although it's certainly not a one-to-one relationship), so the bottleneck is at least widened.
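To make the point concrete, here is a minimal sketch of the data-parallel style the post alludes to: split a reduction across worker processes, then combine the partial results. The function names and chunking scheme are my own illustration, not anything from SET or another product mentioned here.

```python
# Illustrative data-parallel summation: each worker reduces its own slice
# independently, then the partial results are combined serially.
from multiprocessing import Pool

def partial_sum(chunk):
    # Independent work per process -- the "multiple data" half of the idea.
    return sum(chunk)

def parallel_sum(data, workers=4):
    # Split the input into one contiguous chunk per worker.
    size = max(1, len(data) // workers)
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    with Pool(workers) as pool:
        # Map the independent partial reductions, then reduce the partials.
        return sum(pool.map(partial_sum, chunks))

if __name__ == "__main__":
    print(parallel_sum(list(range(1000))))  # same answer as sum(range(1000))
```

Even a toy like this shows why parallel code is tricky: the decomposition, the chunk boundaries, and the final recombination are all places where an off-by-one or a shared-state bug can creep in, none of which exist in the one-line serial version.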
Because writing parallel code is so difficult, there are a growing number of tools that claim to provide an easy(ier) path to building parallelism into applications. One of the most interesting is Advanced Cluster Systems. It provides a software solution called SET that enables vendors and engineering groups with proprietary source code to easily parallelize that code. In some cases, if the application is constructed appropriately, source code may not even be required.
In addition to parallel processing, we can look to other places for moving more data, and more quickly, into the processors. One place is flash storage, which becomes virtual memory for an application, with only the working set loaded into main memory. FusionIO offered a partial solution to that bottleneck with a flash memory storage device that was software-configured to act as either storage or an extension of main memory, with separate busses into the processor space. The advantage here is that program instructions and data can be stored on these flash memory devices, which then have direct access to main memory and processor space. The single bus isn’t such a bottleneck any more.
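The idea of treating fast storage as an extension of main memory is roughly what memory-mapped files have long offered in miniature. The sketch below uses an ordinary temporary file as a stand-in for a flash device (it is not FusionIO's actual interface): the OS pages the file-backed region in and out on demand, so the program reads and writes it like RAM.

```python
# Rough illustration of storage-as-memory: map one page of a file into the
# process address space and access it with ordinary memory operations.
import mmap
import os
import tempfile

def mmap_roundtrip():
    # Create one page of file-backed storage (a stand-in for a flash device).
    fd, path = tempfile.mkstemp()
    os.write(fd, b"\x00" * mmap.PAGESIZE)
    with mmap.mmap(fd, mmap.PAGESIZE) as mem:
        mem[0:5] = b"hello"        # write through the mapping like memory
        data = bytes(mem[0:5])     # read back through the same mapping
    os.close(fd)
    os.remove(path)
    return data

print(mmap_roundtrip())  # prints b'hello'
```

The difference with a device like the one described above is that the "file" sits on flash with its own path into the processor space, so the working set can be paged in without contending for the same single memory bus.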
All of this doesn’t mean that we’ve moved beyond the Von Neumann architecture and corresponding bottleneck. But it does mean that we’ve found some interesting workarounds that help get the most out of today’s processors. And as fast as we think computers are today, they will be far faster in the future. We can only imagine how that will change our lives.