Will GPUs Become Mainstream?
December 4, 2009 | Posted by Peter Varhol in Architectures, Software platforms, Software tools.
I’ve been closely following what Nvidia is doing to make the graphics processing unit (GPU) accessible to developers of general-purpose applications. In case you’re not familiar with it, Nvidia, and to a lesser extent AMD with its ATI product lines, have been designing and producing faster and faster graphics chips. Any serious gamer, illustrator, or design engineer knows which graphics cards are best for their applications.
But a funny thing happened in the drive to build better and faster graphics processors. These same processors became good at other types of processing, including to some extent general-purpose processing – that is, the ability to execute any application built for them.
But where they really excel is at the mathematics of graphics, which is dominated by floating point arithmetic. Floating point math also underlies most scientific and engineering computation, so any computation-intensive application can benefit. A 1U Nvidia unit with a quad-processor configuration can do 4 TeraFLOPS of single precision operations, and about 340 GigaFLOPS of double precision. Nvidia introduced its highly parallel Tesla systems over a year ago; the high-end configuration incorporates 960 processors and is priced at just under $10,000. That system is rated at 36 TeraFLOPS single precision, making it theoretically possible to tackle all but the most computationally intensive problems.
The new Fermi architecture, which should be available early next year, supports up to 512 processing cores working in parallel, programmed through the company’s CUDA (Compute Unified Device Architecture) parallel computing platform. Many systems using GPUs and CUDA pair them with a single industry-standard processor, usually running Windows or Linux. An application written for a GPU typically has a front end running on one of these operating systems. When a computation is required, the relevant data is passed off to executable code loaded onto the GPUs. When execution is complete, the results are returned to the CPU and displayed.
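That CPU-to-GPU round trip can be sketched in CUDA C roughly as follows. This is a minimal, illustrative example of my own (the kernel, array size, and names are not from any Nvidia sample): copy data to the device, launch a kernel across many threads, and copy the results back.

```cuda
#include <cuda_runtime.h>
#include <stdio.h>

// Device code: each GPU thread squares one element of the array.
__global__ void square(float *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        data[i] = data[i] * data[i];
}

int main(void)
{
    const int N = 1024;
    float host[N], *dev;

    for (int i = 0; i < N; ++i)
        host[i] = (float)i;

    // 1. Copy the relevant data from CPU memory to GPU memory.
    cudaMalloc((void **)&dev, N * sizeof(float));
    cudaMemcpy(dev, host, N * sizeof(float), cudaMemcpyHostToDevice);

    // 2. Launch the kernel: 256 threads per block, enough blocks to cover N.
    square<<<(N + 255) / 256, 256>>>(dev, N);

    // 3. Copy the results back to the CPU for display.
    cudaMemcpy(host, dev, N * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(dev);

    printf("host[3] = %f\n", host[3]);
    return 0;
}
```

The front end here is just a console program, but the pattern is the same when the host side is a full Windows or Linux application: the GPU never runs the application itself, only the computational kernels handed to it.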
Execution of floating point code is much, much faster than on the so-called industry-standard processors from Intel and AMD. But a couple of caveats are in order. First, of course, code built for Intel processors won’t run on a GPU. And even for C code, it’s not just a straight recompile; today’s Nvidia GPUs don’t support function pointers, for example.
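For instance, a dispatch table like the following is ordinary C on the CPU, but it has no direct equivalent in device code on current GPUs; the call through a pointer has to be rewritten as an explicit branch. The names here are illustrative, not from any real codebase:

```cuda
// Ordinary host-side C: dispatch through a function pointer.
typedef float (*op_fn)(float);

float host_apply(op_fn op, float x)
{
    return op(x);   // fine on the CPU, unsupported in today's device code
}

// Device-side rewrite: since the GPU cannot call through a function
// pointer, the dispatch becomes an explicit switch on an operation id.
enum op_id { OP_SQUARE, OP_NEGATE };

__device__ float device_apply(int op, float x)
{
    switch (op) {
    case OP_SQUARE: return x * x;
    case OP_NEGATE: return -x;
    default:        return x;
    }
}
```

Restructurings like this are mechanical but pervasive, which is part of why a port is never just a recompile.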
Second, writing parallel applications, or modifying existing applications to take advantage of parallel processing, is enormously difficult. While new techniques are being developed, and developers are acquiring new skills, this will remain the biggest obstacle to taking full advantage of GPUs.
Nvidia has an intriguing tool called Nexus that should go a long way toward helping software developers trace and debug application code from the CPU running on Windows into the GPU, including parallel code on the GPU, and back to the CPU.
A decade or so ago, several systems companies had their own processor designs – IBM with PowerPC, DEC with Alpha, HP with PA-RISC, and Sun with SPARC (to be fair, there are still other 32-bit processor architectures today, such as ARM and MIPS, but these are mostly targeted at smartphone and embedded uses). Today, because it has become enormously expensive to design and build faster and more complex processors, the vast majority of general-purpose computers use Intel and Intel-compatible x86 processors.
So it’s not a given that GPUs, despite significant performance advantages over industry-standard processors, will be widely adopted. Still, computing trends often occur in cycles, where a better implementation of an older model can succeed, so I wouldn’t write them off just yet. And I am excited about the prospect for new innovation in the use of processors that help applications run faster.
For more information on GPUs for general-purpose computing, visit GPGPU, a news aggregation site on the topic.