Musings on a Good Parallel Computer

Mar 23, 2025 - 09:24

Until the late 1990s, 3D accelerator cards were generally associated with high-end workstations. Video games and the like ran happily on the CPU in one’s desktop system, with later extensions like MMX, 3DNow!, and SSE providing a significant performance boost for games that supported them. As 3D accelerator cards (colloquially called graphics processing units, or GPUs) became prevalent, they took over almost all SIMD vector tasks, but one thing they are not good at is being a general parallel computer. While [Raph Levien] was working on a software project, this really ticked him off and inspired him to write up his grievances.

Although the interaction between CPUs and GPUs has become tighter over the decades, with PCIe in particular being a big improvement over AGP & PCI, GPUs are still terrible at running arbitrary computing tasks, and PCIe links are still glacial compared to communication within the GPU & CPU dies. With the introduction of asynchronous graphics APIs, this divide widened further. The proposal is thus to invert this relationship.

There’s precedent for this already, with Intel’s Larrabee and IBM’s Cell processor merging CPU and GPU characteristics on a single die, though in both cases developers struggled to write software for such a new kind of architecture. Sony’s PlayStation 3 was forced to add a GPU due to these issues. There is also the DirectStorage API in DirectX, which bypasses the CPU when loading assets from storage, effectively adding CPU features to GPUs.

As [Raph] notes, so-called AI accelerators also have these characteristics, often with multiple SIMD-capable, CPU-like cores. Maybe the future is Cell after all.