
Cerebras Unveils 2nd Gen Wafer Scale Engine: 850,000 Cores, 2.6 Trillion Transistors

Cerebras has unveiled its second-generation wafer-scale engine, or WSE-2. The new wafer is etched using 7nm lithography, with 850,000 cores, 2.6 trillion transistors, and 40GB of onboard SRAM.
By Joel Hruska

Cerebras is back with the second generation of its Wafer Scale Engine. The WSE-2 -- sadly, the name "Son of Wafer-Scale" appears to have died in committee -- is a 7nm shrink and scale-up of the original, with far more cores, more RAM, and 2.6 trillion transistors, with a "T." That makes the 54 billion on your average Nvidia A100 look a bit pedestrian, for a certain value of "pedestrian."

The concept of a wafer-scale engine is simple: Instead of etching dozens or hundreds of chips into a wafer and then packaging those CPUs or GPUs for individual resale, why not use an entire wafer (or most of a wafer, in this case) for one enormous processor?

People have tried this trick before, with no success, but that was before modern yields improved to the point where building 850,000 cores on a piece of silicon the size of a cutting board was a reasonable idea. Last year, the Cerebras WSE-1 raised eyebrows by offering 400,000 cores, 18GB of on-chip memory, and 9PB/s of memory bandwidth, with 100Pb/s of fabric bandwidth across the wafer. Today, the WSE-2 offers 850,000 cores, 40GB of on-chip SRAM memory, and 20PB/s of on-wafer memory bandwidth. Total fabric bandwidth has increased to 220Pb/s.

While the new WSE-2 is certainly bigger, there's not much sign it's different. The top-line stat improvements are all impressive, but the gains are commensurate across the board, which is to say: A 2.12x increase in core count is matched by a 2.2x increase in RAM, a 2.2x increase in memory bandwidth, and a 2.2x increase in fabric bandwidth. The actual amount of RAM, RAM bandwidth, or fabric bandwidth, evaluated on a per-core basis, is virtually identical between the two WSEs.
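The back-of-the-envelope arithmetic behind that observation can be sketched in a few lines of Python, using only the figures quoted above (the dictionary keys here are our own labels, not Cerebras terminology):

```python
# Generation-over-generation scaling check, using the specs quoted in the article.
wse1 = {"cores": 400_000, "sram_gb": 18, "mem_bw_pbs": 9, "fabric_bw_pbs": 100}
wse2 = {"cores": 850_000, "sram_gb": 40, "mem_bw_pbs": 20, "fabric_bw_pbs": 220}

# Every top-line spec scales by roughly the same ~2.1-2.2x factor.
for key in wse1:
    print(f"{key}: {wse2[key] / wse1[key]:.2f}x")

# Which means per-core resources are nearly unchanged between generations:
sram_per_core_1 = wse1["sram_gb"] * 1e9 / wse1["cores"]  # ~45,000 bytes/core
sram_per_core_2 = wse2["sram_gb"] * 1e9 / wse2["cores"]  # ~47,000 bytes/core
```

Running the loop shows every ratio landing between about 2.1x and 2.2x, and the on-chip SRAM per core works out to roughly 45-47KB on both wafers, which is the "virtually identical" per-core balance described above.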

Normally, with a second-generation design like this, we'd expect the company to make some resource allocation changes or to scale out some specific aspect of the design, such as adjusting the ratios between core counts, memory bandwidth, and total RAM. The fact that Cerebras scaled the WSE-1 upwards into the WSE-2 without touching those ratios implies the company targeted its initial hardware well, and was able to grow it to meet the desires of its customer base without compromising or changing other aspects of the WSE architecture.

One of Cerebras' arguments in favor of its own designs is the simplicity of scaling a workload across a single WSE, rather than attempting to scale across the dozens or hundreds of GPUs that might be required to match its performance. It isn't clear how easy it is to adapt workloads to the WSE-1 or WSE-2, and there don't seem to be a lot of independent benchmarks available yet to compare scaling between the WSE-1 or WSE-2 and equivalent Nvidia cards. We would expect the WSE-2 to have the advantage in scaling, assuming the relevant workload fits the characteristics of both systems equally, due to the intrinsic difficulty of splitting a workload efficiently across an ever-larger number of accelerator cards.

Cerebras doesn't appear to have published any benchmarks comparing the WSE-1 or WSE-2 against other systems, so we're still in a holding pattern as far as that kind of data goes. Moving from the WSE-1 to the WSE-2 this quickly, however, does imply some customer interest in the chip.
