IBM Corp has landed two huge contracts with the U.S. government’s Department of Energy to build the two most powerful parallel supercomputers in the world.
The first machine, dubbed ASCI Purple, is the last of the supercomputers that the DOE is sponsoring under its Accelerated Strategic Computing Initiative and will result in IBM building a 100 teraflops parallel supercomputer from its pSeries Unix servers and Federation switch.
The second machine is Blue Gene/L, a 360 teraflops knock-off of the Blue Gene specialized supercomputer that IBM quietly announced it was building in November 2001. The ASCI Purple and Blue Gene/L machines together cost $290m.
Like ASCI Blue Pacific and ASCI White before them, both machines will be housed at the Lawrence Livermore National Laboratory in California. ASCI Purple will be used by DOE researchers to simulate the detonation of aging nuclear weapons. The ASCI Blue Pacific and ASCI White machines were also based on IBM Unix servers, in their case RS/6000 SP PowerParallel servers connected together by IBM’s high-speed switches. The ASCI White supercomputer, delivered in August 2001, comprises 8,192 of IBM’s 64-bit 375MHz Power3 RISC processors and is capable of delivering a peak 12.2 teraflops of processing power and about 7.2 teraflops of sustained performance. IBM’s ASCI Blue Pacific machine, delivered in October 1998, was built from the RS/6000 SPs as well, but based on 5,808 of IBM’s 32-bit PowerPC 604e processors. ASCI Blue Pacific had a peak rating of 3.9 teraflops, but tested out at about 2.1 teraflops in the Top 500 supercomputer Linpack ratings.
The first ASCI machine was ASCI Red, which was built for Sandia National Laboratory by Intel using 9,632 of its Pentium processors and which is currently rated at 2.4 teraflops (3.2 teraflops peak). Other ASCI machines include ASCI Blue Mountain, a massively parallel super built by SGI using 6,144 processors for Los Alamos National Laboratory and delivering 1.6 teraflops (3.1 teraflops peak) of processing capacity, and the two ASCI-Q supers at Los Alamos that each have 4,096 1.25GHz Alpha processors and deliver 7.7 teraflops (10.2 teraflops peak).
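The gap between peak and delivered performance on these machines can be put side by side with a little arithmetic. The following is a minimal sketch that uses only the teraflops figures quoted above; the `MACHINES` table and the `efficiency` helper are illustrative names, and treating the lower number for each machine as its sustained rating is an assumption.

```python
# (sustained, peak) teraflops as quoted in the article. Treating the lower
# figure as a sustained rating is an assumption for illustration.
MACHINES = {
    "ASCI Red": (2.4, 3.2),
    "ASCI Blue Pacific": (2.1, 3.9),
    "ASCI Blue Mountain": (1.6, 3.1),
    "ASCI White": (7.2, 12.2),
    "ASCI Q (each)": (7.7, 10.2),
}


def efficiency(sustained_tf, peak_tf):
    """Return the fraction of peak performance that is actually delivered."""
    return sustained_tf / peak_tf


for name, (sustained, peak) in MACHINES.items():
    print(f"{name:20s} {efficiency(sustained, peak):.0%} of peak")
```

On these numbers, ASCI Red and the ASCI Q machines deliver roughly three quarters of their peak rating, while the other three land between about one half and three fifths.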
ASCI Purple was apparently given that name because when you mix red, white and blue – the colors of the American flag and the colors of the prior ASCI machines (provided you think of ASCI Q as a Big Red Q for Compaq) – you get purple. The whole point of the ASCI program was to foster the creation, by indigenous U.S. server suppliers, of supercomputing technologies capable of simulating the detonation of an old nuclear missile, which has parts that are very old and which may not perform as originally specified. The U.S. government doesn’t want to perform underground nuclear testing to verify that its missiles still work, and it doesn’t want to simulate weapons on Japanese supercomputers, which a decade ago seemed like the only practical option unless U.S. vendors got back in the game. While the ASCI machines certainly can be used to test old nuclear weapons, they can also probably be used for something else: testing designs of new nuclear weapons. No one wants to talk about that, for obvious political reasons. No one is talking about who else was in on the bidding, either, for similar political reasons, but it is hard to believe that Cray wasn’t in on the bidding with its X1 machines, that HP didn’t pitch an Alpha or Itanium MPP, or that SGI didn’t throw its hat in the ring with future Origin machines.
ASCI Purple will be built from IBM’s future Armada servers, powered by Power5 and Power5+ chips. But IBM is being vague about the actual configuration, since it relates to future, unannounced commercial and technical server products.
IBM said that ASCI Purple will comprise 196 servers and a total of 12,544 processors. Dividing one by the other gives 64 processors per server, which implies that IBM’s biggest Armada servers will be 64-way machines, compared to the 32-way Regattas it sells today as the pSeries 690 and the iSeries Model 890. But not so fast on the assumptions.
Power5 is expected to hit a peak of around 2GHz in early 2004 and will eventually hit 3GHz with the Power5+ in early 2005 or so, maybe even in late 2004 for big customers like the DOE. The Power5 and Power5+ chips will have two cores and will also include electronics to support simultaneous multithreading (SMT), like Intel’s Xeon processors do with their HyperThreading support.
A regular Power4 or Power4+ chip (which has two whole cores) can do eight floating point operations per clock cycle at peak. Running at 3GHz, a 32-way Armada server using the Power5+ chip (16 dual-core chips) would have a raw peak 384 gigaflops of processing power, and 196 of them would have 75 teraflops of power. If SMT boosted throughput by 25%, that would come to about 94 teraflops, and this is how IBM would hit roughly 100 teraflops. This machine would have 3,136 physical Power5+ chips, 6,272 Power cores, and would present 12,544 processor images to AIX 5L for processing. IBM could similarly build ASCI Purple using 2GHz Power5 processors, ignoring the SMT, and put 32 chips in a single Armada frame (for a total of 64 processors in a single system image). That would yield 512 gigaflops per Armada server of peak processing power and 100 teraflops of processing power across 196 servers. So here’s the question: When IBM implies 64-way Armadas, is it talking actual cores or virtual cores through SMT?
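The two candidate configurations can be checked with a short calculation. This is a rough sketch built only from the figures in the text, not from any IBM specification: it assumes Power5 keeps the Power4 rate of eight floating point operations per clock per dual-core chip, and the function and scenario names are made up for illustration.

```python
# Assumption: Power5 matches Power4 at 8 flops per clock per dual-core chip.
FLOPS_PER_CLOCK_PER_CHIP = 8
CORES_PER_CHIP = 2
SERVERS = 196  # server count IBM gave for ASCI Purple


def server_peak_gflops(chips, ghz):
    """Peak gigaflops for one server with the given chip count and clock."""
    return chips * FLOPS_PER_CLOCK_PER_CHIP * ghz


def system_peak_tflops(chips_per_server, ghz, smt_boost=0.0):
    """Peak teraflops across all servers, with an optional SMT uplift."""
    gflops = SERVERS * server_peak_gflops(chips_per_server, ghz)
    return gflops * (1 + smt_boost) / 1000


def processor_images(chips_per_server, threads_per_core):
    """Number of processors the operating system would see."""
    return SERVERS * chips_per_server * CORES_PER_CHIP * threads_per_core


# Scenario A: 32-way servers (16 chips) at 3GHz, two SMT threads per core,
# with the article's hypothetical 25% SMT throughput gain.
a_tflops = system_peak_tflops(16, 3.0, smt_boost=0.25)
a_images = processor_images(16, threads_per_core=2)

# Scenario B: 64-way servers (32 chips) at 2GHz, SMT ignored.
b_tflops = system_peak_tflops(32, 2.0)
b_images = processor_images(32, threads_per_core=1)

print(f"A: {a_tflops:.1f} teraflops, {a_images} processor images")
print(f"B: {b_tflops:.1f} teraflops, {b_images} processor images")
```

Both scenarios present 12,544 processors to the operating system and both land near 100 teraflops (about 94 for the first and about 100 for the second), which is why the announced numbers alone cannot settle the question.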
Exactly which configuration IBM takes with ASCI Purple is unclear, but what is known is that ASCI Purple will comprise 50 terabytes of main memory and will have 2 petabytes of disk storage. The whole shebang will eat 4.7 megawatts of power (enough to power 4,000 homes) and take up the space of two basketball courts. ASCI White, with one-eighth the processing power, 6.2 terabytes of memory, and 160 terabytes of disk capacity, consumed about 1.2 megawatts. IBM’s teraflops/megawatts ratio is improving.
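To put a number on that improvement, here is a back-of-the-envelope sketch using the peak ratings and power draws quoted in this article (12.2 teraflops and 1.2 megawatts for ASCI White, 100 teraflops and 4.7 megawatts for ASCI Purple). The helper name is illustrative, and peak rather than sustained teraflops is used because no sustained figure exists yet for ASCI Purple.

```python
def tflops_per_megawatt(peak_tflops, megawatts):
    """Peak teraflops delivered per megawatt of power consumed."""
    return peak_tflops / megawatts


# Figures as quoted in the article (peak ratings, approximate power draw).
white = tflops_per_megawatt(12.2, 1.2)
purple = tflops_per_megawatt(100.0, 4.7)

print(f"ASCI White:  {white:.1f} teraflops per megawatt")
print(f"ASCI Purple: {purple:.1f} teraflops per megawatt")
print(f"Improvement: {purple / white:.1f}x")
```

On these figures, ASCI Purple delivers roughly twice the peak teraflops per megawatt of ASCI White.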
While Livermore plays with ASCI Purple for the DOE’s weapons programs, Blue Gene/L will be available for the other two ASCI research centers – Los Alamos and Sandia – and their affiliates to do climate modeling, galaxy simulations, seismic analysis, oil exploration, and other research activities. The basic design for Blue Gene/L, which will run a stripped-down Linux kernel and GNU Fortran and C, calls for two stripped-down PowerPC 440 cores, each rated at 2.8 gigaflops (which implies a 700MHz clock speed if this PowerPC core, like the Power4 cores, does four floating point operations per clock), to be embedded on a single chip. These chips will each have 2GB of memory, and eight of them will be packaged on a single board with a total of 16GB of memory. IBM says it can pack 128 boards in a rack, and 64 of those racks will deliver a peak 360 teraflops using 65,536 processors and 128 terabytes of main memory. However, on some HPC applications, one of the two processors in the Blue Gene/L chip will be dedicated to message passing between nodes, and the effective peak power will drop to around 180 teraflops.
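The Blue Gene/L figures can be rebuilt from the packaging numbers above. This is only a sketch from the quoted figures, with illustrative variable names; it assumes the 65,536 "processors" are dual-core chips and that a PowerPC 440 core does four floating point operations per clock, as a Power4 core does.

```python
# Packaging figures as quoted in the article.
CORES_PER_CHIP = 2
GFLOPS_PER_CORE = 2.8
GB_PER_CHIP = 2
CHIPS_PER_BOARD = 8
BOARDS_PER_RACK = 128
RACKS = 64

# Assumption: four flops per clock per core, as on a Power4 core.
FLOPS_PER_CLOCK_PER_CORE = 4

chips = CHIPS_PER_BOARD * BOARDS_PER_RACK * RACKS
peak_tflops = chips * CORES_PER_CHIP * GFLOPS_PER_CORE / 1000
memory_tb = chips * GB_PER_CHIP / 1024

# One core per chip handed over to message passing halves the compute peak.
comm_mode_tflops = chips * (CORES_PER_CHIP - 1) * GFLOPS_PER_CORE / 1000

implied_clock_mhz = GFLOPS_PER_CORE / FLOPS_PER_CLOCK_PER_CORE * 1000

print(f"chips: {chips}")
print(f"peak: {peak_tflops:.0f} teraflops")
print(f"memory: {memory_tb:.0f} terabytes")
print(f"peak with one core on message passing: {comm_mode_tflops:.0f} teraflops")
print(f"implied clock: {implied_clock_mhz:.0f}MHz")
```

The calculation gives about 367 teraflops at peak and about 184 teraflops with one core per chip handling communication, a little above the rounded 360 and 180 teraflops that IBM quotes; the chip count and memory total match the stated 65,536 and 128 terabytes.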