- Deixis Online - https://deixismagazine.org -

Laptop supercomputing

Logging onto a desktop or laptop computer entails a few mouse clicks. But gaining access to a supercomputer is akin to paying a lengthy visit to a Swiss bank before online banking became possible, says Sandia National Laboratories computer scientist Ronald Minnich.

“If you go to such a bank to deposit money or take money out, you fill out a form and you give it to somebody and then at some point later they come back and tell you how it went,” he says. “But you’re always at least one step away from your money. And that’s like the ‘batch processing’ model of supercomputing.


This is what it might look like if a user’s laptop and supercomputer had access to a common set of files. (Image courtesy of Ronald Minnich, Sandia National Laboratories.)

“What we’ve wanted to do is make it look like the supercomputer is directly attached to your laptop so that they appear to have a common set of files, and the laptop resources, such as the display, are directly available to the supercomputer. So it puts you, in a sense, directly in touch with the money.”

A distinguished technical staff member at Sandia’s Livermore, Calif., site, Minnich is principal investigator of a small research team developing novel software to give programmers what he calls “desktop extension” supercomputer access. His five-person collaboration has 1 million processing hours on the IBM Blue Gene/P supercomputer at Argonne National Laboratory near Chicago, awarded by the U.S. Department of Energy’s competitive INCITE (Innovative and Novel Computational Impact on Theory and Experiment) program.

Beating the batch is not easy in the supercomputing world. A fixture in the mainframe community since the 1950s, batch processing is a way of sharing time on highly prized computational resources by placing programs and data in a queue, processed in order, much as customers must stand in line at a bank waiting for an available teller. If there is a problem, a client goes to the back of the line and starts over.

As with documents sent to a shared office printer, supercomputer programs run from start to finish without stopping midstream for corrections or other adjustments. That contrasts with interactive processing, which lets programmers intervene on the fly.
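The batch model can be sketched as a simple first-in, first-out job queue. This is a toy illustration of the workflow the article describes, not real scheduler code; the job names and retry behavior are invented for the example:

```python
from collections import deque

# Toy FIFO batch queue: jobs run to completion in submission order,
# and a failed job goes to the back of the line to start over.
queue = deque()

def submit(job):
    """Place a (name, callable) pair at the end of the queue."""
    queue.append(job)

def run_all():
    """Run queued jobs in order; no mid-run intervention is possible."""
    results = []
    while queue:
        name, work = queue.popleft()
        try:
            results.append((name, work()))  # runs start to finish
        except Exception:
            submit((name, work))            # resubmit at the back of the line
            break                           # (a real scheduler would cap retries)
    return results

submit(("job-a", lambda: 2 + 2))
submit(("job-b", lambda: "done"))
print(run_all())  # [('job-a', 4), ('job-b', 'done')]
```

The point of the sketch is what is missing: there is no way to inspect or steer `work()` once it starts, which is exactly the limitation Minnich's team set out to remove.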

Years ago, all computer programmers had to write their programs out on sheets of paper that they then took to keypunch operators to convert into holes in cards. The punch cards were run through a reader, Minnich says, and eventually the programmers got back their results in an interoffice mailing envelope.

“It usually takes anywhere from five minutes to 10 hours before your program gets on the supercomputer,” Minnich says. “If your program started going off course in a direction you didn’t plan, with a batch you have to try and figure out afterwards what happened. And then you’ve got to sit there and resubmit that batch job.”

Although keypunchers have gone the way of telephone operators, programmers still submit their work to software queue managers that control the supercomputer. “You do a lot of work ahead of time. And in the end you surrender control.”

The batch system is like forcing a director who wants to alter a film to “go through the whole process of making the movie again to make a single change.” Minnich and his collaborators saw “value in being able to make supercomputer programs be interactive so you could observe what was going on and change them while they were running.”

The idea was to introduce desktop extension into Argonne’s Blue Gene/P using a version of a Bell Laboratories operating system called “Plan 9,” after the Ed Wood Jr. cult sci-fi movie “Plan 9 from Outer Space.”

The Plan 9 operating system anticipated network computing. The Blue Gene/P supercomputer’s current 40,000 nodes, each node with four cores, are networked in a three-dimensional torus that interconnects each node to six of its nearest neighbors for massively parallel operations at extremely high speeds. IBM says this system is designed to be scaled up to deliver more than 1 quadrillion floating point operations per second.

“That 3-D toroidal mesh greatly reduces the amount of time it takes to send a message from one place to another,” Minnich says. “If you send a message on a Blue Gene network, it’s going to get where it’s supposed to. That’s a guarantee.”
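On a 3-D torus, each node's six nearest neighbors are reached by stepping one hop forward or backward along each of the three axes, with coordinates wrapping around at the edges. A minimal sketch of that neighbor calculation (the 8x8x8 dimensions are illustrative, not Blue Gene/P's actual configuration):

```python
def torus_neighbors(x, y, z, dims):
    """Return the six nearest neighbors of node (x, y, z) on a 3-D torus.

    Coordinates wrap around in every dimension, so each node has exactly
    six neighbors no matter where it sits in the mesh.
    """
    X, Y, Z = dims
    return [
        ((x + 1) % X, y, z), ((x - 1) % X, y, z),
        (x, (y + 1) % Y, z), (x, (y - 1) % Y, z),
        (x, y, (z + 1) % Z), (x, y, (z - 1) % Z),
    ]

# Even a "corner" node has six neighbors, because the mesh wraps around.
print(torus_neighbors(0, 0, 0, (8, 8, 8)))
```

The wraparound is what keeps worst-case hop counts low: no node is ever more than half the torus away from any other along each axis.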

Plan 9, he adds, was designed as an improvement to the UNIX operating system. For desktop extension, a key Plan 9 feature is its ability to import files from other machines for sharing.
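Conceptually, importing files in Plan 9 means binding another machine's file tree at a path in the local namespace, after which local and remote files are accessed uniformly. A toy model of that idea, with invented paths and file contents (real Plan 9 does this over the 9P protocol, not with dictionaries):

```python
# Toy model of Plan 9-style namespace assembly: "importing" a remote
# tree binds its files under a mount point in the local namespace.
local_ns = {"/home/user/prog.c": "int main(void) { return 0; }"}
remote_fs = {"/results/run1.out": "42\n"}  # files living on the supercomputer

def import_tree(namespace, remote, mountpoint):
    """Bind every remote path under `mountpoint` in the local namespace."""
    for path, data in remote.items():
        namespace[mountpoint + path] = data
    return namespace

import_tree(local_ns, remote_fs, "/n/bluegene")

# The laptop now "sees" the supercomputer's files alongside its own.
print(sorted(local_ns))
```

This is the essence of desktop extension: once the trees are joined, a program need not care which machine a file actually lives on.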

Porting Plan 9 to Blue Gene/P and adapting the operating system to support desktop access was no picnic, Minnich says. “You need to write a compiler to compile your code to run on this new kind of computer. And you have to make sure it all works when it’s run on the big machine. Then you have to get it to run fast. To get it to run fast you have to measure it. Then you have to deal with all of the things that break.”

If the code works on one of the Blue Gene’s nodes, “it doesn’t mean it will work on 10 or 100 or 1,000 of them. So you have to deal with all the problems that come as you run it on more and more of the machine. You’ve got a new computer, and you’ve never run your software on it and you’ve got to build every piece of that software.”

One remaining hurdle will be simple resistance to novelty, he adds. “I think what we’ve already learned is that it’s very hard to get people to change the way they do things to such a large extent. But some people who like this idea a lot would like to use it.”

The computer world is rife with such uncertainty and unexpected shifts of interest. While previously at Los Alamos National Laboratory in New Mexico, Minnich directed another team that created Clustermatic, a software package that lets groups of PCs work together efficiently as high-performance clusters. Clustermatic earned an R&D 100 award in 2004, but that team has since moved on to companies like Google and Cray.

“Clusters have had a huge role for a decade,” Minnich says. “But now they have been almost pushed aside. High-end computing has gone away from clusters of PCs and moved to machines like Blue Gene.”
