Scaling the unknown

Tzanio Kolev, center, director of the Center for Efficient Exascale Discretizations (CEED) at the Department of Energy’s Lawrence Livermore National Laboratory (LLNL), examines a simulation that benefits from the high-order methods CEED is developing, with LLNL colleagues Vladimir Tomov, left, and Veselin Dobrev. (Photo: George Kitrinos/LLNL.)

The next supercomputer frontier presents a journey into the unknown unlike any other, Tzanio Kolev says.

Exascale computers, the first of which are expected to begin operation in 2021, will perform a quintillion (a billion billion, or 10¹⁸) calculations per second, about eight times faster than Summit, the most powerful supercomputer in the world today. These machines will affect nearly every aspect of research and development, including climate modeling, combustion, additive manufacturing, subsurface flow, wind energy and the design and commercialization of small modular nuclear reactors.

But “nobody has ever built such machines,” says Kolev, director of the Center for Efficient Exascale Discretizations (CEED) at the Department of Energy’s (DOE) Lawrence Livermore National Laboratory (LLNL). “The exascale environment is so complex, it requires the combined efforts of application and software developers and hardware vendors to get there.”

The center is part of the DOE’s Exascale Computing Project (ECP), a coordinated effort to prepare for exascale’s arrival. CEED’s interdisciplinary process, known as co-design, is central to the ECP’s approach to ensure the exascale ecosystem it develops can meet the challenges a range of DOE applications present.

Moving to exascale supercomputing will require a different way of doing things. Moore’s Law, the observation that the number of transistors on a chip, and with it roughly the chip’s computing power, doubles about every two years, no longer holds.

“The only way to increase the amount of computation we can perform is by increasing parallelism,” in which computer processors work cooperatively to solve problems, Kolev says. “And not just increasing it but making it work on heterogeneous hardware having multiple levels of memory that is much more complex than we’re used to.”

CEED is one of five co-design centers in the Exascale Computing Project, a collaboration between DOE’s Office of Science and National Nuclear Security Administration. The centers facilitate cooperation between the ECP’s supercomputer vendors, application scientists and hardware and software specialists. The CEED collaboration alone encompasses more than 30 researchers at two DOE national laboratories – LLNL and Argonne – and five universities.

“We have a really amazing team,” Kolev says. “One of our achievements is how well we work together. That doesn’t always happen with such multi-institutional teams.”

Team members grapple with a multitude of tradeoffs involved in designing and producing next-generation supercomputer hardware and software – the heart of co-design. One compromise, for example, is between the amount of memory a computer will have and how fast that memory can be accessed. The size, speed and number of memory layers are dictated by the algorithms that will use them and how those codes are expected to perform on a given hardware configuration. At a higher level, exascale computers also must hold down fixed costs and energy consumption. “These machines can easily consume enormous amounts of power,” Kolev says.

CEED focuses on high-order finite-element discretization algorithms, a way of dividing up big problems that could increase performance by orders of magnitude over traditional methods. “High-order is really a family of mathematically sophisticated algorithms that have actually been around for a long time, but they’ve always been considered very expensive,” Kolev says, requiring too many floating-point operations (those involving numbers with fractions) to make them worthwhile.

“Well, now things have changed with the advances in computer architectures.” Memory access has become the main brake on computer speed. “The time-to-solution performance that you see is no longer limited by how many floating-point operations you perform in your application. What really matters is how much memory traffic you have, and so, all these floating-point operations that were criticized before actually become an advantage nowadays. They allow you to fully utilize the hardware.”

Low-order methods entail “a quick computation because there’s very little to compute, then you’re waiting and waiting to get the next data from memory,” Kolev explains. In contrast, high-order methods perform many computations on the data brought from memory, potentially achieving optimal performance and leading to fast, efficient and accurate simulations on modern hardware, such as multicore processors and general-purpose graphics processing units (GPUs).
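The tradeoff Kolev describes can be sketched with a roofline-style estimate, a standard simplified performance model that the article itself does not name. The sketch below assumes a kernel's speed is capped either by the processor's peak arithmetic rate or by how fast memory can feed it, whichever is lower. The machine figures and the flops-per-byte values for the two kernels are hypothetical, chosen only to illustrate the contrast, not measured from any CEED code.

```python
def attainable_gflops(intensity, peak_gflops, bandwidth_gbs):
    """Roofline estimate of attainable performance in GFLOP/s.

    intensity     -- floating-point operations performed per byte moved from memory
    peak_gflops   -- peak arithmetic rate of the processor (GFLOP/s)
    bandwidth_gbs -- sustained memory bandwidth (GB/s)
    """
    return min(peak_gflops, intensity * bandwidth_gbs)


# Hypothetical GPU-like machine (illustrative numbers only).
PEAK_GFLOPS = 7000.0
BANDWIDTH_GBS = 900.0

# Hypothetical kernels: a low-order method does little work per byte fetched,
# a high-order method does much more work on the same data.
kernels = {"low-order": 0.25, "high-order": 10.0}

for name, intensity in kernels.items():
    perf = attainable_gflops(intensity, PEAK_GFLOPS, BANDWIDTH_GBS)
    limited_by = "memory" if intensity * BANDWIDTH_GBS < PEAK_GFLOPS else "compute"
    share = 100.0 * perf / PEAK_GFLOPS
    print(f"{name}: {perf:.0f} GFLOP/s ({share:.0f}% of peak, limited by {limited_by})")
```

Under these assumed numbers the low-order kernel spends most of its time waiting on memory and reaches only a few percent of peak, while the high-order kernel has enough arithmetic per byte to keep the processor busy.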

Among CEED’s contributions is NekCEM/Nek5000, an open-source simulation software package that delivers highly accurate solutions for a spectrum of scientific applications, including electromagnetics, quantum optics, fluid flow, thermal convection, combustion and magnetohydrodynamics. The code received a 2016 R&D 100 Award from R&D magazine as one of the year’s top new technologies.

And Kolev’s team recently issued its CEED 1.0 software distribution, which bundles the 13 code packages the center is developing. Most of them connect to libCEED, the center’s new API, or application programming interface, library. The API allows programmers to easily access CEED’s high-performance algorithms from a variety of applications.

CEED researchers are enhancing the MFEM C++ software library, a main component of the center’s applications and finite-element focus areas, by improving algorithms for high-order finite element methods and developing optimized kernels for high-performance hardware, such as GPUs.

The co-design center also produced Laghos, the first-ever mini-application, or miniapp, for high-order compressible Lagrangian flow, a fluid-flow phenomenon relevant to LLNL and DOE research. The miniapp gives the CEED team a way to share concrete, simplified physics codes with vendors in a relatively controlled setting, garnering feedback to improve performance in the full application. Using such simplified proxies for large-scale codes is central to co-design, as it enables vendors, computer scientists and mathematicians to focus on a big application’s essential performance characteristics without diving into complicated physics.

The Lagrangian miniapp simulates the movement of fluids or gases at high speeds and under extreme pressures in a computational mesh that moves with the flow. Using high-order methods lets the Laghos meshes deform with the fluid, representing the development of complex interactions such as a vortex or a shear – an impossible task for low-order meshes. “With high-order mesh elements, the whole element could take a very complex shape, follow the flow, still be in the Lagrangian frame and not tangle, not self-intersect,” Kolev says. “This allows us to push these types of Lagrangian simulations much further than it was possible in the past.”
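Why a high-order element can bend with the flow while a low-order one cannot is easiest to see on a single element edge. The sketch below is an illustration only, not how Laghos or MFEM represent their meshes: it assumes an edge is described by Lagrange polynomial interpolation through equally spaced control points, and the function names and coordinates are hypothetical. With two points (order 1) the edge is always a straight line; adding a midpoint (order 2) lets the edge curve.

```python
def lagrange_basis(nodes, i, t):
    """Value at t of the i-th Lagrange polynomial defined on the reference nodes."""
    value = 1.0
    for j, tj in enumerate(nodes):
        if j != i:
            value *= (t - tj) / (nodes[i] - tj)
    return value


def edge_point(control_points, t):
    """Map a reference coordinate t in [0, 1] to a physical (x, y) point on the edge.

    control_points -- (x, y) positions of equally spaced nodes along the edge;
                      two points give a linear edge, three a quadratic one.
    """
    n = len(control_points)
    nodes = [k / (n - 1) for k in range(n)]
    weights = [lagrange_basis(nodes, i, t) for i in range(n)]
    x = sum(w * p[0] for w, p in zip(weights, control_points))
    y = sum(w * p[1] for w, p in zip(weights, control_points))
    return (x, y)


# Hypothetical edges: same endpoints, but the quadratic edge has a
# midpoint that the flow has pushed upward.
linear_edge = [(0.0, 0.0), (1.0, 0.0)]
quadratic_edge = [(0.0, 0.0), (0.5, 0.3), (1.0, 0.0)]

for t in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(t, edge_point(linear_edge, t), edge_point(quadratic_edge, t))
```

The linear edge stays on the line between its endpoints no matter how the flow moves, whereas the quadratic edge passes through its displaced midpoint and traces a curve. Higher orders add more interior points and allow correspondingly more complex shapes.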

Kolev expects the miniapp to benefit computer simulation of experiments at the LLNL-based National Ignition Facility, which is working to achieve nuclear fusion in the laboratory. Fusion, the process that powers the sun, could provide a revolutionary, nearly limitless energy source.

CEED’s work has already had a significant impact on several applications, including MARBL, an LLNL code to simulate high energy density physics at the exascale, says Robert Rieben, who leads the project. Such simulations support the Stockpile Stewardship Program, which maintains the safety, security and reliability of the U.S. nuclear deterrent without full-scale testing.

MARBL is built on the MFEM mathematical discretization framework, and Laghos serves as a proxy for one of MARBL’s components.

“CEED’s main contributions to the project are the performance optimizations it enables in the MFEM library, the development and maintenance of the Laghos miniapp, and algorithmic advancements in finite element methods that enable high-performance computing (being) integrated into the MARBL code,” Kolev says.