Exploring electrons

Example structures generated in Construction Zone, including (left) a multigrain core-shell oxide nanoparticle with strain-mediated grain alignment, (center) a heavily faceted gold nanoparticle on a carbon substrate decorated with molecular ligands, and (right) a series of gold nano-islands on a bilayer of molybdenum disulfide, an inorganic molybdenum-sulfur compound. (Image: Luis Rangel DaCosta.)

Materials scientist Luis Rangel DaCosta is applying machine learning methods to speed the analysis of high-resolution images. The work fits Rangel DaCosta’s quest to understand puzzling phenomena. “When things don’t make sense, I get — I wouldn’t say concerned — but unsettled,” he says. “I can’t put that sort of thing down.”

The Department of Energy Computational Science Graduate Fellowship recipient has worked in Mary Scott’s group at the University of California, Berkeley, where researchers use advanced and high-resolution electron microscopy to discover how various materials’ functions arise from their atomic-level structures. Transmission electron microscopy (TEM), a technique that probes materials with high-energy electrons, produces images and other structural data. Advanced TEM can create unprecedented images of a material’s atoms, lattice structures and defects, says Rangel DaCosta. “If you know where all the atoms are in a material, you know how it works.”

Since joining Scott’s group, he has worked to analyze such images using neural networks, models in which layers of nodes that mimic biological neurons process data to solve complex problems, including image recognition. He built databases of simulated high-resolution TEM nanoparticle images and used them to train a series of neural networks. The result: a classification of pixel intensity needed to distinguish individual atoms in experimental images.

The aim is to replace the analysis of TEM images by hand, a tedious, time-consuming and error-prone process. Rangel DaCosta’s approach to rapid analysis employs machine learning on the Perlmutter supercomputer at Lawrence Berkeley National Laboratory.

To use neural networks for machine learning, computer scientists must first train them, feeding them data that represent the problems they’re meant to solve. However, many neural networks designed to analyze TEM results from nanomaterials catch patterns that match training data precisely but fail to pick out slight variations of the same pattern type.

Rangel DaCosta worked with Berkeley Lab scientist Katherine Sytwu to understand why. They trained and validated neural networks on experimental TEM data on nanoparticles, using various approaches. Their analysis improved when they preprocessed output from different cameras into standardized pixel values. Such networks could analyze nanoparticles of different sizes and materials but faltered when samples were measured with varied settings, such as different magnification or electron dosage.
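To illustrate what standardizing pixel values can mean in practice, here is a minimal sketch. It assumes one common choice, rescaling each image to zero mean and unit variance, and is not necessarily the preprocessing the researchers used. The function name `standardize` and the two simulated cameras, with made-up gain and offset values, are hypothetical.

```python
import numpy as np

def standardize(image):
    """Rescale an image to zero mean and unit variance (an assumed scheme)."""
    image = np.asarray(image, dtype=np.float64)
    std = image.std()
    if std == 0:
        # A perfectly flat image carries no contrast to rescale.
        return np.zeros_like(image)
    return (image - image.mean()) / std

# Simulate one underlying scene recorded by two hypothetical cameras
# that differ only in gain (multiplier) and offset (baseline counts).
rng = np.random.default_rng(0)
scene = rng.random((64, 64))
camera_a = 1000.0 * scene + 50.0
camera_b = 12.0 * scene + 300.0

# The raw pixel values differ by orders of magnitude, but the
# standardized images agree to within floating-point error.
print(np.allclose(standardize(camera_a), standardize(camera_b)))
```

Under these assumptions, a network trained on output from one camera sees the same numerical range when given output from the other.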

One widely known reason networks fail to generalize is a shortage of training data. Researchers need data from multiple samples of the same material type with small defects in different positions.

So Rangel DaCosta and his colleagues wrote a software package, called Construction Zone, which generates multiple variations of realistic nanomaterial structures. “Mechanical engineering folks would call it a CAD system — like computer-aided design — but for atoms,” he says.

In practicums with Anouar Benali at Argonne National Laboratory, Rangel DaCosta simulated excited electrons, those whose energy and motion have been boosted by interacting with a photon or another electron. Rangel DaCosta wanted to model individual electrons, so he used a method called configuration interaction, which treats each electron’s interactions with the nucleus and nearby electrons as an isolated quantum system.

“The idea is to calculate all the possible ways individual electrons can interact with one another in the orbitals,” he says, a number that can easily reach billions or trillions. Narrowing the interactions down to 10 million to 20 million important ones, Rangel DaCosta still had a massive number of calculations to perform. “The kicker: It’s extremely expensive, comically expensive, compared to density functional theory,” another widely used modeling method that can pinpoint only average electron positions.
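To get a feel for how quickly that count grows, consider a rough sketch. It assumes a full configuration interaction expansion with no symmetry restrictions, in which the number of configurations is the number of ways to place the spin-up electrons in the available orbitals multiplied by the number of ways to place the spin-down electrons. The function name and the orbital and electron counts are illustrative choices, not figures from Rangel DaCosta’s calculations.

```python
from math import comb

def count_configurations(n_orbitals, n_up, n_down):
    """Count the ways to place spin-up and spin-down electrons in orbitals.

    Assumes every arrangement is allowed (no symmetry or spin filtering).
    """
    return comb(n_orbitals, n_up) * comb(n_orbitals, n_down)

# Hypothetical system sizes: (orbitals, spin-up electrons, spin-down electrons)
for n_orbitals, n_up, n_down in [(20, 5, 5), (24, 6, 6), (30, 8, 8)]:
    total = count_configurations(n_orbitals, n_up, n_down)
    print(f"{n_up + n_down} electrons in {n_orbitals} orbitals: {total:,}")
```

In this simplified count, 10 electrons in 20 orbitals already give roughly 240 million configurations, and modestly larger systems climb into the billions and then the trillions, which is why trimming the list to the most important interactions matters.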

In his first practicum, he calculated how molecules and crystals responded when they lost or gained an electron. Such calculations are relevant to experiments at the Advanced Photon Source at Argonne or the Advanced Light Source at Berkeley Lab.

In his second practicum, Rangel DaCosta rewrote a code to scale it for a high-performance machine such as Perlmutter. “So it was furious coding for a couple months,” he says. “I was basically putting together all the things I learned in my last practicum and in my program of study, a big culmination of the fellowship.”

As he wraps up his Ph.D., Rangel DaCosta is weighing whether to run a lab of his own or pursue the hands-on rewards of writing code. It’s not the first time that he’s felt torn between two paths. As a University of Michigan undergraduate, he juggled a double major in music performance and materials science and engineering, partly by thinking about science while he practiced the trombone. Does he still ponder science while he plays? “If I could avoid it, I would!”