Welcome to San Diego and the 2009 SciDAC conference. Over the next four days, I would like to present an assessment of the SciDAC program. We will look at where we've been, how we got to where we are and where we are going in the future.
Our vision is to be first in computational science, to be best in class in modeling and simulation. When Ray Orbach asked me what I would do, in my job interview for the SciDAC Director position, I said we would achieve that vision. And with our collective dedicated efforts, we have managed to achieve this vision.
In the last year, we have now the most powerful supercomputer for open science, Jaguar, the Cray XT system at the Oak Ridge Leadership Computing Facility (OLCF). We also have NERSC, probably the best-in-the-world program for productivity in science that the Office of Science so depends on. And the Argonne Leadership Computing Facility offers architectural diversity with its IBM Blue Gene/P system as a counterbalance to Oak Ridge.
There is also ESnet, which is often understated—the 40 gigabit per second dual backbone ring that connects all the labs and many DOE sites. In the President's Recovery Act funding, there is exciting news that ESnet is going to build out to a 100 gigabit per second network using new optical technologies. This is very exciting news for simulations and large-scale scientific facilities.
But as one noted SciDAC luminary said, it's not all about the computers—it's also about the science—and we are also achieving our vision in this area. Together with having the fastest supercomputer for science, at the SC08 conference, SciDAC researchers won two ACM Gordon Bell Prizes for the outstanding performance of their applications. The DCA++ code, which solves some very interesting problems in materials, achieved a sustained performance of 1.3 petaflops, an astounding result and a mark I suspect will last for some time. The LS3DF application for studying nanomaterials also required the development of a new and novel algorithm to produce results up to 400 times faster than a similar application, and was recognized with a prize for algorithm innovation—a remarkable achievement.
Day one of our conference will include examples of petascale science enabled at the OLCF. Although Jaguar has not been officially commissioned, it has gone through its acceptance tests, and during its shakedown phase there have been pioneer applications used for the acceptance tests, and they are running at scale. These include applications in the areas of astrophysics, biology, chemistry, combustion, fusion, geosciences, materials science, nuclear energy and nuclear physics.
We also have a whole compendium of science we do at our facilities; these have been documented and reviewed at our last SciDAC conference. Many of these were highlighted in our Breakthroughs Report. One session at this week's conference will feature a cross-section of these breakthroughs. In the area of scalable electromagnetic simulations, the Auxiliary-space Maxwell Solver (AMS) uses specialized finite element discretizations and multigrid-based techniques, which decompose the original problem into easier-to-solve subproblems. Congratulations to the mathematicians on this. Another application on the list of breakthroughs was the authentication of PETSc, which provides scalable solvers used in many DOE applications and has solved problems with over 3 billion unknowns and scaled to over 16,000 processors on DOE leadership-class computers. This is becoming a very versatile and useful toolkit to achieve performance at scale.
With the announcement of SIAM's first class of Fellows, we are remarkably well represented. Of the group of 191, more than 40 of these Fellows are in the 'DOE space.' We are so delighted that SIAM has recognized them for their many achievements.
In the coming months, we will illustrate our leadership in applied math and computer science by looking at our contributions in the areas of programming models, development and performance tools, math libraries, system software, collaboration, and visualization and data analytics. This is a large and diverse list of libraries. We have asked for two panels, one chaired by David Keyes and composed of many of the nation's leading mathematicians, to produce a report on the most significant accomplishments in applied mathematics over the last eight years, taking us back to the start of the SciDAC program. In addition, we have a similar panel in computer science to be chaired by Kathy Yelick. They are going to identify the computer science accomplishments of the past eight years. These accomplishments are difficult to get a handle on, and I'm looking forward to this report.
We will also have a follow-on to our report on breakthroughs in computational science and this will also go back eight years, looking at the many accomplishments under the SciDAC and INCITE programs. This will be chaired by Tony Mezzacappa.
So, where are we going in the SciDAC program? It might help to take a look at computational science and how it got started. I go back to Ken Wilson, who made the model and has written on computational science and computational science education. His model was thus: The computational scientist plays the role of the experimentalist, and the math and CS researchers play the role of theorists, and the computers themselves are the experimental apparatus. And that in simulation science, we are carrying out numerical experiments as to the nature of physical and biological sciences.
Peter Lax, in the same time frame, developed a report on large-scale computing in science and engineering. Peter remarked, 'Perhaps the most important applications of scientific computing come not in the solution of old problems, but in the discovery of new phenomena through numerical experimentation.'
And in the early years, I think the person who provided the most guidance, the most innovation and the most vision for where the future might lie was Ed Oliver. Ed Oliver died last year. Ed did a number of things in science. He had this personality where he knew exactly what to do, but he preferred to stay out of the limelight so that others could enjoy the fruits of his vision. We in the SciDAC program and ASCR Facilities are still enjoying the benefits of his vision. We will miss him.
Twenty years after Ken Wilson, Ray Orbach laid out the fundamental premise for SciDAC in an interview that appeared in SciDAC Review:
'SciDAC is unique in the world. There isn't any other program like it anywhere else, and it has the remarkable ability to do science by bringing together physical scientists, mathematicians, applied mathematicians, and computer scientists who recognize that computation is not something you do at the end, but rather it needs to be built into the solution of the very problem that one is addressing. '
As you look at the Lax report from 1982, it talks about how 'Future significant improvements may have to come from architectures embodying parallel processing elements—perhaps several thousands of processors.' And it continues, 'esearch in languages, algorithms and numerical analysis will be crucial in learning to exploit these new architectures fully.'
In the early '90s, Sterling, Messina and Smith developed a workshop report on petascale computing and concluded, 'A petaflops computer system will be feasible in two decades, or less, and rely in part on the continual advancement of the semiconductor industry both in speed enhancement and cost reduction through improved fabrication processes.' So they were not wrong, and today we are embarking on a forward look that is at a different scale, the exascale, going to 1018 flops.
In 2007, Stevens, Simon and Zacharia chaired a series of town hall meetings looking at exascale computing, and in their report wrote,
'Exascale computer systems are expected to be technologically feasible within the next 15 years, or perhaps sooner. These systems will push the envelope in a number of important technologies: processor architecture, scale of multicore integration, power management and packaging.'
The concept of computing on the Jaguar computer involves hundreds of thousands of cores, as do the IBM systems that are currently out there. So the scale of computing with systems with billions of processors is staggering to me, and I don't know how the software and math folks feel about it.
We have now embarked on a road toward extreme scale computing. We have created a series of town hall meetings and we are now in the process of holding workshops that address what I call within the DOE speak 'the mission need,' or what is the scientific justification for computing at that scale. We are going to have a total of 13 workshops. The workshops on climate, high energy physics, nuclear physics, fusion, and nuclear energy have been held. The report from the workshop on climate is actually out and available, and the other reports are being completed. The upcoming workshops are on biology, materials, and chemistry; and workshops that engage science for nuclear security are a partnership between NNSA and ASCR. There are additional workshops on applied math, computer science, and architecture that are needed for computing at the exascale. These extreme scale workshops will provide the foundation in our office, the Office of Science, the NNSA and DOE, and we will engage the National Science Foundation and the Department of Defense as partners.
We envision a 10-year program for an exascale initiative. It will be an integrated R&D program initially—you can think about five years for research and development—that would be in hardware, operating systems, file systems, networking and so on, as well as software for applications. Application software and the operating system and the hardware all need to be bundled in this period so that at the end the system will execute the science applications at scale. We also believe that this process will have to have considerable investment from the manufacturers and vendors to be successful.
We have formed laboratory, university and industry working groups to start this process and formed a panel to look at where SciDAC needs to go to compute at the extreme scale, and we have formed an executive committee within the Office of Science and the NNSA to focus on these activities. We will have outreach to DoD in the next few months. We are anticipating a solicitation within the next two years in which we will compete this bundled R&D process.
We don't know how we will incorporate SciDAC into extreme scale computing, but we do know there will be many challenges. And as we have shown over the years, we have the expertise and determination to surmount these challenges.