Table of contents

Volume 46

2006


SciDAC 2006, SCIENTIFIC DISCOVERY THROUGH ADVANCED COMPUTING 25–29 June 2006, Denver, Colorado, USA

Published online: 20 September 2006

PREFACE

E01
The following article is Open access

The second annual Scientific Discovery through Advanced Computing (SciDAC) Conference was held from June 25–29, 2006 at the new Hyatt Regency Hotel in Denver, Colorado. This conference showcased outstanding SciDAC-sponsored computational science results achieved during the past year across many scientific domains, with an emphasis on science at scale. Exciting computational science that has been accomplished outside of the SciDAC program both nationally and internationally was also featured to help foster communication between SciDAC computational scientists and those funded by other agencies. The program featured many compelling examples of how domain scientists collaborated productively with applied mathematicians and computer scientists to take full advantage of terascale computers (capable of performing trillions of calculations per second), not only accelerating progress in scientific discovery across a variety of fields but also showing great promise for exploiting the petascale capabilities expected in the near future.

The SciDAC program was originally conceived as an interdisciplinary computational science program based on the guiding principle that strong collaborative alliances between domain scientists, applied mathematicians, and computer scientists are vital to accelerated progress and associated discovery on the world's most challenging scientific problems. Associated verification and validation are essential in this successful program, which was funded by the US Department of Energy Office of Science (DOE OS) five years ago. As is made clear in many of the papers in these proceedings, SciDAC has fundamentally changed the way that computational science is now carried out in response to the exciting challenge of making the best use of the rapid progress in the emergence of more and more powerful computational platforms. In this regard, Dr. Raymond Orbach, Energy Undersecretary for Science at the DOE and Director of the OS has stated: `SciDAC has strengthened the role of high-end computing in furthering science. It is defining whole new fields for discovery.' (SciDAC Review, Spring 2006, p8).

Application domains within the SciDAC 2006 conference agenda encompassed a broad range of science including: (i) the DOE core mission of energy research involving combustion studies relevant to fuel efficiency and pollution issues faced today and magnetic fusion investigations impacting prospects for future energy sources; (ii) fundamental explorations into the building blocks of matter, ranging from quantum chromodynamics – the basic theory that describes how quarks make up the protons and neutrons of all matter – to the design of modern high-energy accelerators; (iii) the formidable challenges of predicting and controlling the behavior of molecules in quantum chemistry and the complex biomolecules determining the evolution of biological systems; (iv) studies of exploding stars for insights into the nature of the universe; and (v) integrated climate modeling to enable realistic analysis of Earth's changing climate. Associated research has made it quite clear that advanced computation is often the only means by which timely progress is feasible when dealing with these complex, multi-component physical, chemical, and biological systems operating over huge ranges of temporal and spatial scales. Working with the domain scientists, applied mathematicians and computer scientists have continued to develop the discretizations of the underlying equations and the complementary algorithms to enable improvements in solutions on modern parallel computing platforms as they evolve from the terascale toward the petascale regime. Moreover, the associated tremendous growth of data generated from the terabyte to the petabyte range demands not only advanced data analysis and visualization methods to harvest the scientific information but also the development of efficient workflow strategies which can deal with the data input/output, management, movement, and storage challenges. If scientific discovery is to keep pace with the continuing progression from tera- to petascale platforms, the vital alliance between domain scientists, applied mathematicians, and computer scientists will be even more crucial. During the SciDAC 2006 Conference, some of the future challenges and opportunities in interdisciplinary computational science were emphasized in the Advanced Architectures Panel and by Dr. Victor Reis, Senior Advisor to the Secretary of Energy, who gave a featured presentation on 'Simulation, Computation, and the Global Nuclear Energy Partnership.'

Overall, the conference provided an excellent opportunity to highlight the rising importance of computational science in the scientific enterprise and to motivate future investment in this area. As Michael Strayer, SciDAC Program Director, has noted: `While SciDAC may have started out as a specific program, Scientific Discovery through Advanced Computing has become a powerful concept for addressing some of the biggest challenges facing our nation and our world.' Looking forward to next year, the SciDAC 2007 Conference will be held from June 24–28 at the Westin Copley Plaza in Boston, Massachusetts. Chairman: David Keyes, Columbia University.

The Organizing Committee for the SciDAC 2006 Conference would like to acknowledge the individuals whose talents and efforts were essential to the success of the meeting. Special thanks go to Betsy Riley for her leadership in building the infrastructure support for the conference, for identifying and then obtaining contributions from our corporate sponsors, for coordinating all media communications, and for her efforts in organizing and preparing the conference proceedings for publication; to Tim Jones for handling the hotel scouting, subcontracts, and exhibits and stage production; to Angela Harris for handling supplies, shipping, and tracking, poster sessions set-up, and for her efforts in coordinating and scheduling the promotional activities that took place during the conference; to John Bui and John Smith for their superb wireless networking and A/V set-up and support; to Cindy Latham for Web site design, graphic design, and quality control of proceedings submissions; and to Pamelia Nixon-Hartje of Ambassador for budget and quality control of catering. We are grateful for the highly professional dedicated efforts of all of these individuals, who were the cornerstones of the SciDAC 2006 Conference. Thanks also go to Angela Beach of the ORNL Conference Center for her efforts in executing the contracts with the hotel, Carolyn James of Colorado State for on-site registration supervision, Lora Wolfe and Brittany Hagen for administrative support at ORNL, and Dami Rich and Andrew Sproles for graphic design and production. We are also most grateful to the Oak Ridge National Laboratory, especially Jeff Nichols, and to our corporate sponsors, Data Direct Networks, Cray, IBM, SGI, and Institute of Physics Publishing for their support.

We especially express our gratitude to the featured speakers, invited oral speakers, invited poster presenters, session chairs, and advanced architecture panelists and chair for their excellent contributions on behalf of SciDAC 2006. We would like to express our deep appreciation to Lali Chatterjee, Graham Douglas, Margaret Smith, and the production team of Institute of Physics Publishing, who worked tirelessly to publish the final conference proceedings in a timely manner.

Finally, heartfelt thanks are extended to Michael Strayer, Associate Director for OASCR and SciDAC Director, and to the DOE program managers associated with SciDAC for their continuing enthusiasm and strong support for the annual SciDAC Conferences as a special venue to showcase the exciting scientific discovery achievements enabled by the interdisciplinary collaborations championed by the SciDAC program.

OPENING REMARKS

E02
The following article is Open access

Good morning. Welcome to SciDAC 2006 and Denver. I share greetings from the new Undersecretary for Energy, Ray Orbach.

Five years ago SciDAC was launched as an experiment in computational science. The goal was to form partnerships among science applications, computer scientists, and applied mathematicians to take advantage of the potential of emerging terascale computers.

This experiment has been a resounding success. SciDAC has emerged as a powerful concept for addressing some of the biggest challenges facing our world.

As significant as these successes were, I believe there is also significance in the teams that achieved them. In addition to their scientific aims these teams have advanced the overall field of computational science and set the stage for even larger accomplishments as we look ahead to SciDAC-2.

I am sure that many of you are expecting to hear about the results of our current solicitation for SciDAC-2. I'm afraid we are not quite ready to make that announcement. Decisions are still being made and we will announce the results later this summer. Nearly 250 unique proposals were received and evaluated, involving literally thousands of researchers, postdocs, and students. These collectively requested more than five times our expected budget. This response is a testament to the success of SciDAC in the community.

In SciDAC-2 our budget has been increased to about $70 million for FY 2007 and our partnerships have expanded to include the Environment and National Security missions of the Department. The National Science Foundation has also joined as a partner.

These new partnerships are expected to expand the application space of SciDAC and broaden the impact and visibility of the program. We have, with our recent solicitation, expanded to turbulence, computational biology, and groundwater reactive modeling and simulation. We are currently talking with the Department's applied energy programs about risk assessment, optimization of complex systems – such as the national and regional electricity grid – carbon sequestration, virtual engineering, and the nuclear fuel cycle.

The successes of the first five years of SciDAC have demonstrated the power of using advanced computing to enable scientific discovery. One measure of this success could be found in the President's State of the Union address in which President Bush identified 'supercomputing' as a major focus area of the American Competitiveness Initiative.

Funds were provided in the FY 2007 President's Budget request to increase the size of the NERSC-5 procurement to between 100 and 150 teraflops, to upgrade the LCF Cray XT3 at Oak Ridge to 250 teraflops, and to acquire a 100 teraflop IBM BlueGene/P to establish the Leadership Computing Facility at Argonne. We believe that we are on a path to establish a petascale computing resource for open science by 2009.

We must develop software tools, packages, and libraries as well as the scientific application software that will scale to hundreds of thousands of processors. Computer scientists from universities and the DOE's national laboratories will be asked to collaborate on the development of the critical system software components such as compilers, light-weight operating systems and file systems.

Standing up these large machines will not be business as usual for ASCR. We intend to develop a series of interconnected projects that identify cost, schedule, risks, and scope for the upgrades at the LCF at Oak Ridge, the establishment of the LCF at Argonne, and the development of the software to support these high-end computers.

The critical first step in defining the scope of the project is to identify a set of early application codes for each leadership class computing facility. These codes will have access to the resources during the commissioning phase of the facility projects and will be part of the acceptance tests for the machines.

Applications will be selected, in part, on the basis of their potential for breakthrough science, their scalability, and their ability to exercise key hardware and software components. Possible early applications might include climate models; studies of the magnetic properties of nanoparticles as they relate to ultra-high density storage media; the rational design of chemical catalysts; the modeling of combustion processes that will lead to cleaner burning coal; and fusion and astrophysics research.

I have presented just a few of the challenges that we look forward to on the road to petascale computing. Our road to petascale science might be paraphrased by the quote from e e cummings, 'somewhere I have never traveled, gladly beyond any experience . . .'

KEYNOTE

E03
The following article is Open access

Dr. Victor Reis delivered the keynote talk at the closing session of the conference. The talk was forward looking and focused on the importance of advanced computing for large-scale nuclear energy goals such as Global Nuclear Energy Partnership (GNEP). Dr. Reis discussed the important connections of GNEP to the Scientific Discovery through Advanced Computing (SciDAC) program and the SciDAC research portfolio.

In the context of GNEP, Dr. Reis talked about possible fuel leasing configurations, strategies for their implementation, and typical fuel cycle flow sheets.

A major portion of the talk addressed lessons learnt from 'Science Based Stockpile Stewardship' and the Accelerated Strategic Computing Initiative (ASCI), and how they can provide guidance for advancing GNEP and SciDAC goals.

Dr. Reis's colorful and informative presentation included international proverbs, quotes and comments, in tune with the international flavor that is part of the GNEP philosophy and plan. He concluded with a positive and motivating outlook for peaceful nuclear energy and its potential to solve global problems.

An interview with Dr. Reis, addressing some of the above issues, is the cover story of Issue 2 of the SciDAC Review and is available at http://www.scidacreview.org

This summary of Dr. Reis's PowerPoint presentation was prepared by Institute of Physics Publishing; the complete PowerPoint version of Dr. Reis's talk at SciDAC 2006 is given as a multimedia attachment to this summary.

PUSHING THE FRONTIERS OF ENERGY RESEARCH

COMBUSTION ENERGY SCIENCE

1
The following article is Open access


There is considerable technological interest in developing new fuel-flexible combustion systems that can burn fuels such as hydrogen or syngas. Lean premixed systems have the potential to burn these types of fuels with high efficiency and low NOx emissions due to reduced burnt gas temperatures. Although traditional scientific approaches based on theory and laboratory experiment have played essential roles in developing our current understanding of premixed combustion, they are unable to meet the challenges of designing fuel-flexible lean premixed combustion devices. Computation, with its ability to deal with complexity and its unlimited access to data, has the potential for addressing these challenges. Realizing this potential requires the ability to perform high-fidelity simulations of turbulent lean premixed flames under realistic conditions. In this paper, we examine the specialized mathematical structure of these combustion problems and discuss simulation approaches that exploit this structure. Using these ideas we can dramatically reduce computational cost, making it possible to perform high-fidelity simulations of realistic flames. We illustrate this methodology by considering ultra-lean hydrogen flames and discuss how this type of simulation is changing the way researchers study combustion.

16
The following article is Open access


Application of the Large Eddy Simulation (LES) technique provides the formal ability to treat the wide range of multidimensional time and length scales that exist in turbulent reacting flows in a computationally feasible manner. The large, energetic scales are resolved directly. The small 'subgrid scales' are modeled. This allows simulation of the complex multiple-time, multiple-length scale coupling between processes in a time-accurate manner. Treating the full range of scales is a critical requirement since turbulent processes are inherently coupled through a cascade of nonlinear interactions. This paper provides a perspective on LES and its application to turbulent combustion. In particular, the combination of LES, high-performance massively-parallel computing, and advanced experimental capabilities in combustion science offers unprecedented opportunities for synergistic high-fidelity investigations. Information from well-defined benchmark flames, using a combination of state-of-the-art experiments and detailed simulations that match the experimental conditions, presents new opportunities to understand the central physics of turbulence-chemistry interactions. Understanding these fundamental physical processes, and developing advanced simulation capabilities that efficiently and accurately describe them, are crucial requirements for the development of next-generation combustion systems. Results are shown that demonstrate the progression toward more complex systems, with emphasis placed on the fundamental issues of turbulence-chemistry interactions.
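
As a deliberately minimal illustration of the resolved/modeled split described above, the sketch below computes a classical Smagorinsky eddy viscosity for the unresolved subgrid scales on a uniform grid. This is a generic textbook closure, not the subgrid model used in the paper; the constant, grid, and synthetic velocity field are illustrative assumptions.

```python
import numpy as np

def smagorinsky_eddy_viscosity(u, v, w, dx, c_s=0.17):
    """Classic Smagorinsky subgrid-scale eddy viscosity on a uniform grid.

    u, v, w : 3D arrays of resolved velocity components
    dx      : grid spacing (also used as the filter width)
    c_s     : Smagorinsky constant (illustrative value)
    """
    # Resolved strain-rate tensor S_ij = 0.5*(du_i/dx_j + du_j/dx_i)
    grads = [np.gradient(comp, dx) for comp in (u, v, w)]  # grads[i][j] = du_i/dx_j
    s_mag_sq = np.zeros_like(u)
    for i in range(3):
        for j in range(3):
            s_ij = 0.5 * (grads[i][j] + grads[j][i])
            s_mag_sq += 2.0 * s_ij * s_ij          # |S|^2 = 2 S_ij S_ij
    return (c_s * dx) ** 2 * np.sqrt(s_mag_sq)     # nu_t = (C_s * Delta)^2 |S|

# Example: eddy viscosity for a synthetic velocity field on a 32^3 box
if __name__ == "__main__":
    n, dx = 32, 1.0 / 32
    x = np.linspace(0.0, 1.0, n, endpoint=False)
    X, Y, Z = np.meshgrid(x, x, x, indexing="ij")
    u, v, w = np.sin(2 * np.pi * Y), np.cos(2 * np.pi * Z), np.sin(2 * np.pi * X)
    print(smagorinsky_eddy_viscosity(u, v, w, dx).mean())
```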

28
The following article is Open access


Fundamental simulations are used to investigate the ignition process of turbulent n-heptane liquid fuel spray jets. A DNS-quality Eulerian method is used to solve the carrier gas flow field, while a Lagrangian method is used to track the liquid fuel droplets. Two-way coupling between the phases is included through the exchange of mass, momentum and energy. A detailed mechanism with 33 species and 64 reactions is used to describe the chemical reactions. The simulation approach allows studies of larger-scale interactions of sprays and turbulence, including evaporation, mixing, and detailed chemical reaction. Both time-developing and spatially-developing liquid spray jets are studied. The initial carrier gas temperature was 1500 K. Several cases were simulated with different droplet radii (from 10 microns to 30 microns) and two initial velocities (100 m/s and 150 m/s). In the time-developing case it was found that evaporative cooling and turbulence mixing play important roles in the ignition process of liquid fuel spray jets. Ignition first occurs at the edges of the jets where the fuel mixture is lean, and the scalar dissipation rate and vorticity magnitude are low. For smaller droplets, ignition occurs later than for larger droplets due to increased evaporative cooling. Higher initial droplet velocity enhances turbulence mixing and evaporative cooling. For smaller droplets, higher initial droplet velocity causes the ignition to occur earlier, whereas for larger droplets, higher initial droplet velocity delays the ignition time. In the spatially developing liquid jets, ignition and flame lift-off characteristics similar to diesel sprays are observed. Near the injector, combustion development progresses very rapidly along the stoichiometric surface. In the downstream region of the spray, combustion develops with steep temperature fronts in a flamelet mode.
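
The Lagrangian droplet tracking with two-way mass/momentum coupling described above can be sketched, in highly simplified form, as follows. The Stokes-like drag time, the placeholder evaporation rate, and the array layout are assumptions for illustration only; the paper's detailed models (finite-rate chemistry, energy exchange, realistic evaporation laws) are not represented.

```python
import numpy as np

def advance_droplets(x, v, m, gas_u, dt, tau_d=1e-3, k_evap=5e-4):
    """One explicit step for Lagrangian droplets with simplified two-way coupling.

    x, v : (N, 3) droplet positions and velocities
    m    : (N,) droplet masses
    gas_u: (N, 3) gas velocity interpolated to the droplet locations
    Returns the updated droplet state and the momentum/mass handed back to the gas.
    """
    # Drag: linear (Stokes-like) relaxation of droplet velocity toward the gas
    dv = (gas_u - v) * (dt / tau_d)
    v_new = v + dv
    x_new = x + v_new * dt

    # Evaporation: dm/dt = -k_evap * m (placeholder rate, not a real d^2-law)
    dm = -k_evap * m * dt
    m_new = np.maximum(m + dm, 0.0)

    # Two-way coupling: momentum gained by the droplets is removed from the gas,
    # and evaporated mass (-dm) is returned to the gas as fuel vapour.
    momentum_to_gas = -(m[:, None] * dv)
    mass_to_gas = -dm
    return x_new, v_new, m_new, momentum_to_gas, mass_to_gas
```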

38
The following article is Open access


In recent years, due to the advent of high-performance computers and advanced numerical algorithms, direct numerical simulation (DNS) of combustion has emerged as a valuable computational research tool, in concert with experimentation. The role of DNS in delivering new scientific insight into turbulent combustion is illustrated using results from a recent 3D turbulent premixed flame simulation. To understand the influence of turbulence on the flame structure, a 3D fully-resolved DNS of a spatially-developing lean methane-air turbulent Bunsen flame was performed in the thin reaction zones regime. A reduced chemical model for methane-air chemistry consisting of 13 resolved species, 4 quasi-steady state species and 73 elementary reactions was developed specifically for the current simulation. The data is analyzed to study possible influences of turbulence on the flame thickness. The results show that the average flame thickness increases, in qualitative agreement with several experimental results.

43
The following article is Open access


Premixed turbulent flames are of increasing practical importance and remain a significant research challenge in the combustion community. In this paper we discuss a numerical methodology we have developed that combines a low-Mach number formulation with adaptive mesh refinement to exploit the variation of scales associated with premixed flame simulations. This combined approach, when implemented on parallel computing hardware, improves computational efficiency by several orders of magnitude.

48
The following article is Open access


This paper describes a hybrid finite-difference method for the large-eddy simulation of compressible flows with low numerical dissipation and structured adaptive mesh refinement (SAMR). A conservative flux-based approach is described, with an explicit centered scheme used in turbulent flow regions while a weighted essentially non-oscillatory (WENO) scheme is employed to capture shocks. Three-dimensional numerical simulations of a Richtmyer-Meshkov instability are presented.
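
A one-dimensional cartoon of the hybrid idea described above — a low-dissipation centered flux in smooth regions and a dissipative shock-capturing flux near discontinuities, selected by a sensor — is sketched below for linear advection. A first-order upwind flux stands in for the WENO scheme purely for brevity, and the sensor and threshold are illustrative assumptions.

```python
import numpy as np

def hybrid_flux(u, a=1.0, threshold=0.2):
    """Interface fluxes for u_t + a*u_x = 0 on a periodic 1D grid.

    A low-dissipation centered flux is used where the solution is smooth and a
    dissipative upwind flux near discontinuities, chosen by a simple jump sensor.
    (A real hybrid SAMR code would use a WENO flux instead of first-order upwind.)
    """
    u_r = np.roll(u, -1)                      # right neighbour at each interface
    f_centered = 0.5 * a * (u + u_r)          # 2nd-order centered, no dissipation
    f_upwind = a * u if a > 0 else a * u_r    # 1st-order upwind, dissipative

    # Shock sensor: normalized jump across the interface
    jump = np.abs(u_r - u)
    scale = np.abs(u_r) + np.abs(u) + 1e-12
    use_upwind = (jump / scale) > threshold
    return np.where(use_upwind, f_upwind, f_centered)

def step(u, dt, dx, a=1.0):
    """One conservative forward-Euler update using the hybrid interface flux."""
    f = hybrid_flux(u, a)                     # f[i] lives at interface i+1/2
    return u - dt / dx * (f - np.roll(f, 1))
```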

53
The following article is Open access


We discuss recent developments in the application of high-order adaptive mesh refinement constructions in reacting flow computations. We present results pertaining to the time integration of coupled diffusive-convective terms in this context using a stabilized explicit Runge-Kutta-Chebyshev scheme. We also discuss chemical model reduction strategies, with a focus on the utilization of computational singular perturbation theory for generation of simplified chemical models. Starting from a detailed chemical mechanism for methane-air combustion, we examine a posteriori errors in flame species computed with a range of simplified mechanisms corresponding to a varying degree of model reduction.

58
The following article is Open access


Large-scale atomistic simulations are performed using both the molecular dynamics and direct simulation Monte Carlo algorithms. These simulations are used to investigate several aspects of turbulent behavior, focusing on the Rayleigh-Taylor instability, in which a heavy fluid lies on top of a light fluid in the presence of a gravitational field. The use of atomistic techniques allows us to capture various physical effects not resolved by more traditional continuum methods, such as the discontinuous breakup of flow features, and the effects of micro-scale fluctuations. In addition, we compare with both experiment and continuum simulations such properties as the initial growth spectrum of the interface, and the development in time of the mixing zone width.
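
For readers unfamiliar with the molecular dynamics side of such atomistic studies, the sketch below shows a minimal Lennard-Jones velocity-Verlet integrator with periodic boundaries. It is a serial O(N²) toy with illustrative parameters, not the scalable production codes used for the Rayleigh-Taylor simulations described in the paper.

```python
import numpy as np

def lj_forces(pos, box, eps=1.0, sigma=1.0):
    """Lennard-Jones forces with minimum-image periodic boundaries (O(N^2))."""
    f = np.zeros_like(pos)
    for i in range(len(pos) - 1):
        d = pos[i + 1:] - pos[i]
        d -= box * np.round(d / box)                  # minimum-image convention
        r2 = np.sum(d * d, axis=1)
        inv_r6 = (sigma ** 2 / r2) ** 3
        # Pairwise force magnitude/r^2 times the separation vector (Newton's 3rd law)
        fij = (24 * eps * (2 * inv_r6 ** 2 - inv_r6) / r2)[:, None] * d
        f[i] -= fij.sum(axis=0)
        f[i + 1:] += fij
    return f

def velocity_verlet(pos, vel, box, dt, n_steps, mass=1.0):
    """Standard velocity-Verlet (kick-drift-kick) time integration."""
    f = lj_forces(pos, box)
    for _ in range(n_steps):
        vel += 0.5 * dt * f / mass
        pos = (pos + dt * vel) % box
        f = lj_forces(pos, box)
        vel += 0.5 * dt * f / mass
    return pos, vel

if __name__ == "__main__":
    box = 6.0
    side = np.arange(0.5, box, 1.5)
    pos = np.array([[x, y, z] for x in side for y in side for z in side])
    vel = np.zeros_like(pos)
    pos, vel = velocity_verlet(pos, vel, box, dt=0.005, n_steps=100)
    print("kinetic energy:", 0.5 * np.sum(vel ** 2))
```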

FUSION ENERGY

63
The following article is Open access


The onset and nonlinear evolution of Edge Localized Modes (ELMs) in toroidally confined plasmas are known to shed thermal energy from the edge of the confinement region, and may also affect the core plasma through nonlinear mode coupling. The physics of this process is not well understood, although the concomitant large bursts of thermal energy transport are a major concern for future burning plasma experiments. The evolution of ELMs is inherently nonlinear and analytic approaches are limited by the complexity of the problem. Save a handful of recent important theoretical works, the nonlinear consequences of ELMs are mainly unexplored. Recent developments in the NIMROD code [http://nimrodteam.org] have enabled the computational study of ELMs in tokamaks in the extended magnetohydrodynamic (MHD) framework, and a new initiative was formed to understand the basic physics of their nonlinear evolution. The results of these investigations are presented for both model equilibria and accurate reconstructions from the DIII-D experiment at General Atomics [http://fusion.gat.com/diii-d/]. These results show a filamentary high temperature structure propagating radially outward, which is strongly damped by experimentally relevant toroidal flow shear. Two fluid and gyroviscous terms are included linearly as a preliminary indication of these important physical effects, and stabilization of higher wave number modes is observed.

73
The following article is Open access


Gyrokinetic particle simulation of fusion plasmas for studying turbulent transport on state-of-the-art computers has a long history of important scientific discoveries. The primary examples are: (i) the identification of ion temperature gradient (ITG) drift turbulence as the most plausible process responsible for the thermal transport observed in tokamak experiments; (ii) the reduction of such transport due to the presence of zonal flows; (iii) the confinement scaling trends associated with the size of the plasma and also with the ionic isotope species. With the availability of terascale computers in recent years, we have also been able to carry out simulations with improved physics fidelity using experimentally relevant parameters. Computationally, we have demonstrated that our lead Particle-in-Cell (PIC) code, the Gyrokinetic Turbulence Code (GTC), is portable, efficient, and scalable on various MPP platforms. Convergence studies with unprecedented phase-space resolution have also been carried out. Since petascale resources are expected to be available in the near future, we have also engaged in developing better physics models and more efficient numerical algorithms to take advantage of this exciting opportunity. For the near term, we are interested in understanding some basic physics issues related to burning plasma experiments in the International Thermonuclear Experimental Reactor (ITER) - a multi-billion dollar device to be constructed over the next decade. Our long-range goal is to carry out integrated simulations for ITER plasmas for a wide range of temporal and spatial scales, including high-frequency short-wavelength wave heating, low-frequency meso-scale transport, and low-frequency large-scale magnetohydrodynamic (MHD) physics on these computers.
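
The gyrokinetic formulation used in GTC is far more involved, but the basic particle-in-cell cycle it shares with simpler codes — deposit, field solve, gather, push — can be sketched for a 1D electrostatic plasma as below. The grid, linear weighting, and normalized units are illustrative assumptions, not the GTC algorithm itself.

```python
import numpy as np

def pic_step(x, v, q_m, grid_n, length, dt):
    """One cycle of a minimal 1D electrostatic particle-in-cell scheme
    (normalized units): deposit charge -> solve field -> gather -> push."""
    dx = length / grid_n

    # 1) Deposit: linear (cloud-in-cell) weighting of particles to the grid
    xi = x / dx
    left = np.floor(xi).astype(int) % grid_n
    w = xi - np.floor(xi)
    rho = np.zeros(grid_n)
    np.add.at(rho, left, 1.0 - w)
    np.add.at(rho, (left + 1) % grid_n, w)
    rho = (rho - rho.mean()) / dx             # subtract neutralizing background

    # 2) Field solve: Poisson equation in Fourier space, then E = -d(phi)/dx
    k = 2 * np.pi * np.fft.fftfreq(grid_n, d=dx)
    rho_k = np.fft.fft(rho)
    phi_k = np.zeros_like(rho_k)
    phi_k[1:] = rho_k[1:] / k[1:] ** 2
    e_grid = np.real(np.fft.ifft(-1j * k * phi_k))

    # 3) Gather: interpolate E back to the particle positions
    e_part = (1.0 - w) * e_grid[left] + w * e_grid[(left + 1) % grid_n]

    # 4) Push: leapfrog update of velocity and position
    v = v + q_m * e_part * dt
    x = (x + v * dt) % length
    return x, v
```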

82
The following article is Open access


The AORSA global-wave solver is combined with the CQL3D bounce-averaged Fokker-Planck code to simulate the quasilinear evolution of non-thermal distributions in ion cyclotron resonance heating of tokamak plasmas. A novel re-formulation of the quasilinear operator enables calculation of the velocity space diffusion coefficients directly from the global wave fields. To obtain self-consistency between the wave fields and particle distribution function, AORSA and CQL3D have been iteratively coupled using Python. The combined self-consistent model is applied to minority ion heating in the Alcator C-Mod tokamak. Results show the formation of a 70 keV ion tail near the minority ion cyclotron resonance layer in approximate agreement with measurements from charge exchange neutral particle analyzers.
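
The abstract notes that AORSA and CQL3D are coupled iteratively using Python. A hedged sketch of what such a fixed-point coupling loop can look like is given below; run_wave_solver, run_fokker_planck, and the convergence measure are placeholders standing in for the real external codes, not the actual AORSA/CQL3D interfaces.

```python
import numpy as np

def relative_change(f_old, f_new):
    """Simple L2 measure of how much the distribution function changed."""
    num = np.linalg.norm(np.asarray(f_new) - np.asarray(f_old))
    return num / (np.linalg.norm(np.asarray(f_old)) + 1e-30)

def iterate_wave_fokker_planck(run_wave_solver, run_fokker_planck,
                               f0, max_iter=20, tol=1e-3):
    """Fixed-point iteration between a global wave solver and a Fokker-Planck
    code until the distribution function stops changing.

    run_wave_solver(f)   -> quasilinear diffusion coefficients for distribution f
    run_fokker_planck(D) -> new distribution function given diffusion coefficients
    """
    f = f0
    for it in range(max_iter):
        diff_coeffs = run_wave_solver(f)          # wave fields + QL operator
        f_new = run_fokker_planck(diff_coeffs)    # evolve non-thermal distribution
        change = relative_change(f, f_new)
        print(f"iteration {it}: relative change = {change:.3e}")
        f = f_new
        if change < tol:
            break                                  # self-consistent solution reached
    return f
```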

87
The following article is Open access


A gyrokinetic neoclassical solution for a diverted tokamak edge plasma has been obtained for the first time using the massively parallel Jaguar XT3 computer at Oak Ridge National Laboratory. The solutions show characteristics similar to the experimental observations: the electric potential is positive in the scrape-off layer and negative in the H-mode layer, and the parallel rotation is positive in the scrape-off layer and at the inside boundary of the H-mode layer. However, the solution also reveals a new physical feature: a strong E×B convective flow in the scrape-off plasma. A general introduction to the edge simulation problem is also presented.

92
The following article is Open access


Recent developments in GEM, a comprehensive gyrokinetic electromagnetic turbulence code, are described here. GEM is a δf particle turbulence simulation code that has kinetic electrons and electromagnetic perturbations. The key elements of the GEM algorithm are: (1) the parallel canonical formulation of the gyrokinetic system of equations; (2) an adjustable split-weight scheme for kinetic electrons; and (3) a high-β algorithm for Ampère's equation. Additionally, GEM uses a two-dimensional (2D) domain decomposition and runs efficiently on a variety of high-performance architectures. GEM is now extended to include arbitrary toroidal equilibrium profiles and flux-surface shapes. The domain is an arbitrarily sized toroidal slice with periodicity assumed in the toroidal direction. It is global radially and poloidally along the magnetic field. Results are presented that demonstrate the effect of plasma shaping on the Ion-Temperature-Gradient (ITG) driven instabilities. An example of nonlinear simulation of the finite-β modified ITG turbulence in general geometry is also given. Finally, collisionless Trapped Electron Modes (TEM) are investigated, and the transition from the TEM-dominated core region to the drift-wave-dominated edge region as the density gradient increases is shown.

97
The following article is Open access


The sawtooth instability is one of the most fundamental dynamics of an inductive tokamak discharge such as will occur in ITER. The sawtooth occurs when the current peaks in a tokamak, creating a region in the core where the safety factor is less than unity, q<1. While this instability is confined to the center of the plasma in low-pressure, low-current, large-aspect-ratio discharges, under certain conditions it can create magnetic islands at the outer resonant surfaces or set off a sequence of events that leads to a major disruption. Sawtooth behavior is complex and remains incompletely explained. The SciDAC Center for Extended MHD Modeling (CEMM) has undertaken an ambitious campaign to model this periodic motion as accurately as possible using the most complete fluid-like description of the plasma - the Extended MHD model. The multiple time and space scales associated with the reconnection layer and growth time make this an extremely challenging computational problem. The most recent simulation by the M3D code used over 500,000 elements for 400,000 partially implicit time steps for a total of 2×10¹¹ space-time points, and there still remain some resolution issues. However, these calculations are providing insight into the nonlinear mechanisms of surface breakup and healing. We have been able to match many features of a small tokamak and can now project to the computational requirements for simulations of larger, hotter devices such as ITER.

102
The following article is Open access


This paper outlines a strategy to significantly enhance scientific collaborations in both Fusion Energy Sciences and High-Energy Physics through the development and deployment of new tools and technologies into working environments. This strategy is divided into two main elements: collaborative workspaces and secure computational services. Experimental and theory/computational programs will greatly benefit through the provision of a flexible, standards-based collaboration space, which includes advanced tools for ad hoc and structured communications, shared applications and displays, enhanced interactivity for remote data access applications, high-performance computational services and an improved security environment. The technologies developed should be prototyped and tested on the current generation of experiments and numerical simulation projects. At the same time, such work should maintain a strong focus on the needs of the next generation of mega-projects, ITER and the ILC. Such an effort needs to leverage existing computer science technology and take full advantage of commercial software wherever possible. This paper compares the requirements of FES and HEP, discusses today's solutions, examines areas where more functionality is required, and identifies those areas with sufficient overlap in requirements that joint research into collaborative technologies will benefit both.

SCIENTIFIC DISCOVERY AT ALL SCALES

LATTICE QCD

107
The following article is Open access

I describe the recent success in performing accurate calculations of the effects of the strong force on particles containing bottom and charm quarks. Since quarks are never seen in isolation, and so cannot be studied directly, numerical simulations are key to understanding the properties of these particles and extracting information about the quarks. The results have direct impact on the worldwide experimental programme that is aiming to determine the parameters of the Standard Model of particle physics precisely and thereby uncover or constrain the possibilities for physics beyond the Standard Model. The numerical simulation of the strong force is a huge computational task and the recent success is the result of international collaboration in developing techniques that are fast enough to do the calculations on powerful supercomputers.

122
The following article is Open access

At high temperatures or densities matter formed by strongly interacting elementary particles (hadronic matter) is expected to undergo a transition to a new form of matter - the quark gluon plasma - in which elementary particles (quarks and gluons) are no longer confined inside hadrons but are free to propagate in a thermal medium much larger in extent than the typical size of a hadron. The transition to this new form of matter as well as properties of the plasma phase are studied in large scale numerical calculations based on the theory of strong interactions - Quantum Chromo Dynamics (QCD). Experimentally properties of hot and dense elementary particle matter are studied in relativistic heavy ion collisions such as those currently performed at the relativistic heavy ion collider (RHIC) at BNL.

We review here recent results from studies of thermodynamic properties of strongly interacting elementary particle matter performed on teraflops computers. We present results on the QCD equation of state and discuss the status of studies of the phase diagram at non-vanishing baryon number density.

132
The following article is Open access

Lattice QCD calculations demand a substantial amount of computing power in order to achieve the high-precision results needed to better understand the nature of strong interactions, assist experiments in discovering new physics, and predict the behavior of a diverse set of physical systems ranging from the proton itself to astrophysical objects such as neutron stars. However, computer power alone is clearly not enough to tackle the calculations we need to be doing today. A steady stream of recent algorithmic developments has made an important impact on the kinds of calculations we can currently perform. In this talk I review these algorithms and their impact on the nature of lattice QCD calculations performed today.

142
The following article is Open access


Quantum chromodynamics (QCD) is the widely accepted theory of the strong interactions of quarks and gluons. Only through large scale numerical simulation has it been possible to work out the predictions of this theory for a vast range of phenomena relevant to the US Department of Energy experimental program. Such simulations are essential to support the discovery of new phenomena and more fundamental interactions.

With support from SciDAC the USQCD collaboration has developed software and prototyped custom computer hardware to carry out the required numerical simulations. We have developed a robust, portable data-parallel code suite. It provides a user-friendly basis for writing physics application codes for carrying out the calculations needed to predict the phenomenology of QCD. We are using this efficient and optimized code base to develop new physics application code, to improve the performance of legacy code, and to construct higher level tools, such as QCD-specific sparse matrix solvers.

We give a brief overview of the design of the data parallel API and its various components. We describe performance gains achieved in the past year. Finally, we present plans for further improvements under SciDAC-2.
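
As an illustration of the kind of QCD-specific sparse matrix solver mentioned above, the sketch below implements a matrix-free conjugate gradient for a Hermitian positive-definite system such as A = M†M built from a lattice Dirac operator M. This is a textbook Krylov solver, not the optimized SciDAC library code; the dense toy operator in the usage example is a stand-in for the Dirac stencil applied on the lattice.

```python
import numpy as np

def conjugate_gradient(apply_a, b, tol=1e-8, max_iter=1000):
    """Matrix-free conjugate gradient for A x = b with A Hermitian positive
    definite (e.g. A = M^dagger M for a lattice Dirac operator M).

    apply_a : callable returning A @ x for a given vector x
    """
    x = np.zeros_like(b)
    r = b - apply_a(x)
    p = r.copy()
    rs_old = np.vdot(r, r).real
    for _ in range(max_iter):
        ap = apply_a(p)
        alpha = rs_old / np.vdot(p, ap).real
        x += alpha * p
        r -= alpha * ap
        rs_new = np.vdot(r, r).real
        if np.sqrt(rs_new) < tol * np.linalg.norm(b):
            break
        p = r + (rs_new / rs_old) * p
        rs_old = rs_new
    return x

# Toy usage: a random Hermitian positive-definite operator standing in for M^dagger M
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    m = rng.standard_normal((64, 64)) + 1j * rng.standard_normal((64, 64))
    a = m.conj().T @ m + 64 * np.eye(64)
    b = rng.standard_normal(64) + 0j
    x = conjugate_gradient(lambda v: a @ v, b)
    print(np.linalg.norm(a @ x - b))
```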

147
The following article is Open access

In recent years, we used lattice QCD to calculate some quantities that were unknown or poorly known. They are the q² dependence of the form factor in semileptonic D → Klν decay, the leptonic decay constants of the D+ and Ds mesons, and the mass of the Bc meson. In this paper, we summarize these calculations, with emphasis on their (subsequent) confirmation by measurements in e+e−, γp and p̄p collisions.

152
The following article is Open access


Protons and neutrons have a rich structure in terms of their constituents, the quarks and gluons. Understanding this structure requires solving Quantum Chromodynamics (QCD). However QCD is extremely complicated, so we must numerically solve the equations of QCD using a method known as lattice QCD. Here we describe a typical lattice QCD calculation by examining our recent computation of the nucleon axial charge.

157
The following article is Open access

A vector extension to the C programming language utilizing Blue Gene/L floating point hardware is presented. The extensions are implemented in the GNU Compiler Collection toolchain and are available as a cross compiler.

161
The following article is Open access

The violation of charge conjugation (C) and parity (P) in kaon decays, first observed in Nobel-prize winning experiments at Brookhaven National Lab in 1964, is allowed by the Standard Model of particle physics. Predicting indirect CP violation in kaon decays requires the values for standard model parameters and the evaluation of a four-quark matrix element inside a kaon. Using lattice QCD with domain wall fermions and the current QCDOC computers, calculations are underway that will yield, among other results, the value for this matrix element, allowing more precise comparisons between theory and experiment for CP violation in kaon decays.

NUCLEAR STRUCTURE

166
The following article is Open access

The long-term vision of the Nuclear Structure and Low-Energy Reactions (NSLER) collaboration is to arrive at a comprehensive and unified description of nuclei and their reactions that is grounded in the interactions between the constituent nucleons. For this purpose, we will develop a universal energy density functional for nuclei and replace current phenomenological models of nuclear structure and reactions with a well-founded microscopic theory that will deliver maximum predictive power with minimal uncertainties that are well quantified. Nuclear structure and reactions play an essential role in the science to be investigated at rare isotope facilities, and in nuclear physics applications to the Science-Based Stockpile Stewardship Program, next-generation reactors, and threat reduction. We anticipate an expansion of the computational techniques and methods we currently employ, and developments of new treatments, to take advantage of petascale architectures and demonstrate the capability of the leadership class machines to deliver new science heretofore impossible.

ACCELERATOR DESIGN

171
The following article is Open access

Particle accelerators are among the most complex and versatile instruments of scientific exploration. They have enabled remarkable scientific discoveries and important technological advances that span all programs within the DOE Office of Science (DOE/SC). The importance of accelerators to the DOE/SC mission is evident from an examination of the DOE document, ''Facilities for the Future of Science: A Twenty-Year Outlook.'' Of the 28 facilities listed, 13 involve accelerators. Thanks to SciDAC, a powerful suite of parallel simulation tools has been developed that represent a paradigm shift in computational accelerator science. Simulations that used to take weeks or more now take hours, and simulations that were once thought impossible are now performed routinely. These codes have been applied to many important projects of DOE/SC including existing facilities (the Tevatron complex, the Relativistic Heavy Ion Collider), facilities under construction (the Large Hadron Collider, the Spallation Neutron Source, the Linac Coherent Light Source), and to future facilities (the International Linear Collider, the Rare Isotope Accelerator). The new codes have also been used to explore innovative approaches to charged particle acceleration. These approaches, based on the extremely intense fields that can be present in lasers and plasmas, may one day provide a path to the outermost reaches of the energy frontier. Furthermore, they could lead to compact, high-gradient accelerators that would have huge consequences for US science and technology, industry, and medicine. In this talk I will describe the new accelerator modeling capabilities developed under SciDAC, the essential role of multi-disciplinary collaboration with applied mathematicians, computer scientists, and other IT experts in developing these capabilities, and provide examples of how the codes have been used to support DOE/SC accelerator projects.

190
The following article is Open access


A highly efficient, fully parallelized, fully relativistic, three-dimensional particle-in-cell model for simulating plasma and laser wakefield acceleration is described. The model is based on the quasi-static approximation, which reduces a fully three-dimensional electromagnetic field solve and particle push to a two-dimensional field solve and particle push. This is done by calculating the plasma wake assuming that the drive beam and/or laser does not evolve during the time it takes for it to pass a plasma particle. The complete electromagnetic fields of the plasma wake and its associated index of refraction are then used to evolve the drive beam and/or laser using very large time steps. This algorithm reduces the computation time by 2 to 3 orders of magnitude without loss of accuracy for highly nonlinear problems of interest. The code is fully parallelizable with different domain decompositions for the 2D and 3D pieces of the code. The code also has dynamic load balancing. We present the basic algorithms and design of QuickPIC, as well as a comparison between the new algorithm and conventional fully explicit models (OSIRIS). Directions for future work are also presented, including a software pipeline technique to further scale QuickPIC to 10,000+ processors.
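
The structure of the quasi-static algorithm described above can be summarized in a short sketch: the plasma response is computed slice by slice in the co-moving coordinate while the driver is frozen, and the driver is then pushed with a much larger time step. All callables below are placeholders for the real 2D/3D kernels, not QuickPIC's API.

```python
def quasi_static_step(beam, plasma_slices, big_dt,
                      solve_slice_fields, push_plasma_slice, push_beam):
    """One large time step of a quasi-static PIC scheme (structure only).

    The drive beam is frozen while the plasma wake is computed slice by slice
    in the co-moving variable xi = ct - z; the beam (and/or laser envelope) is
    then advanced through the accumulated wake fields with a time step much
    larger than the plasma time scale.
    """
    wake_fields = []
    plasma_state = None                      # fresh plasma ahead of the driver
    for xi_slice in plasma_slices:           # march backwards through the beam
        # 2D field solve on this transverse slice, with the beam held fixed
        fields = solve_slice_fields(beam, plasma_state, xi_slice)
        # Advance the plasma particles from this slice to the next one in xi
        plasma_state = push_plasma_slice(plasma_state, fields, xi_slice)
        wake_fields.append(fields)

    # 3D push of the drive beam using the stored wake, with the large time step
    # permitted by the quasi-static approximation.
    return push_beam(beam, wake_fields, big_dt)
```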

200
The following article is Open access


As the size and cost of particle accelerators escalate, high-performance computing plays an increasingly important role; optimization through accurate, detailed computer modeling increases performance and reduces costs. Consequently, computer simulations face enormous challenges. Early approximation methods, such as expansions in distance from the design orbit, were unable to supply detailed, accurate results, such as in the computation of wake fields in complex cavities. Since the advent of message-passing supercomputers with thousands of processors, earlier approximations are no longer necessary, and it is now possible to compute wake fields, the effects of dampers, and self-consistent dynamics in cavities accurately. In this environment, the focus has shifted towards the development and implementation of algorithms that scale to large numbers of processors. So-called charge-conserving algorithms evolve the electromagnetic fields without the need for any global solves (which are difficult to scale up to many processors). Using cut-cell (or embedded) boundaries, these algorithms can simulate the fields in complex accelerator cavities with curved walls. New implicit algorithms, which are stable for any time-step, conserve charge as well, allowing faster simulation of structures with details small compared to the characteristic wavelength. These algorithmic and computational advances have been implemented in the VORPAL framework, a flexible, object-oriented, massively parallel computational application that allows run-time assembly of algorithms and objects, thus composing an application on the fly.
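
The point that such field updates need no global solves can be illustrated with a minimal 1D finite-difference time-domain (Yee-style leapfrog) update, shown below. Each value is advanced from its immediate neighbours only. This is not the charge-conserving cut-cell or implicit algorithm of the paper; the source, units, and boundaries are illustrative assumptions.

```python
import numpy as np

def fdtd_1d(n_cells=400, n_steps=800, c=1.0, dx=1.0, courant=0.5):
    """Minimal 1D finite-difference time-domain (leapfrog) field update.

    Each field value is updated from its immediate neighbours only, which is
    why such explicit schemes need no global linear solve and parallelize well.
    """
    dt = courant * dx / c
    ey = np.zeros(n_cells)          # E at integer grid points
    bz = np.zeros(n_cells)          # B at half-integer grid points
    for n in range(n_steps):
        # Faraday's law: dB/dt = -dE/dx (staggered in space and time)
        bz[:-1] -= dt / dx * (ey[1:] - ey[:-1])
        # Ampere's law, plus a soft Gaussian current source mid-grid
        ey[1:] -= dt / dx * c * c * (bz[1:] - bz[:-1])
        ey[n_cells // 2] += np.exp(-((n * dt - 30.0) / 10.0) ** 2)
    return ey, bz

if __name__ == "__main__":
    ey, bz = fdtd_1d()
    print("field energy estimate:", np.sum(ey ** 2 + bz ** 2))
```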

205
The following article is Open access


The calculation of beam-beam effects has been an ongoing activity within SciDAC. We report the first validation of a detailed beam-beam simulation with data measured at the VEPP-2M collider in Novosibirsk. The validation of the simulation gives us confidence to apply it to understanding and improving the operation of existing colliders such as the Tevatron, RHIC and LHC, and the design of the International Linear Collider.

210
The following article is Open access


We describe some of the accomplishments of the Beam Dynamics portion of the SciDAC Accelerator Science and Technology project. During the course of the project, our beam dynamics software has evolved from the era of different codes for each physical effect to the era of hybrid codes combining state-of-the-art implementations for multiple physical effects to the beginning of the era of true multi-physics frameworks. We describe some of the infrastructure that has been developed over the course of the project and advanced features of the most recent developments, the interplay between beam studies and simulations, and applications to current machines at Fermilab. Finally, we discuss current and future plans for simulations of the International Linear Collider.

215
The following article is Open access


Plasma-based lepton acceleration concepts are a key element of the long-term R&D portfolio for the U.S. Office of High Energy Physics. There are many such concepts, but we consider only the laser (LWFA) and plasma (PWFA) wakefield accelerators. We present a summary of electromagnetic particle-in-cell (PIC) simulations for recent LWFA and PWFA experiments. These simulations, including both time explicit algorithms and reduced models, have effectively used terascale computing resources to support and guide experiments in this rapidly developing field. We briefly discuss the challenges and opportunities posed by the near-term availability of petascale computing hardware.

CHEMISTRY

220
The following article is Open access


Steady performance gains in computing power, as well as improvements in scientific computing algorithms, are making possible the study of coupled physical phenomena of great extent and complexity. The software required for such studies is also very complex and requires contributions from experts in multiple disciplines. We have investigated the use of the Common Component Architecture (CCA) as a mechanism to tackle some of the resulting software engineering challenges in quantum chemistry, focusing on three specific application areas. In our first application, we have developed interfaces permitting solvers and quantum chemistry packages to be readily exchanged. This enables our quantum chemistry packages to be used with alternative solvers developed by specialists, remedying deficiencies we discovered in the native solvers provided in each of the quantum chemistry packages. The second application involves development of a set of components designed to improve utilization of parallel machines by allowing multiple components to execute concurrently on subsets of the available processors. This was found to give substantial improvements in parallel scalability. Our final application is a set of components permitting different quantum chemistry packages to interchange intermediate data. These components enabled the investigation of promising new methods for obtaining accurate thermochemical data for reactions involving heavy elements.

229
The following article is Open access


The approach taken in Ames to advance high-level electronic structure theory has been a combination of the development and implementation of new and novel methods with the continuing development of strategies to optimize scalable computing. This work summarizes advances on both fronts. Several new methods have been implemented under the Distributed Data Interface (DDI), most recently including analytic Hessians for both Hartree-Fock and CASSCF (complete active space self-consistent field) wavefunctions, gradients for restricted open-shell second-order perturbation theory, and the fragment molecular orbital method (FMO). Exciting new method developments include the FMO method and the CEEIS (Correlation Energy Extrapolation by Intrinsic Scaling) method for efficiently approaching the exact energy for atomic and molecular systems.

234
The following article is Open access


The Interpolative Moving Least Squares (IMLS) fitting scheme is being developed for the purpose of fitting potential energy surfaces used in chemistry. IMLS allows for automatic surface generation in which the fitting method selects the positions at which expensive electronic structure calculations determine specific values on the surface. The resulting surfaces are necessary for accurate kinetics and dynamics.
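
A hedged one-dimensional sketch of the interpolative moving least-squares idea — a local weighted polynomial fit whose weights diverge at the data points so the surface passes (essentially) through them — is given below. The inverse-distance weight, polynomial degree, and the synthetic "ab initio" energies are illustrative assumptions, not the production IMLS scheme.

```python
import numpy as np

def imls_eval(x_eval, x_data, y_data, degree=2, eps=1e-10):
    """Evaluate a 1D interpolative moving least-squares (IMLS-style) fit.

    At each evaluation point a low-degree polynomial is fit by weighted least
    squares, with weights that blow up at the data points so the fit nearly
    interpolates them.
    """
    x_eval = np.atleast_1d(x_eval)
    result = np.empty_like(x_eval, dtype=float)
    for k, xe in enumerate(x_eval):
        w = 1.0 / ((x_data - xe) ** 4 + eps)           # local weights
        # Weighted polynomial basis centered at xe: minimize sum_i w_i (p(x_i)-y_i)^2
        basis = np.vander(x_data - xe, degree + 1, increasing=True)
        sw = np.sqrt(w)
        coeffs, *_ = np.linalg.lstsq(sw[:, None] * basis, sw * y_data, rcond=None)
        result[k] = coeffs[0]                           # p(xe) is the constant term
    return result

# Example: reconstruct a Morse-like curve from a handful of "electronic energies"
if __name__ == "__main__":
    r = np.linspace(0.8, 3.0, 12)
    v = (1.0 - np.exp(-1.5 * (r - 1.2))) ** 2
    r_fine = np.linspace(0.9, 2.9, 5)
    print(imls_eval(r_fine, r, v) - (1.0 - np.exp(-1.5 * (r_fine - 1.2))) ** 2)
```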

239
The following article is Open access


The Born-Oppenheimer separation of the Schrödinger equation allows the electronic and nuclear motions to be solved in three steps: 1) the solution of the electronic wave function at a discrete set of molecular conformations; 2) the fitting of this discrete set of energy values in order to construct an analytical approximation to the potential energy surface (PES) at all molecular conformations; 3) the use of this analytical PES to solve for the nuclear motion using either time-dependent or time-independent formulations to compute molecular energy values, chemical reaction rates, and cumulative reaction probabilities. This project involves the development of technology to address all three of these steps. This report focuses on our recent work on the optimization of nonlinear wave function parameters for the electronic wave functions.
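
Steps 2 and 3 of the procedure above can be illustrated in one dimension: fit a handful of discrete electronic energies to a smooth curve, then diagonalize the nuclear Hamiltonian on that curve. The polynomial fit, finite-difference grid, and synthetic Morse-like energies below are stand-ins for illustration, not the project's actual fitting or dynamics methods.

```python
import numpy as np

def vibrational_levels(r_pts, e_pts, mass=1836.0, n_grid=400, n_levels=3):
    """Fit discrete energies to a smooth 1D curve, then solve the nuclear
    Schrodinger equation on it by finite-difference diagonalization (atomic units)."""
    # Step 2: analytic approximation to the PES (simple polynomial fit here)
    coeffs = np.polyfit(r_pts, e_pts, deg=6)
    r = np.linspace(r_pts.min(), r_pts.max(), n_grid)
    v = np.polyval(coeffs, r)

    # Step 3: H = -1/(2m) d^2/dr^2 + V(r), discretized with central differences
    dr = r[1] - r[0]
    kinetic = (np.diag(np.full(n_grid, 2.0)) -
               np.diag(np.ones(n_grid - 1), 1) -
               np.diag(np.ones(n_grid - 1), -1)) / (2.0 * mass * dr ** 2)
    evals = np.linalg.eigvalsh(kinetic + np.diag(v))
    return evals[:n_levels]

if __name__ == "__main__":
    # Step 1 stand-in: "electronic structure" energies on a coarse Morse-like curve
    r_pts = np.linspace(1.0, 4.0, 15)
    e_pts = 0.17 * (1.0 - np.exp(-1.0 * (r_pts - 1.9))) ** 2
    print(vibrational_levels(r_pts, e_pts))
```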

244
The following article is Open access


There is a pressing need for accurate calculations of finite temperature ground and excited state properties of nanoscale systems relevant to structural biology, hydrogen economy, environmental and material science problems. To address this challenging task we have designed a multiscale dynamical approach that combines the accuracy and computational complexity of coupled-cluster (CC) methods with the efficiency of classical molecular dynamics simulations. Our methodology is based on a seamless integration between the generic QM/MM interface, Tensor Contraction Engine module, and the classical molecular dynamics module of NWChem and offers an unprecedented ability for accurate large scale calculations of thermodynamics of ground and excited state properties. We illustrate our approach by large scale dynamical simulation of the excited state spectrum of the cytosine base in its native DNA environment using a variant of the completely renormalized equation-of-motion method with singles, doubles, and non-iterative triples (CR-EOMCCSD(T)).

249
The following article is Open access

Today, state-of-the-art computational methods and programs for the quantum theory of electron correlation can no longer be developed entirely by hand. They are often constructed with the essential aid of symbolic algebra systems that automate the lengthy and error-prone mathematical derivations and computer implementations these developments inevitably involve. Recent progress in this new paradigm of chemical theory development—complete automation—is reviewed.

MATERIALS SCIENCE

254
The following article is Open access


We investigate solidification in metal systems ranging in size from 64,000 to 524,288,000 atoms on the IBM BlueGene/L computer at LLNL. Using the newly developed ddcMD code, we achieve performance rates as high as 103 TFlops, with a performance of 101.7 TFlop sustained over a 7 hour run on 131,072 cpus. We demonstrate superb strong and weak scaling. Our calculations are significant as they represent the first atomic-scale model of metal solidification to proceed, without finite size effects, from spontaneous nucleation and growth of solid out of the liquid, through the coalescence phase, and into the onset of coarsening. Thus, our simulations represent the first step towards an atomistic model of nucleation and growth that can directly link atomistic to mesoscopic length scales.

268
The following article is Open access

First-Principles Molecular Dynamics (FPMD) is an accurate atomistic simulation method that has been applied to numerous problems of materials science. Its ability to describe simultaneously the electronic structure and the dynamical properties of a given system makes it a tool of choice for investigations of systems involving varying chemical environments. During the past decade, the advent of terascale computers has considerably enhanced the capabilities of FPMD. In this paper, we discuss recent progress in the implementation of First-Principles Molecular Dynamics on parallel computers. In particular, we consider the new challenges presented by current terascale computers and discuss the steps that will have to be taken to exploit future petascale architectures efficiently. Examples of large-scale FPMD applications using the Qbox code on the BlueGene/L computer are presented.

278
The following article is Open access


The paper presents state-of-the-art algorithmic developments for simulating the fracture of disordered quasi-brittle materials using discrete lattice systems. Large-scale simulations are often required to obtain accurate scaling laws; however, due to computational complexity, simulations using the traditional algorithms were limited to small system sizes. We have developed two algorithms: a multiple sparse Cholesky downdating scheme for simulating 2D random fuse model systems, and a block-circulant preconditioner for simulating 2D random fuse model systems. Using these algorithms, we were able to simulate fracture of the largest lattice system sizes to date (L = 1024 in 2D, and L = 64 in 3D) with extensive statistical sampling. Our recent simulations on 1024 processors of the Cray-XT3 and IBM Blue-Gene/L have further enabled us to explore fracture of 3D lattice systems of size L = 200, which is a significant computational achievement. These simulations, the largest of their kind, have enhanced our understanding of the physics of fracture; in particular, we analyze damage localization and its deviation from percolation behavior, scaling laws for damage density, universality of the fracture strength distribution, the size effect on the mean fracture strength, and finally the scaling of crack surface roughness.
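
The flavour of a circulant-type preconditioner can be conveyed with a 1D cartoon: precondition the conjugate-gradient solve for a "damaged" lattice operator with the undamaged periodic operator, which is circulant and therefore invertible by FFT. This is only a sketch of the idea; the paper's block-circulant construction for the 2D random fuse model and the sparse Cholesky downdating scheme are considerably more elaborate.

```python
import numpy as np
from scipy.sparse import diags
from scipy.sparse.linalg import cg, LinearOperator

def circulant_preconditioned_solve(n=256, damage_fraction=0.1, seed=0):
    """Solve a 1D 'damaged lattice' system with CG, preconditioned by the
    undamaged periodic (circulant) operator inverted via FFT."""
    rng = np.random.default_rng(seed)
    # Bond conductances: mostly 1, a few "burned" bonds with tiny conductance
    g = np.ones(n)
    g[rng.choice(n, int(damage_fraction * n), replace=False)] = 1e-3

    # Damaged periodic lattice Laplacian (small shift keeps it positive definite)
    main = g + np.roll(g, 1) + 1e-6
    a = diags([main, -g[:-1], -g[:-1], [-g[-1]], [-g[-1]]],
              [0, 1, -1, n - 1, -(n - 1)], format="csr")

    # Circulant preconditioner: the undamaged operator, diagonalized by the FFT
    first_col = np.zeros(n)
    first_col[0], first_col[1], first_col[-1] = 2.0 + 1e-6, -1.0, -1.0
    eigvals = np.fft.fft(first_col).real

    def apply_minv(x):
        return np.real(np.fft.ifft(np.fft.fft(x) / eigvals))

    m_inv = LinearOperator((n, n), matvec=apply_minv)
    b = rng.standard_normal(n)
    x, info = cg(a, b, M=m_inv)
    return x, info

if __name__ == "__main__":
    x, info = circulant_preconditioned_solve()
    print("CG converged" if info == 0 else f"CG info = {info}")
```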

292
The following article is Open access


The past ∼10 years have witnessed revolutionary breakthroughs both in synthesis of quantum dots (leading to nearly monodispersed, defect-free nanostructures) and in characterization of such systems, revealing ultra narrow spectroscopic lines of <1 meV width, exposing new intriguing effects, such as multiple exciton generation, fine-structure splitting, quantum entanglement, multiexciton recombination and more. These discoveries have led to new technological applications including quantum computing and ultra-high efficiency solar cells. Our work in this project is based on two realizations/observations: First, that the dots exhibiting clean and rich spectroscopic and transport characteristics are rather big. Indeed, the phenomenology indicated above is exhibited only by the well-passivated defect-free quantum dots containing at least a few thousand atoms (colloidal) and even a few hundred thousand atoms (self-assembled). Understanding the behavior of nanotechnology devices requires the study of even larger, million-atom systems composed of multiple components such as wires+dots+films. Second, first-principles many-body computational techniques based on current approaches (Quantum Monte-Carlo, GW, Bethe-Salpeter) are unlikely to be adaptable to such large structures and, at the same time, the effective mass-based techniques are too crude to provide insights on the many-body/atomistic phenomenology revealed by experiment. Thus, we have developed a set of methods that use an atomistic approach (unlike effective-mass based techniques) and utilize single-particle + many body techniques that are readily scalable to ∼10³–10⁶ atom nanostructures. New mathematical and computational techniques have also been developed to accelerate our calculations and go beyond simple conjugate gradient based methods allowing us to study larger systems. In this short paper based on a poster presented at the DOE SciDAC06 conference we will present the overall structure as well as highlights of our computational nanoscience project.

299
The following article is Open access

, and

The brittle fracture of a gypsum cylinder, which is used as an artificial kidney stone in lithotripsy research, is simulated by the use of the finite element method. The cylinder is submerged in water and is subjected to a pressure front parallel to one of its planar faces. The stresses induced by the pressure wave lead to fracture in the interior of the cylinder, with the formation of a spall plane located about 2/3 of the length from the face on which the pressure is applied. We show that the simulation reproduces the salient features of experimental observations.

BIOLOGY

304
The following article is Open access

, , and

In this paper we summarize our work to bring modern numerical methods to bear in computations for density functional theories (DFTs) of inhomogeneous fluid systems. We present the general mathematical structure of the problem, and briefly discuss different strategies for solving the problems. Finally, we present a few recent results from calculations on complex peptide assemblies in lipid bilayers to demonstrate the application of the methods in one complex 3-dimensional system. We find that while solver strategies developed and optimized for partial differential equations (PDEs) can be applied to these systems of equations, they do not provide optimal solutions with respect to speed, memory use, or parallel partitioning.

311
The following article is Open access

, , , , , , and

Understanding the structure and dynamics of large biomolecular assemblies requires the development of new computational methods for (i) accurate structure prediction, (ii) molecular docking and (iii) long time-frame molecular simulation, and implementation on massively parallel computing infrastructure. This paper reviews our progress in these areas and applications on important molecular systems.

316
The following article is Open access

We have developed advanced numerical algorithms to model biological fluids in multiscale flow environments using the software framework developed under the SciDAC APDEC ISIC. The foundation of our computational effort is an approach for modeling DNA-laden fluids as ''bead-rod'' polymers whose dynamics are fully coupled to an incompressible viscous solvent. The method is capable of modeling short-range forces and interactions between particles using soft potentials and rigid constraints. Our methods are based on higher-order finite difference methods in complex geometry with adaptivity, leveraging algorithms and solvers in the APDEC framework. Our Cartesian grid embedded boundary approach to incompressible viscous flow in irregular geometries has also been interfaced to a fast and accurate level-set method within the APDEC framework for extracting surfaces from volume renderings of medical image data, and has been used to simulate cardiovascular and pulmonary flows in critical anatomies.
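As a schematic of the ''bead-rod'' constraint idea only, the sketch below advances a short chain with a random kick and then restores the fixed rod lengths by an iterative, SHAKE-like projection. This is a generic illustration under simplifying assumptions, not the coupled fluid-polymer scheme described in the paper.

```python
import numpy as np

def shake_bonds(pos, bond_length, tol=1e-10, max_iter=200):
    """Iteratively adjust neighbouring bead-bead vectors back to the rod length
    (a crude SHAKE-like projection for a single chain)."""
    for _ in range(max_iter):
        max_err = 0.0
        for i in range(len(pos) - 1):
            d = pos[i + 1] - pos[i]
            r = np.linalg.norm(d)
            err = r - bond_length
            max_err = max(max_err, abs(err))
            corr = 0.5 * err * d / r          # split the correction between the two beads
            pos[i] += corr
            pos[i + 1] -= corr
        if max_err < tol:
            break
    return pos

# One Brownian-like step for a 10-bead chain with unit rods
rng = np.random.default_rng(0)
pos = np.cumsum(np.vstack([np.zeros(3), rng.normal(size=(9, 3))]), axis=0)
pos = shake_bonds(pos, bond_length=1.0)       # start from a chain with unit rods
pos += 0.05 * rng.normal(size=pos.shape)      # unconstrained random kick
pos = shake_bonds(pos, bond_length=1.0)       # project back onto the rod constraints
```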

322
The following article is Open access

, and

We discuss a modular modeling framework to rapidly develop mathematical models of bacterial cells that explicitly link genomic details to cell physiology and population response. An initial step in this approach is the development of a coarse-grained model describing pseudo-chemical interactions between lumped species. A hybrid model of interest can then be constructed by embedding genome-specific detail for a particular cellular subsystem (e.g. central metabolism), called here a module, into the coarse-grained model. Specifically, a new strategy for sensitivity analysis of the cell division limit cycle is introduced to identify which pseudo-molecular processes should be delumped to implement a particular biological function in a growing cell (e.g. ethanol overproduction or pathogen viability). To illustrate the modeling principles and highlight computational challenges, the Cornell coarse-grained model of Escherichia coli B/r-A is used to benchmark the proposed framework.

327
The following article is Open access

and

Proteins work as highly efficient machines at the molecular level and are responsible for a variety of processes in all living cells. There is wide interest in understanding these machines because of implications for the biochemical/biotechnology industries as well as for health-related fields. Over the last century, investigations of proteins based on a variety of experimental techniques have provided a wealth of information. More recently, theoretical and computational modeling using large-scale simulations is providing novel insights into the functioning of these machines. The next generation of supercomputers, with petascale computing power, holds great promise as well as challenges for biomolecular simulation scientists. We briefly discuss the progress being made in this area.

334
The following article is Open access

and

Within computational biology, all-atom simulation is the most computationally demanding field in terms of compute load, communication speed, and memory load. Here, we report molecular dynamics simulation results for the ribosome, using 2.64 × 10⁶ atoms, the largest all-atom biomolecular simulation published to date. The ribosome is the largest asymmetric biological structure solved to date at atomic resolution (2.8 Å). While simulations requiring long-range electrostatic forces have previously been restricted to much smaller systems, breakthroughs in electrostatic force calculation and dynamic load balancing have enabled molecular dynamics simulations of large biomolecular complexes. The LANL Q machine, which played a key role in enabling such large simulations, displays approximately 85% parallel scaling efficiency for the ribosome system on 1024 CPUs. Using the targeted molecular dynamics algorithm, we have simulated the rate-limiting step in genetic decoding by the ribosome. The simulations use experimentally determined ribosome structures in different functional states as the initial and final conditions, making them rigorously consistent with these experimental data. The simulations have identified candidate 23S rRNA nucleotides important for the accommodation of tRNA into the ribosome during protein synthesis.
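Targeted molecular dynamics steers a structure toward a known end state by acting on its RMSD from that target. The sketch below gives the gradient of a simple harmonic RMSD restraint, assuming the two structures are already aligned; it illustrates the general idea only and is not the specific TMD constraint algorithm used in the paper.

```python
import numpy as np

def rmsd_bias_force(pos, ref, k, rmsd_target):
    """Force from a harmonic restraint V = 0.5*k*(RMSD - RMSD_target)**2.
    Assumes `pos` is already optimally aligned to `ref` (no fitting is done here)."""
    diff = pos - ref
    n = pos.shape[0]
    rmsd = np.sqrt(np.sum(diff ** 2) / n)
    if rmsd < 1e-12:
        return np.zeros_like(pos), rmsd
    drmsd_dpos = diff / (n * rmsd)                   # gradient of RMSD w.r.t. coordinates
    force = -k * (rmsd - rmsd_target) * drmsd_dpos   # F = -dV/dx
    return force, rmsd

# Toy usage: bias a perturbed structure toward the reference coordinates
ref = np.random.rand(50, 3)
pos = ref + 0.3 * np.random.randn(50, 3)
force, rmsd = rmsd_bias_force(pos, ref, k=100.0, rmsd_target=0.0)
```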

CLIMATE SCIENCE

343
The following article is Open access

, , , and

Atmospheric chemicals and aerosols are interactive components of the Earth system, with implications for climate. As part of the SciDAC climate consortium of laboratories, we have implemented a flexible, state-of-the-art atmospheric chemistry and aerosol capability in the Community Climate System Model (CCSM). We have also developed a fast chemistry mechanism that agrees well with observations and is computationally more efficient than our more complex chemistry mechanisms. We are working with other colleagues to couple this capability with the biospheric and aerosol-cloud interaction capabilities being developed for CCSM to create an Earth system model. However, realising the potential of this Earth system model will require a move from terascale to petascale computing, and the greatest benefit will come from well-balanced computers and a balance between capability and capacity computing.

351
The following article is Open access

, and

The development of the Coupled Colorado State Model (CCoSM) is ultimately motivated by the need to predict and study climate change. All components of CCoSM blend innovative design ideas with advanced computational techniques. The atmospheric model combines a geodesic horizontal grid with a quasi-Lagrangian vertical coordinate to improve the quality of simulations, particularly that of moisture and cloud distributions. Here we briefly describe the dynamical core, physical parameterizations, and computational aspects of the atmospheric model, and present our preliminary numerical results. We also briefly discuss the rationale behind our design choices and selection of computational techniques.

356
The following article is Open access

, , and

The Community Atmosphere Model (CAM) is the atmospheric component of the Community Climate System Model (CCSM) and is the primary consumer of computer resources in typical CCSM simulations. Performance engineering has been an important aspect of CAM development throughout its existence. This paper briefly summarizes these efforts and their impacts over the past five years.

363
The following article is Open access

, , , , , , , , , et al

Described here is the formulation of the CASA' biogeochemistry model of Fung et al., which has recently been coupled to the Community Land Model Version 3 (CLM3) and the Community Climate System Model Version 3 (CCSM3). This model is presently being used for Coupled Climate/Carbon Cycle Model Intercomparison Project (C4MIP) Phase 1 experiments. In addition, CASA' is one of three models – along with CN (Thornton et al.) and IBIS (Thompson et al.) – being run within CCSM to investigate their suitability for use in climate change predictions in a future version of CCSM. All of these biogeochemistry experiments are being performed on the Computational Climate Science End Station (Dr. Warren Washington, Principal Investigator) at the National Center for Computational Sciences at Oak Ridge National Laboratory.

ASTROPHYSICS AND COSMOLOGY

370
The following article is Open access

, , , and

The late stages of stellar evolution have great importance for the synthesis and dispersal of the elements heavier than helium. We focus on the helium shell flash in low mass stars, where incorporation of hydrogen into the convection zone above the helium burning shell can result in production of carbon-13 with tremendous release of energy. The need for detailed 3-D simulations in understanding this process is explained. To make simulations of the entire helium flash event practical, models of turbulent multimaterial mixing and nuclear burning must be constructed and validated. As an example of the modeling and validation process, our recent work on modeling subgrid-scale turbulence in 3-D compressible gas dynamics simulations is described and a new turbulence model presented along with supporting results. Finally, the potential impact of petascale computing hardware on this problem is explored.

385
The following article is Open access

, , , , and

Type Ia supernovae (SNe Ia) are the largest thermonuclear explosions in the Universe. Their light output can be seen across great distances and has led to the discovery that the expansion rate of the Universe is accelerating. Despite the significance of SNe Ia, there are still a large number of uncertainties in current theoretical models. Computational modeling offers the promise to help answer the outstanding questions. However, even with today's supercomputers, such calculations are extremely challenging because of the wide range of length and time scales. In this paper, we discuss several new algorithms for simulations of SNe Ia and demonstrate some of their successes.

393
The following article is Open access

, , , , , and

The overwhelming evidence that the core collapse supernova mechanism is inherently multidimensional, the complexity of the physical processes involved, and the increasing evidence from simulations that the explosion is marginal present great computational challenges for the realistic modeling of this event, particularly in 3 spatial dimensions. We have developed a code, scalable to computations in 3 dimensions, that couples PPM Lagrangian-with-remap hydrodynamics, multigroup flux-limited diffusion neutrino transport (with many improvements), and a nuclear network. The neutrino transport is performed in a ''ray-by-ray-plus'' approximation, wherein all the lateral effects of neutrinos are included (e.g., pressure, velocity corrections, advection) except lateral transport. A moving radial grid option permits the evolution to be carried out from initial core collapse with only modest demands on the number of radial zones. The inner part of the core is evolved after collapse, along with the rest of the core and mantle, by subcycling the lateral evolution near the center as demanded by the small Courant times. We present results of 2-D simulations of a symmetric and an asymmetric collapse of both a 15 and an 11 M☉ progenitor. In each of these simulations we have discovered that, once the oxygen-rich material reaches the shock, there is a synergistic interplay between the reduced ram pressure, the energy released by the burning of the shock-heated oxygen-rich material, and the neutrino energy deposition that leads to a revival of the shock and an explosion.

403
The following article is Open access

, and

During the roughly 20 seconds it shines brightest, a gamma-ray burst (GRB) is over a billion times brighter, in electromagnetic radiation, than an ordinary supernova. The key difference is that GRBs emit an appreciable fraction of their kinetic energy in channeled ultra-relativistic outflows (Lorentz factor Γ > 200). Currently credible models point to rotation as the key factor required to generate the outflows. We explore here the collapse of the core of a massive, rotating star to a black hole and accretion disk and the subsequent propagation of relativistic jets through the star. A variety of high-energy transients may be observed depending on the energy of the jet and the angle at which the explosion is viewed, but there may be a minimum energy for GRBs that last only tens of seconds.

408
The following article is Open access

, and

First results from a fully self-consistent, temperature-dependent equation of state that spans the density range of neutron stars and supernova cores above neutron drip density are presented. The equation of state (EoS) is calculated using a mean-field Hartree-Fock method in three dimensions (3D). The nuclear interaction is represented by the phenomenological Skyrme model in this work, but the EoS can be obtained in our framework for any suitable form of the nucleon-nucleon effective interaction. The scheme we employ naturally allows for effects such as (i) neutron drip, which results in an external neutron gas, (ii) the variety of exotic nuclear shapes expected for extremely neutron-heavy nuclei, and (iii) the subsequent dissolution of these nuclei into nuclear matter. In this way, the equation of state is calculated across phase transitions without recourse to interpolation techniques between density regimes described by different physical models. EoS tables are calculated over a wide range of densities, temperatures, and proton/neutron ratios on the ORNL NCCS XT3, using up to 2000 processors simultaneously.

413
The following article is Open access

and

We explore the evolution of thermonuclear supernova explosions when the progenitor white dwarf star ignites asymmetrically off-center. Several numerical simulations are carried out in two and three dimensions to test the consequences of different initial flame configurations, such as spherical bubbles displaced from the center, more complex deformed configurations, and teardrop-shaped ignitions. The burning bubbles float towards the surface while releasing energy due to the nuclear reactions. If the energy release is too small to gravitationally unbind the star, the ash sweeps around it once the burning bubble approaches the surface. Collisions in the fuel on the opposite side increase its temperature and density and may, in some cases, initiate a detonation wave which then propagates inward, burning the core of the star and leading to a strong explosion. However, for initial setups in two dimensions that seem realistic from the pre-ignition evolution, as well as for all three-dimensional simulations, the collimation of the surface material is found to be too weak to trigger a detonation.

418
The following article is Open access

, and

We describe a set of codes developed under the auspices of the Terascale Supernova Initiative to simulate the coherent, nonlinear evolution of the neutrino and antineutrino fields in the core collapse supernova environment. Ours are the first simulations to include quantum entanglement of neutrino flavor evolution on different neutrino trajectories. We find that neutrinos and antineutrinos can undergo coherent, collective transformations of their flavors which could affect supernova dynamics and nucleosynthesis.

ENABLING SCIENTIFIC DISCOVERY

APPLIED MATHEMATICS

422
The following article is Open access

and

The numerical simulation of plasmas is a critical tool for inertial confinement fusion (ICF). We have been working to improve the predictive capability of pF3d, a continuum laser-plasma interaction code that couples a continuum hydrodynamic model of an unmagnetized plasma to paraxial wave equations modeling the laser light. Advanced numerical techniques such as local mesh refinement, multigrid, and multifluid Godunov methods have been adapted and applied to nonlinear heat conduction and to multifluid plasma models. We describe these algorithms and briefly demonstrate their capabilities.

433
The following article is Open access

, and

Computational scientists are grappling with increasingly complex, multi-rate applications that couple such physical phenomena as fluid dynamics, electromagnetics, radiation transport, chemical and nuclear reactions, and wave and material propagation in inhomogeneous media. Parallel computers with large storage capacities are paving the way for high-resolution simulations of coupled problems; however, hardware improvements alone will not prove enough to enable simulations based on brute-force algorithmic approaches. To accurately capture nonlinear couplings between dynamically relevant phenomena, often while stepping over rapid adjustments to quasi-equilibria, simulation scientists are increasingly turning to implicit formulations that require a discrete nonlinear system to be solved for each time step or steady state solution. Recent advances in iterative methods have made fully implicit formulations a viable option for solution of these large-scale problems. In this paper, we overview one of the most effective iterative methods, Newton-Krylov, for nonlinear systems and point to software packages with its implementation. We illustrate the method with an example from magnetically confined plasma fusion and briefly survey other areas in which implicit methods have bestowed important advantages, such as allowing high-order temporal integration and providing a pathway to sensitivity analyses and optimization. Lastly, we overview algorithm extensions under development motivated by current SciDAC applications.
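As an illustration of the Jacobian-free Newton-Krylov approach described above, the sketch below solves a small nonlinear boundary-value problem with SciPy's newton_krylov, which forms Jacobian-vector products by finite differences inside an inner Krylov (LGMRES) solve. The particular residual is an arbitrary toy problem, not one of the fusion applications discussed in the paper.

```python
import numpy as np
from scipy.optimize import newton_krylov

n = 100
h = 1.0 / (n + 1)

def residual(u):
    """Discrete residual of -u'' + exp(u) = 1 on (0,1) with u(0) = u(1) = 0."""
    u_ext = np.concatenate(([0.0], u, [0.0]))                  # Dirichlet boundaries
    return (-(u_ext[2:] - 2 * u_ext[1:-1] + u_ext[:-2]) / h**2
            + np.exp(u) - 1.0)

# Outer Newton iteration with matrix-free inner Krylov linear solves.
u0 = np.zeros(n)
u = newton_krylov(residual, u0, method='lgmres')
print("max residual:", np.max(np.abs(residual(u))))
```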

443
The following article is Open access

, , , , , , , and

Multigrid methods are ideal for solving the increasingly large-scale problems that arise in numerical simulations of physical phenomena because their computational costs and memory requirements can scale linearly with the number of degrees of freedom. Historically, however, they have been limited by their applicability to elliptic-type problems and by the need for special handling in their implementation. In this paper, we present an overview of several recent theoretical and algorithmic advances made by the TOPS multigrid partners and their collaborators in extending the applicability of multigrid methods. Specific examples presented include quantum chromodynamics, radiation transport, and electromagnetics.
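The linear-scaling structure referred to here can be seen in a minimal geometric-multigrid V-cycle for the 1D Poisson equation, sketched below with weighted-Jacobi smoothing, injection restriction, and linear interpolation. It shows only the textbook cycle, under simplifying assumptions, and none of the algebraic or application-specific extensions developed by the TOPS partners.

```python
import numpy as np

def smooth(u, f, h, sweeps=3, omega=2/3):
    """Weighted-Jacobi smoothing for -u'' = f with zero Dirichlet boundaries."""
    for _ in range(sweeps):
        u_new = u.copy()
        u_new[1:-1] = ((1 - omega) * u[1:-1]
                       + omega * 0.5 * (u[:-2] + u[2:] + h**2 * f[1:-1]))
        u = u_new
    return u

def residual(u, f, h):
    r = np.zeros_like(u)
    r[1:-1] = f[1:-1] + (u[:-2] - 2 * u[1:-1] + u[2:]) / h**2
    return r

def v_cycle(u, f, h):
    u = smooth(u, f, h)                             # pre-smoothing
    if len(u) <= 3:                                 # coarsest grid: smoothing suffices
        return smooth(u, f, h, sweeps=20)
    r = residual(u, f, h)
    rc = r[::2].copy()                              # injection restriction to even nodes
    ec = v_cycle(np.zeros_like(rc), rc, 2 * h)      # solve coarse-grid error equation
    e = np.zeros_like(u)                            # prolongate by linear interpolation
    e[::2] = ec
    e[1::2] = 0.5 * (ec[:-1] + ec[1:])
    return smooth(u + e, f, h)                      # correct and post-smooth

n = 129                                             # 2**7 + 1 points including boundaries
x = np.linspace(0, 1, n)
f = np.pi**2 * np.sin(np.pi * x)                    # exact solution is sin(pi x)
u = np.zeros(n)
for _ in range(10):
    u = v_cycle(u, f, 1.0 / (n - 1))
print("error vs exact solution:", np.max(np.abs(u - np.sin(np.pi * x))))
```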

453
The following article is Open access

, , , , , , and

Combinatorial algorithms have long played a crucial enabling role in scientific and engineering computations. The importance of discrete algorithms continues to grow with the demands of new applications and advanced architectures. This paper surveys some recent developments in this rapidly changing and highly interdisciplinary field.

458
The following article is Open access

Accurate and efficient numerical solution of partial differential equations requires well-formed meshes that are non-inverted, smooth, well-shaped, oriented, and size-adapted. The Mesquite mesh quality improvement toolkit is a software library that applies optimization algorithms to create well-formed meshes via node movement. Mesquite can be run standalone using drivers or called directly from an application code. It can play an essential role in the SLAC accelerator design program, both as a component in automatic shape-optimization software and in manufacturing defect-correction studies, by smoothly deforming meshes in response to geometric domain deformations guided by the optimization of design parameters. Mesquite has also been applied to problems in fusion, biology, and propellant burn studies.
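Node-movement mesh improvement of the kind Mesquite performs can be illustrated, in its simplest form, by Laplacian smoothing, which relaxes each interior node toward the average of its neighbours. The sketch below is that generic idea on a toy distorted grid; it is not Mesquite's optimization-based algorithms or its API.

```python
import numpy as np

def laplacian_smooth(nodes, edges, is_boundary, iterations=20, relax=0.5):
    """Move each interior node toward the centroid of its edge-connected
    neighbours; boundary nodes stay fixed."""
    nodes = nodes.copy()
    nbrs = [[] for _ in range(len(nodes))]          # adjacency lists from the edge list
    for a, b in edges:
        nbrs[a].append(b)
        nbrs[b].append(a)
    for _ in range(iterations):
        new_nodes = nodes.copy()
        for i, nb in enumerate(nbrs):
            if is_boundary[i] or not nb:
                continue
            centroid = nodes[nb].mean(axis=0)
            new_nodes[i] = (1 - relax) * nodes[i] + relax * centroid
        nodes = new_nodes
    return nodes

# Toy structured grid with a perturbed interior
nx = ny = 6
xs, ys = np.meshgrid(np.linspace(0, 1, nx), np.linspace(0, 1, ny))
nodes = np.column_stack([xs.ravel(), ys.ravel()])
is_boundary = ((nodes[:, 0] == 0) | (nodes[:, 0] == 1) |
               (nodes[:, 1] == 0) | (nodes[:, 1] == 1))
nodes[~is_boundary] += 0.08 * np.random.randn((~is_boundary).sum(), 2)
edges = [(j * nx + i, j * nx + i + 1) for j in range(ny) for i in range(nx - 1)] + \
        [(j * nx + i, (j + 1) * nx + i) for j in range(ny - 1) for i in range(nx)]
smoothed = laplacian_smooth(nodes, edges, is_boundary)
```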

463
The following article is Open access

, , and

In this document, the derivation of a multiscale stabilized finite element method for the streamfunction formulation of the two-dimensional, incompressible Navier-Stokes equations is motivated and outlined. A linearized model problem is developed and analyzed through a variational multiscale approach to determine the form of the stabilized terms.
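For reference, the streamfunction formulation referred to here eliminates the pressure and enforces incompressibility identically by writing the velocity as derivatives of a scalar ψ. The standard unforced form with kinematic viscosity ν is shown below as background; the paper's sign and forcing conventions may differ.

```latex
u = \frac{\partial \psi}{\partial y}, \qquad
v = -\frac{\partial \psi}{\partial x}, \qquad
\frac{\partial (\nabla^2 \psi)}{\partial t}
  + \frac{\partial \psi}{\partial y}\,\frac{\partial (\nabla^2 \psi)}{\partial x}
  - \frac{\partial \psi}{\partial x}\,\frac{\partial (\nabla^2 \psi)}{\partial y}
  = \nu\,\nabla^4 \psi
```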

COMPUTER SCIENCE

468
The following article is Open access

, , , , , , and

Although an increasing amount of middleware has emerged in the last few years to achieve remote data access, distributed job execution, and data management, orchestrating these technologies with minimal overhead remains a difficult task for scientists. Scientific workflow systems improve this situation by creating interfaces to a variety of technologies and automating the execution and monitoring of workflows. Workflow systems provide domain-independent, customizable interfaces and tools that combine different technologies along with efficient methods for using them. As simulations and experiments move into the petascale regime, the orchestration of long-running, data- and compute-intensive tasks is becoming a major requirement for the successful steering and completion of scientific investigations.

A scientific workflow is the process of combining data and processes into a configurable, structured set of steps that implement semi-automated computational solutions of a scientific problem. Kepler is a cross-project collaboration, co-founded by the SciDAC Scientific Data Management (SDM) Center, whose purpose is to develop a domain-independent scientific workflow system. It provides a workflow environment in which scientists design and execute scientific workflows by specifying the desired sequence of computational actions and the appropriate data flow, including required data transformations, between these steps. Currently deployed workflows range from local analytical pipelines to distributed, high-performance and high-throughput applications, which can be both data- and compute-intensive. The scientific workflow approach offers a number of advantages over traditional scripting-based approaches, including ease of configuration, improved reusability and maintenance of workflows and components (called actors), automated provenance management, ''smart'' re-running of different versions of workflow instances, on-the-fly updateable parameters, monitoring of long running tasks, and support for fault-tolerance and recovery from failures.

We present an overview of common scientific workflow requirements and their associated features which are lacking in current state-of-the-art workflow management systems. We then illustrate features of the Kepler workflow system, both from a user's and a ''workflow engineer's'' point-of-view. In particular, we highlight the use of some of the current features of Kepler in several scientific applications, as well as upcoming extensions and improvements that are geared specifically for SciDAC user communities.

479
The following article is Open access

, , , , , and

Computational chemists are using Common Component Architecture (CCA) technology to increase the parallel scalability of their application ten-fold. Combustion researchers are publishing science faster because the CCA manages software complexity for them. Both the solver and meshing communities in SciDAC are converging on community interface standards as a direct response to the novel level of interoperability that CCA presents. Yet, there is much more to do before component technology becomes mainstream computational science. This paper highlights the impact that the CCA has made on scientific applications, conveys some lessons learned from five years of the SciDAC program, and previews where applications could go with the additional capabilities that the CCA has planned for SciDAC 2.

494
The following article is Open access

and

This article describes the motivation, design and implementation of Berkeley Lab Checkpoint/Restart (BLCR), a system-level checkpoint/restart implementation for Linux clusters that targets the space of typical High Performance Computing applications, including MPI. Application-level solutions, including both checkpointing and fault-tolerant algorithms, are recognized as more time and space efficient than system-level checkpoints, which cannot make use of any application-specific knowledge. However, system-level checkpointing allows for preemption, making it suitable for responding to ''fault precursors'' (for instance, elevated error rates from ECC memory or network CRCs, or elevated temperature from sensors). Preemption can also increase the efficiency of batch scheduling; for instance reducing idle cycles (by allowing for shutdown without any queue draining period or reallocation of resources to eliminate idle nodes when better fitting jobs are queued), and reducing the average queued time (by limiting large jobs to running during off-peak hours, without the need to limit the length of such jobs). Each of these potential uses makes BLCR a valuable tool for efficient resource management in Linux clusters.

500
The following article is Open access

The data from scientific simulations, observations, and experiments are now being measured in terabytes and will soon reach the petabyte regime. The size of the data, as well as its complexity, makes it difficult to find useful information in the data. This is of course disconcerting to scientists who wonder about the science still undiscovered in the data. The Sapphire scientific data mining project is addressing this concern by applying data mining techniques to problems ranging in size from a few megabytes to a hundred terabytes in a variety of domains. In this paper, we briefly describe our work in several applications, including the identification of key features for edge harmonic oscillations in the DIII-D tokamak, classification of orbits in a Poincaré plot, and tracking of features of interest in experimental images.

505
The following article is Open access

, , , , , , , , , et al

Ultrascale computing and high-throughput experimental technologies have enabled the production of scientific data about complex natural phenomena. With this opportunity comes a new problem – the massive quantities of data so produced. Answers to fundamental questions about the nature of those phenomena remain largely hidden in the produced data. The goal of this work is to provide a scalable, high-performance statistical data analysis framework to help scientists perform interactive analyses of these raw data to extract knowledge. Towards this goal we have been developing an open-source parallel statistical analysis package, called Parallel R, that lets scientists employ a wide range of statistical analysis routines on high-performance shared and distributed memory architectures without having to deal with the intricacies of parallelizing these routines.

510
The following article is Open access

, , , , , , , , , et al

With support from the U.S. Department of Energy's Scientific Discovery through Advanced Computing (SciDAC) program, we have developed and deployed the Earth System Grid (ESG) to make climate simulation data easily accessible to the global climate modeling and analysis community. ESG currently has 2500 registered users and manages 160 TB of data in archives distributed around the nation. In the past year alone, more than 200 scientific journal articles have been published based on analyses of data delivered by the ESG.

515
The following article is Open access

, , , , , , , and

The Department of Energy's (DOE) Office of Science is the largest supporter of basic research in the physical sciences in the US. It directly supports the research of 15,000 PhDs, postdocs, and graduate students, and operates major scientific facilities at DOE laboratories that serve the entire US research community: other Federal agencies, universities, and industry, as well as the international research and education (R&E) community. ESnet's mission is to provide the network infrastructure that supports the mission of the Office of Science (SC). ESnet must evolve substantially in order to continue meeting the Office of Science mission needs, and this paper discusses the development of ESnet's strategy to meet these requirements through a new network architecture and implementation approach.

521
The following article is Open access

, , , , , and

The Globus Toolkit Monitoring and Discovery System (MDS4) defines and implements mechanisms for service and resource discovery and monitoring in distributed environments. MDS4 is distinguished from previous similar systems by its extensive use of interfaces and behaviors defined in the WS-Resource Framework and WS-Notification specifications, and by its deep integration into essentially every component of the Globus Toolkit. We describe the MDS4 architecture and the Web service interfaces and behaviors that allow users to discover resources and services, monitor resource and service states, receive updates on current status, and visualize monitoring results. We present two current deployments to provide insights into the functionality that can be achieved via the use of these mechanisms.

VISUALIZATION

526
The following article is Open access

, , , , , , and

Cosmological simulations follow the formation of nonlinear structure in dark and luminous matter. The associated simulation volumes and dynamic range are very large, making visualization both a necessary and a challenging aspect of the analysis of these datasets. Our goal is to understand sources of inconsistency between different simulation codes that are started from the same initial conditions. Quantitative visualization supports the definition of, and reasoning about, analytically defined features of interest. Comparative visualization supports the ability to visually study, side by side, multiple related visualizations of these simulations. For instance, a scientist can visually observe that there are fewer halos (localized lumps of tracer particles) in low-density regions for one simulation code out of a collection. This qualitative result enables the scientist to develop a hypothesis, such as loss of halos in low-density regions due to limited resolution, to explain the inconsistency between the different simulations. Quantitative support then allows one to confirm or reject the hypothesis. If the hypothesis is rejected, this step may lead to new insights and a new hypothesis not available from the purely qualitative analysis. We present methods to significantly improve the scientific analysis process by incorporating quantitative analysis as the driver for visualization. Aspects of this work are included in two visualization tools: ParaView, an open-source large-data visualization tool, and Scout, an analysis-language-based, hardware-accelerated visualization tool.

535
The following article is Open access

New challenges for scientists have emerged in the past several years as the size of data generated from simulations has experienced exponential growth. One major factor contributing to this growth is the increasingly widespread ability to perform very large scale time-varying simulations. To analyze complex dynamic phenomena in a time-varying data set, it is necessary to navigate and browse the data in both the spatial and temporal domains, select data at different resolutions, experiment with different visualization parameters, and compute and animate selected features over a period of time. In this paper, we present several algorithms for visualizing large-scale time-varying scientific data, including: (1) lossless spatio-temporal data encoding and indexing schemes allowing for interactive visualization of time-varying data at arbitrary spatial and temporal scales; (2) coherence-based accelerated visualization algorithms; and (3) time-varying feature enhancement and tracking algorithms. Our goals are to minimize visualization computation cost, to minimize data transfer (network transmission and disk I/O) cost, to maximize the user's ability to interrogate time-varying data at different spatial and temporal scales, and to detect and track important time-varying features.

545
The following article is Open access

, , , , , , and

This paper describes recent work on securing a Web-browser-based remote visualization capability for large datasets. The results from a security performance study are presented.

550
The following article is Open access

, , , , and

Computational astrophysics and climate dynamics are two principal application foci at the Center for Computational Sciences (CCS) at Oak Ridge National Laboratory (ORNL). We identify a dataset frontier that is shared by several SciDAC computational science domains and present an exploration of traditional production visualization techniques enhanced with new enabling research technologies such as advanced parallel occlusion culling and high-resolution small-multiples statistical analysis. In collaboration with our research partners, these techniques will allow the visual exploration of a new generation of petascale datasets that cross this data frontier along all axes.

556
The following article is Open access

, , , , , and

When a heavy fluid is placed above a light fluid, tiny vertical perturbations in the interface create a characteristic structure of rising bubbles and falling spikes known as Rayleigh-Taylor instability. Rayleigh-Taylor instabilities have received much attention over the past half-century because of their importance in understanding many natural and man-made phenomena, ranging from the rate of formation of heavy elements in supernovae to the design of capsules for Inertial Confinement Fusion. We present a new approach to analyze Rayleigh-Taylor instabilities in which we extract a hierarchical segmentation of the mixing envelope surface to identify bubbles and analyze analogous segmentations of fields on the original interface plane. We compute meaningful statistical information that reveals the evolution of topological features and corroborates the observations made by scientists.
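A drastically simplified stand-in for the bubble-identification step is shown below: thresholding a 2D mixing field and counting connected components with scipy.ndimage. The paper's actual analysis uses a hierarchical segmentation of the mixing-envelope surface, which this sketch does not attempt to reproduce; the field here is synthetic.

```python
import numpy as np
from scipy import ndimage

# Synthetic "mixing fraction" field standing in for simulation data
rng = np.random.default_rng(1)
field = ndimage.gaussian_filter(rng.random((512, 512)), sigma=8)

# Threshold to mark candidate bubble regions, then label connected components
mask = field > np.quantile(field, 0.9)
labels, n_bubbles = ndimage.label(mask)
sizes = np.asarray(ndimage.sum(mask, labels, index=np.arange(1, n_bubbles + 1)))

print(f"{n_bubbles} candidate bubbles; mean size {sizes.mean():.1f} cells")
```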

561
The following article is Open access

, , , , , , , , , et al

This project focuses on leveraging scientific visualization and analytics software technology as an enabling technology for increasing scientific productivity and insight. Advances in computational technology have resulted in an ''information big bang'', which in turn has created a significant data understanding challenge. This challenge is widely acknowledged to be one of the primary bottlenecks in contemporary science. The vision for our Center is to respond directly to that challenge by adapting, extending, creating when necessary, and deploying visualization and data understanding technologies for our science stakeholders. Organized as a Visualization and Analytics Center for Enabling Technologies (VACET), we are well positioned to respond to the needs of a diverse set of scientific stakeholders in a coordinated fashion, using a range of visualization, mathematics, statistics, computer science, computational science, and data management technologies.

570
The following article is Open access

, , , and

Over the years, homogeneous computer clusters have been the most popular, and in some sense the only viable, platform for parallel visualization. In this work, we designed an execution environment for data-intensive visualization that is suitable for handling SciDAC-scale datasets. This environment is based solely on computers distributed across the Internet that are owned and operated by independent institutions while being openly shared for free. Such Internet computers are inherently heterogeneous in hardware configuration and run a variety of operating systems. Using 100 processors of this kind, we have been able to obtain the same level of performance offered by a 64-node cluster of 2.2 GHz P4 processors while processing a 75 GB subset of TSI simulation data. Due to its inherently shared nature, this execution environment for data-intensive visualization could provide a viable means of collaboration among geographically separated SciDAC scientists.