Table of contents

Volume 16

2005


SciDAC 2005, SCIENTIFIC DISCOVERY THROUGH ADVANCED COMPUTING 26–30 June 2005, San Francisco, CA, USA

Published online: 31 August 2005

PREFACE

E01
The following article is Open access

On 26–30 June 2005 at the Grand Hyatt on Union Square in San Francisco several hundred computational scientists from around the world came together for what can certainly be described as a celebration of computational science. Scientists from the SciDAC Program and scientists from other agencies and nations were joined by applied mathematicians and computer scientists to highlight the many successes in the past year where computation has led to scientific discovery in a variety of fields: lattice quantum chromodynamics, accelerator modeling, chemistry, biology, materials science, Earth and climate science, astrophysics, and combustion and fusion energy science. Also highlighted were the advances in numerical methods and computer science, and the multidisciplinary collaboration cutting across science, mathematics, and computer science that enabled these discoveries.

The SciDAC Program was conceived and funded by the US Department of Energy Office of Science. It is the Office of Science's premier computational science program founded on what is arguably the perfect formula: the priority and focus is science and scientific discovery, with the understanding that the full arsenal of `enabling technologies' in applied mathematics and computer science must be brought to bear if we are to have any hope of attacking and ultimately solving today's computational Grand Challenge problems. The SciDAC Program has been in existence for four years, and many of the computational scientists funded by this program will tell you that the program has given them the hope of addressing their scientific problems in full realism for the very first time. Many of these scientists will also tell you that SciDAC has also fundamentally changed the way they do computational science.

We begin this volume with one of DOE's great traditions, and core missions: energy research. As we will see, computation has been seminal to the critical advances that have been made in this arena. Of course, to understand our world, whether it is to understand its very nature or to understand it so as to control it for practical application, will require explorations on all of its scales. Computational science has been no less an important tool in this arena than it has been in the arena of energy research. From explorations of quantum chromodynamics, the fundamental theory that describes how quarks make up the protons and neutrons of which we are composed, to explorations of the complex biomolecules that are the building blocks of life, to explorations of some of the most violent phenomena in our universe and of the Universe itself, computation has provided not only significant insight, but often the only means by which we have been able to explore these complex, multicomponent systems and by which we have been able to achieve scientific discovery and understanding.

While our ultimate target remains scientific discovery, it certainly can be said that at a fundamental level the world is mathematical. Equations ultimately govern the evolution of the systems of interest to us, be they physical, chemical, or biological systems. The development and choice of discretizations of these underlying equations is often a critical deciding factor in whether or not one is able to model such systems stably, faithfully, and practically, and in turn, the algorithms to solve the resultant discrete equations are the complementary, critical ingredient in the recipe to model the natural world. The use of parallel computing platforms, especially at the TeraScale, and the trend toward even larger numbers of processors, continue to present significant challenges in the development and implementation of these algorithms.

Computational scientists often speak of their `workflows'. A workflow, as the name suggests, is the sum total of all complex and interlocking tasks, from simulation set-up, execution, and I/O, to visualization and scientific discovery, through which the advancement in our understanding of the natural world is realized. For the computational scientist, enabling such workflows presents myriad, significant challenges, and it is computer scientists who are called upon at such times to address these challenges. Simulations are currently generating data at the staggering rate of tens of TeraBytes per simulation, over the course of days. In the next few years, these data generation rates are expected to climb exponentially to hundreds of TeraBytes per simulation, performed over the course of months. The output, management, movement, analysis, and visualization of these data will be our key to unlocking the scientific discoveries buried within the data. And there is no hope of generating such data to begin with, or of scientific discovery, without stable computing platforms and a sufficiently high and sustained performance of scientific applications codes on them.

Thus, scientific discovery in the realm of computational science at the TeraScale and beyond will occur at the intersection of science, applied mathematics, and computer science. The SciDAC Program was constructed to mirror this reality, and the pages that follow are a testament to the efficacy of such an approach.

We would like to acknowledge the individuals on whose talents and efforts the success of SciDAC 2005 was based. Special thanks go to Betsy Riley for her work on the SciDAC 2005 Web site and meeting agenda, for lining up our corporate sponsors, for coordinating all media communications, and for her efforts in processing the proceedings contributions, to Sherry Hempfling for coordinating the overall SciDAC 2005 meeting planning, for handling a significant share of its associated communications, and for coordinating with the ORNL Conference Center and Grand Hyatt, to Angela Harris for producing many of the documents and records on which our meeting planning was based and for her efforts in coordinating with ORNL Graphics Services, to Angie Beach of the ORNL Conference Center for her efforts in procurement and setting up and executing the contracts with the hotel, and to John Bui and John Smith for their superb wireless networking and A/V set up and support. We are grateful for the relentless efforts of all of these individuals, their remarkable talents, and for the joy of working with them during this past year. They were the cornerstones of SciDAC 2005. Thanks also go to Kymba A'Hearn and Patty Boyd for on-site registration, Brittany Hagen for administrative support, Bruce Johnston for netcast support, Tim Jones for help with the proceedings and Web site, Sherry Lamb for housing and registration, Cindy Lathum for Web site design, Carolyn Peters for on-site registration, and Dami Rich for graphic design. And we would like to express our appreciation to the Oak Ridge National Laboratory, especially Jeff Nichols, the Argonne National Laboratory, the Lawrence Berkeley National Laboratory, and to our corporate sponsors, Cray, IBM, Intel, and SGI, for their support.

We would like to extend special thanks also to our plenary speakers, technical speakers, poster presenters, and panelists for all of their efforts on behalf of SciDAC 2005 and for their remarkable achievements and contributions. We would like to express our deep appreciation to Lali Chatterjee, Graham Douglas and Margaret Smith of the Institute of Physics Publishing, who worked tirelessly to provide us with this finished volume within two months, which is nothing short of miraculous.

Finally, we wish to express our heartfelt thanks to Michael Strayer, SciDAC Director, whose vision it was to focus SciDAC 2005 on scientific discovery, around which all of the excitement we experienced revolved, and to our DOE SciDAC program managers, especially Fred Johnson, for their support, input, and help throughout.

OPENING REMARKS

E02
The following article is Open access

Good morning. Welcome to SciDAC 2005 and San Francisco.

SciDAC is all about computational science and scientific discovery. In a large sense, computational science characterizes SciDAC, and its intent is change. It transforms both our approach to science and our understanding of it. It opens new doors and crosses traditional boundaries in the pursuit of discovery. Measured against twentieth-century methodologies, computational science may fairly be described as transformational.

A number of examples illustrate this point. First are the sciences that encompass climate modeling. The application of computational science has in essence created the field of climate modeling. This community is now international in scope and has produced precision results that are challenging our understanding of our environment.

A second example is that of lattice quantum chromodynamics. Lattice QCD, while adding precision and insight to our fundamental understanding of strong interaction dynamics, has transformed our approach to particle and nuclear science. The individual investigator approach has evolved to teams of scientists from different disciplines working side-by-side towards a common goal.

SciDAC is also undergoing a transformation. This meeting is a prime example. Last year it was a small programmatic meeting tracking progress in SciDAC. This year, we have a major computational science meeting with a variety of disciplines and enabling technologies represented. SciDAC 2005 should position itself as a new cornerstone for Computational Science and its impact on science.

As we look to the immediate future, FY2006 will bring a new cycle to SciDAC. Most of the program elements of SciDAC will be re-competed in FY2006. The re-competition will involve new instruments for computational science, new approaches for collaboration, as well as new disciplines. There will be new opportunities for virtual experiments in carbon sequestration, fusion, and nuclear power and nuclear waste, as well as collaborations with industry and virtual prototyping. New instruments of collaboration will include institutes and centers while summer schools, workshops and outreach will invite new talent and expertise.

Computational science adds new dimensions to science and its practice. Disciplines of fusion, accelerator science, and combustion are poised to blur the boundaries between pure and applied science. As we open the door into FY2006 we shall see a landscape of new scientific challenges: in biology, chemistry, materials, and astrophysics to name a few.

The enabling technologies of SciDAC have been transformational as drivers of change. Planning for major new software systems assumes a baseline employing the Common Component Architecture, which has become a household word for new software projects. While grid algorithms and mesh refinement software have transformed applications software, data management and visualization have transformed our understanding of science from data. The Gordon Bell prize now seems to be dominated by computational science and by solvers developed by the TOPS ISIC.

The priorities of the Office of Science in the Department of Energy are clear. The 20-year facilities plan is driven by new science. High performance computing is among the two highest priorities. Moore's law says that by the end of the next cycle of SciDAC we shall have petaflop computers. The challenges of petascale computing are enormous. These and the associated computational science are the highest priorities for computing within the Office of Science. Our effort in Leadership Class computing is just a first step towards this goal.

Clearly, computational science at this scale will face enormous challenges and possibilities. Performance evaluation and prediction will be critical to unraveling the needed software technologies. We must not lose sight of our overarching goal—that of scientific discovery. Science does not stand still and the landscape of science discovery and computing holds immense promise. In this environment, I believe it is necessary to institute a system of science-based performance metrics to help quantify our progress towards science goals and scientific computing.

As a final comment I would like to reaffirm that the shifting landscapes of science will force changes to our computational sciences, and leave you with the quote from Richard Hamming, 'The purpose of computing is insight, not numbers'.

KEYNOTE

E03
The following article is Open access

The goal of the `Scientific Discovery through Advanced Computing' (SciDAC) program was to build the scientific computing software and hardware infrastructure needed to realize the full potential of terascale computers for advancing scientific discovery and the state of the art in engineering. Although funding for the program was modest, it has clearly demonstrated the benefits of teaming computational scientists, computer scientists and applied mathematicians in tackling forefront scientific problems. DOE's science programs are now faced with a number of challenges, e.g., multi-scale, multi-physics problems, which require advanced modelling and simulation capabilities on petascale computers. At the same time, computing technologies are undergoing significant changes, some of which will require new approaches if the power of these new technologies is to be harnessed. A continuing commitment to SciDAC is needed.

PUSHING THE FRONTIERS OF ENERGY RESEARCH

FUSION ENERGY

1
The following article is Open access


Since its introduction in the early 1980s, the gyrokinetic particle-in-cell (PIC) method has been very successfully applied to the exploration of many important kinetic stability issues in magnetically confined plasmas. Its self-consistent treatment of charged particles and the associated electromagnetic fluctuations makes this method appropriate for studying enhanced transport driven by plasma turbulence. Advances in algorithms and computer hardware have led to the development of a parallel, global, gyrokinetic code in full toroidal geometry, the gyrokinetic toroidal code (GTC), developed at the Princeton Plasma Physics Laboratory. It has proven to be an invaluable tool to study key effects of low-frequency microturbulence in fusion plasmas. As a high-performance computing applications code, its flexible mixed-model parallel algorithm has allowed GTC to scale to over a thousand processors, a configuration routinely used for production simulations. Improvements are continuously being made. As the US ramps up its support for the International Thermonuclear Experimental Reactor (ITER), the need for understanding the impact of turbulent transport in burning plasma fusion devices is of utmost importance. Accordingly, the GTC code is at the forefront of the set of numerical tools being used to assess and predict the performance of ITER on critical issues such as the efficiency of energy confinement in reactors.

16
The following article is Open access


Recent progress in gyrokinetic particle-in-cell simulations of turbulent plasmas using the gyrokinetic toroidal code (GTC) is surveyed. In particular, recent results for electron temperature gradient (ETG) modes and their resulting transport are presented. Also, turbulence spreading, and the effects of the parallel nonlinearity, are described. The GTC code has also been generalized for non-circular plasma cross-section, and initial results are presented. In addition, two distinct methods of generalizing the GTC code to be electromagnetic are described, along with preliminary results. Finally, a related code, GTC-Neo, for calculating neoclassical fluxes, electric fields, and velocities, is described.

25
The following article is Open access


Peak performance for magnetically confined fusion plasmas occurs near thresholds of instability for macroscopic modes that distort and possibly disrupt equilibrium conditions. In some cases, however, the best approach is to exceed stability thresholds and rely upon nonlinear saturation effects. Understanding this behaviour is essential for achieving ignition in future burning plasma experiments, and advances in large-scale numerical simulation have an important role. High-order finite elements permit accurate representation of extreme anisotropies associated with the magnetic field, and a new implicit algorithm has been developed for advancing the two-fluid model. Implementation of parallel direct methods for solving the sparse matrices makes the approach practical. The resulting performance improvements are presently being applied to investigate the evolution of 'edge localized modes' including important drift effects.

35
The following article is Open access


In a magnetized plasma, such as in fusion devices or the Earth's magnetosphere, several different kinds of waves can simultaneously exist, having very different physical properties. Under the right conditions one wave can quite suddenly convert to another type. Depending on the case, this can be either a great benefit or a problem for the use of waves to heat and control fusion plasmas. Understanding and accurately modeling such behavior is a major computational challenge.

40
The following article is Open access


We present a cell-centered semi-implicit algorithm for solving the equations of single fluid resistive MHD for block structured adaptive meshes. The unsplit method [1] is extended for the ideal MHD part, and the diffusive terms are solved implicitly. The resulting second-order accurate scheme is conservative while preserving the ∇ · B = 0 constraint. Numerical results from a variety of verification tests are presented.
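
For orientation, the resistive induction equation and the solenoidal constraint referred to above take the schematic form (assuming, for illustration, a uniform resistivity η; the paper's precise formulation may differ):

    ∂B/∂t = ∇ × (v × B) + η ∇²B,        ∇ · B = 0.

A scheme of this kind is called conservative when the discrete update preserves the cell-integrated mass, momentum, energy, and magnetic flux, and solving the diffusive terms implicitly avoids the severe time-step restriction they would otherwise impose.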

49
The following article is Open access


The National Fusion Collaboratory Project is developing a persistent infrastructure to enable scientific collaboration for all aspects of magnetic fusion energy research by creating a robust, user-friendly collaborative environment and deploying this to the more than one thousand fusion scientists in forty institutions who perform magnetic fusion research in the US. Work specifically focusing on advancing real-time interpretation of fusion experiments includes collocated collaboration in tokamak control rooms via shared display walls, remote collaboration using Internet based audio and video, and pseudo-real-time data analysis via the National Fusion Energy Grid (FusionGrid). The technologies being developed and deployed will also scale to the next generation experimental devices such as ITER.

54
The following article is Open access


Extended MHD effects are important in the nonlinear behavior of magnetically confined plasmas, even at the simplest level represented by the self-consistent diamagnetic (B × ∇p) drifts. Allowing the electrons and ions to move independently, even as fluids, breaks certain geometrical symmetries preserved by the MHD equations that can be important for toroidal fusion burning plasmas. These symmetries are also broken by certain experimental designs and high temperature plasma conditions. Results are shown from the two-fluid and hybrid particle/fluid models in the M3D MPP code, part of the SciDAC CEMM (Center for Extended Magnetohydrodynamic Modeling) project.

59
The following article is Open access


A general geometry model has been developed for the gyrokinetic toroidal code GTC with a number of highly desirable features including a systematic treatment of plasma rotation and equilibrium E × B flow, realistic plasma profiles and corresponding magnetohydrodynamic (MHD) equilibria. A symmetry coordinate system is used to construct a relatively regular mesh in real space for strongly shaped toroidal plasmas, which also facilitates straightforward visualization. By rescaling the radial coordinate, grid size is correlated with the local gyroradius which may vary substantially from the core to the edge. Gyrokinetic transformation of potential and charge density between particle and guiding center positions in general geometry is carefully treated, taking into account the finite ratio of the poloidal to the total field (Bθ/B). The applied equilibrium E × B flow, which is believed to play an important role in determining the turbulence level, is calculated from our global neoclassical particle code GTC-Neo. In the large aspect ratio circular geometry limit, cross benchmarks with the original GTC code show good agreement in real frequency, growth rate, steady-state heat flux and zonal flow amplitude for the ion temperature gradient driven microinstabilities (ITG modes).

COMBUSTION ENERGY SCIENCE

65
The following article is Open access

, , and

The advancement of our basic understanding of turbulent combustion processes and the development of physics-based predictive tools for design and optimization of the next generation of combustion devices are strategic areas of research for the development of a secure, environmentally sound energy infrastructure. In direct numerical simulation (DNS) approaches, all scales of the reacting flow problem are resolved. However, because of the magnitude of this task, DNS of practical high Reynolds number turbulent hydrocarbon flames is out of reach of even terascale computing. For the foreseeable future, the approach to this complex multi-scale problem is to employ distinct but synergistic approaches to tackle smaller sub-ranges of the complete problem, which then require models for the small scale interactions. With full access to the spatially and temporally resolved fields, DNS can play a major role in the development of these models and in the development of fundamental understanding of the micro-physics of turbulence-chemistry interactions. Two examples, from simulations performed at terascale Office of Science computing facilities, are presented to illustrate the role of DNS in delivering new insights to advance the predictive capability of models. Results are presented from new three-dimensional DNS with detailed chemistry of turbulent non-premixed jet flames, revealing the differences between mixing of passive and reacting scalars, and determining an optimal lower dimensional representation of the full thermochemical state space.

80
The following article is Open access


We have entered a new era in turbulent combustion calculations, where we can now simulate a detailed laboratory-scale turbulent reacting flow with sufficient fidelity that the computed data may be expected to agree with experimental measurements. Moreover, flame simulations can be used to help interpret measured diagnostics, validate evolving flame theories, and generally allow exploration of the system in ways not previously available to experimentalists. In this paper, we will discuss our adaptive projection algorithm for low speed reacting flow that has helped make these types of simulations feasible, and two sets of new issues that are associated with application of this approach to simulating real flames. Using a recently computed flame simulation as an example, we will discuss issues concerning characterization of the experimental conditions and validation of the computed results. We also discuss recent developments in the analysis and interpretation of extremely large and complex reacting flow datasets, and a new approach to simulating premixed turbulent flames relevant to laboratory-scale combustion experiments: a feedback-controlled flame stabilization method.
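
To make the role of the projection concrete, here is a schematic (our notation, not necessarily the authors') of one step of a variable-density projection for low speed reacting flow. The velocity field must satisfy a divergence constraint ∇ · U = S, where S accounts for thermal expansion due to heat release; an intermediate velocity U* is advanced without enforcing the constraint and is then corrected by solving an elliptic equation for a pressure-like correction φ:

    ∇ · ( (1/ρ) ∇φ ) = ∇ · U* − S,        U = U* − (1/ρ) ∇φ,

so that the corrected field satisfies the constraint to the accuracy of the discretization.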

91
The following article is Open access

, , and

This paper provides an overview of recent progress in our development of high-fidelity simulation of turbulent combustion with detailed chemistry. In particular, two major accomplishments are presented and discussed: (a) As for the computational aspects, it was recognized that many existing techniques to treat inflow and outflow boundary conditions for compressible flow simulations suffered from spurious errors when applied to highly turbulent reacting flow problems. Upon careful examination, the sources of these problems have been identified and an improved characteristic boundary condition strategy has been developed. The new method has been applied to various test problems, thereby demonstrating that the improved boundary conditions can successfully reproduce complex combustion events in a finite domain size with desired accuracy and stability. (b) As a science application, more advanced physical models for soot formation and radiative heat transfer have been developed in order to provide fundamental understanding of the interaction among turbulence, chemistry and radiation. We have performed several parametric simulations of two-dimensional ethylene-air nonpremixed counterflow flames interacting with counter-rotating vortex pairs and injected turbulent flows to investigate transient dynamics of the soot formation process. A detailed analysis of the transient characteristics of soot behavior is discussed.

101
The following article is Open access


We briefly review various chemical model reduction strategies with application in reacting flow computations. We focus on systematic methods that enable automated model reduction. We highlight the specific advantages of computational singular perturbation (CSP) analysis. We outline a novel implementation of CSP, with adaptive tabulation of the basis vectors, that enables fast identification of the reduced chemical model at any point in the chemical phase space, and efficient integration of the chemical system. We describe this implementation in the context of a particular model problem that exhibits stiffness typical of chemical kinetic systems.

107
The following article is Open access


New approaches are being explored to facilitate multidisciplinary collaborative research of Homogeneous Charge Compression Ignition (HCCI) combustion processes. In this paper, collaborative sharing of the Range Identification and Optimization Toolkit (RIOT) and related data and models is discussed. RIOT is a developmental approach to reduce the computational complexity of detailed chemical kinetic mechanisms, enabling their use in modeling kinetically controlled combustion applications such as HCCI. These approaches are being developed and piloted as a part of the Collaboratory for Multiscale Chemical Sciences (CMCS) project. The capabilities of the RIOT code are shared through a portlet in the CMCS portal that allows easy specification and processing of RIOT inputs, remote execution of RIOT, tracking of data pedigree, and translation of RIOT outputs to a table view and to a commonly-used chemical model format.

113
The following article is Open access

, , and

Block-structured adaptively refined meshes are an efficient means of discretizing a domain characterized by a large spectrum of spatiotemporal scales. Further, they allow the use of simple data structures (multidimensional arrays) which considerably assist the task of using them in conjunction with sophisticated numerical algorithms. In this work, we show how such meshes may be used with high order (i.e. greater than 2nd order) discretization to achieve greater accuracies at significantly less computational expense, as compared to conventional second order approaches. Our study explores how these high order discretizations are coupled with high-order interpolations and filters to achieve high order convergence on such meshes. One of the side-effects of using high order discretizations is that one now obtains shallow grid hierarchies, which are easier to load balance. As a part of this work, we introduce the concept of bi-level (grid) partitioning and motivate, via an analytical model, how it holds the potential to significantly reduce load-imbalances while incurring a minimal communication cost.

119
The following article is Open access


The overall objective of this paper is to illustrate how detailed numerical modelling may be used to provide basic insight into fundamental problems in combustion science. We consider in the following the interaction of non-premixed flames with cold solid wall surfaces. Flame-wall interactions are an important feature in many combustion systems and result in significant changes in the flame and wall dynamics: the flame strength is reduced near cold wall surfaces, leading possibly to (partial or total) quenching, while the gas-solid heat flux takes peak values at flame contact. The questions of turbulent fuel-air-temperature mixing, flame extinction and wall surface heat transfer are here studied using direct numerical simulation (DNS). The DNS configuration corresponds to an ethylene-air diffusion flame stabilized in the near-wall region of a chemically-inert solid surface. Simulations are performed with adiabatic or isothermal wall boundary conditions, and with different turbulence intensities. The simulations feature flame extinction events resulting from excessive wall cooling, and convective heat transfer up to 90 kW/m². The structure of the simulated wall flames is studied in terms of a classical mass mixing variable, i.e. the fuel-air-based mixture fraction, and a less familiar heat loss variable, i.e. the excess enthalpy variable, introduced to provide a measure of non-adiabatic behavior due to wall cooling.

124
The following article is Open access


Direct numerical simulations (DNS) are used to investigate the ignition of n-heptane fuel spray under high pressure and lean conditions. For the solution of the carrier gas fluid, the Eulerian method is employed, while for the fuel spray, the Lagrangian method is used. A chemistry mechanism for n-heptane with 33 species and 64 reactions is adopted to describe the chemical reactions. Initial carrier gas temperature and pressure are 926 K and 30.56 atmospheres, respectively. The initial global equivalence ratio is 0.258. Two cases with droplet radii of 35.5 and 20.0 microns are simulated. Evolutions of the carrier gas temperature and species mass fractions are presented. Contours of the carrier gas temperature and species mass fractions near ignition and after ignition are presented. The results show that the smaller fuel droplet case ignites earlier than the larger droplet case. For the larger droplet case, ignition occurs first at one location; for the smaller droplet case, however, ignition occurs first at multiple locations. At ignition kernels, significant NO is produced when the temperature is high enough. For the larger droplet case, more NO is produced than in the smaller droplet case due to the inhomogeneous distribution and incomplete mixing of fuel vapor.

SCIENTIFIC DISCOVERY AT ALL SCALES

LATTICE QCD

129
The following article is Open access


After a brief overview of quantum chromodynamics (QCD), the fundamental theory of the strong interactions, we describe the QCDOC computer, its architecture, construction, software and performance. Three 12K-node, 4 Teraflops (sustained) QCDOC computers have been constructed, two at the Brookhaven National Lab and one at the University of Edinburgh. The present status of these machines and their first physics results and objectives are discussed, and the catalytic role of the SciDAC program in enabling the effective use of this new architecture by the US lattice QCD community is outlined.

140
The following article is Open access

The fundamental laws of nature as we now know them are governed by the fundamental parameters of the Standard Model. Some of these, such as the masses of the quarks, have been hidden from direct observation by the confinement of quarks. They are now being revealed through large scale numerical simulation of lattice gauge theory.

150
The following article is Open access


The structure of neutrons, protons, and other strongly interacting particles in terms of their quark and gluon constituents can be calculated from first principles by solving QCD on a discrete space-time lattice. With the advent of SciDAC software and prototype clusters and of DOE supported dedicated lattice QCD computers, it is now possible to calculate physical observables using full QCD in the regime of large lattice volumes and light quark masses that can be compared with experiment. This talk will describe selected examples, including the nucleon axial charge, structure functions, electromagnetic form factors, the origin of the nucleon spin, the transverse structure of the nucleon, and the nucleon to Delta transition form factor.

160
The following article is Open access


By numerical study of the simple bound states of light quarks, in particular the π and K mesons, we are able to deduce fundamental quark properties. Using the "improved staggered" discretization of QCD, the MILC Collaboration has performed a series of simulations of these bound states, including the effects of virtual quark-antiquark pairs ("sea" quarks). From these simulations, we have determined the masses of the up, down, and strange quarks. We find that the up quark mass is not zero (at the 10 sigma level), putting to rest a twenty-year-old suggestion that the up quark could be massless. Further, by studying the decays of the π and K mesons, we are able to determine the "CKM matrix element" Vus of the Weak Interactions. The errors on our result for Vus are comparable to the best previous determinations using alternative theoretical approaches, and are likely to be significantly reduced by simulations now in progress.

165
The following article is Open access

The current status of the SciDAC software infrastructure project for lattice gauge theory is summarized. This includes the design of a QCD application programmer's interface (API) that allows existing and future codes to be run efficiently on Terascale hardware facilities and to be rapidly ported to new dedicated or commercial platforms. The critical components of the API have been implemented and are in use on the US QCDOC hardware at BNL and on both the switched and mesh architecture Pentium 4 clusters at Fermi National Accelerator Laboratory (FNAL) and Thomas Jefferson National Accelerator Facility (JLab). Future software infrastructure requirements and research directions are also discussed.

169
The following article is Open access

In this contribution I summarize recent progress in exploring the phase diagram as well as the properties of strongly interacting matter at extreme conditions using lattice QCD. The current status and prospects of calculating quantities crucial for the RHIC program (quarkonia spectral functions, critical energy density, equation of state) are discussed in detail. The role of the SciDAC program in reaching the outlined goals is emphasized.

174
The following article is Open access


Lattice QCD is an essential complement to the current and anticipated DOE-supported experimental program in hadronic physics. In this paper we address several key questions central to our understanding of the building blocks of nuclear matter, nucleons and pions. Firstly, we describe progress in computing the electromagnetic form factors of the nucleon, describing the distribution of charge and current, before considering the role played by the strange quarks. We then describe the study of transition form factors to the Delta resonance. Finally, we present recent work to determine the pion form factor, complementary to the current JLab experimental determination and providing insight into the approach to asymptotic freedom.

179
The following article is Open access


Our present theory for the elementary particles in nature, the Standard Model, consists of 6 leptons and 6 quarks, plus the 4 bosons which mediate the electromagnetic, weak, and strong forces. The theory has several free parameters which must be constrained by experiment before it is entirely predictive. In Nature quarks never appear alone; only bound states of strongly coupled valence quarks (and/or anti-quarks) are detected. Consequently, the parameters governing quark flavor mixing are difficult to constrain by experiment, which measures properties of the bound states. Numerical simulations are needed to connect the theory of how quarks and gluons interact, quantum chromodynamics (formulated on a spacetime lattice), to the physically observed properties. Recent theory innovations and computer developments have allowed us finally to do lattice QCD simulations with realistic parameters. This paper describes the exciting progress using lattice QCD simulations to determine fundamental parameters of the Standard Model.

ACCELERATOR DESIGN

184
The following article is Open access


Advanced accelerator research is aimed at finding new technologies that can dramatically reduce the size and cost of future high-energy accelerators. Supercomputing is already playing a dramatic and critical role in this quest. One of the goals of the SciDAC Accelerator Modeling Project is to develop code and software that can ultimately be used to discover the underlying science of new accelerator technology and then be used to design future high-energy accelerators with a minimum amount of capital expenditure on large-scale experiments. We describe the existing hierarchy of software tools for modelling advanced accelerators, how these models have been validated against experiment, how the models are benchmarked against each other, and how these tools are being successfully used to elucidate the underlying science.

195
The following article is Open access


Electromagnetic Modelling led by SLAC is a principal component of the "Advanced Computing for 21st Century Accelerator Science and Technology" SciDAC project funded through the Office of High Energy Physics. This large team effort comprises three other national laboratories (LBNL, LLNL, SNL) and six universities (CMU, Columbia, RPI, Stanford, UC Davis and U of Wisconsin) with the goal to develop a set of parallel electromagnetic codes based on unstructured grids to target challenging problems in accelerators, and solve them to unprecedented realism and accuracy. Essential to the code development are the collaborations with the ISICs/SAPP in eigensolvers, meshing, adaptive refinement, shape optimization and visualization (see "Achievements in ISICs/SAPP Collaborations for Electromagnetic Modelling of Accelerators"). Supported by these advances in computational science, we have successfully performed the large-scale simulations that have impacted important accelerator projects across the Office of Science (SC) including the Positron Electron Project (PEP) -II, Next Linear Collider (NLC) and the International Linear Collider (ILC) in High Energy Physics (HEP), the Rare Isotope Accelerator (RIA) in Nuclear Physics (NP) and the Linac Coherent Light Source (LCLS) in Basic Energy Science (BES).

205
The following article is Open access


SciDAC provides the unique opportunity and the resources for the Electromagnetic System Simulations (ESS) component of High Energy Physics (HEP)'s Accelerator Science and Technology (AST) project to work with researchers in the Integrated Software Infrastructure Centres (ISICs) and Scientific Application Pilot Program (SAPP) to overcome challenging barriers in computer science and applied mathematics in order to perform the large-scale simulations required to support the ongoing R&D efforts on accelerators across the Office of Science. This paper presents the resultant achievements made under SciDAC in important areas of computational science relevant to electromagnetic modelling of accelerators which include nonlinear eigensolvers, shape optimization, adaptive mesh refinement, parallel meshing, and visualization.

210
The following article is Open access


SciDAC has had a major impact on computational beam dynamics and the design of particle accelerators. Particle accelerators—which account for half of the facilities in the DOE Office of Science Facilities for the Future of Science 20 Year Outlook—are crucial for US scientific, industrial, and economic competitiveness. Thanks to SciDAC, accelerator design calculations that were once thought impossible are now carried out routinely, and new challenging and important calculations are within reach. SciDAC accelerator modeling codes are being used to get the most science out of existing facilities, to produce optimal designs for future facilities, and to explore advanced accelerator concepts that may hold the key to qualitatively new ways of accelerating charged particle beams. In this paper we present highlights from the SciDAC Accelerator Science and Technology (AST) project Beam Dynamics focus area in regard to algorithm development, software development, and applications.

215
The following article is Open access


High precision modeling of space-charge effects is essential for designing future accelerators as well as optimizing the performance of existing machines. Synergia is a high-fidelity parallel beam dynamics simulation package with fully three dimensional space-charge capabilities and a higher-order optics implementation. We describe the Synergia framework, developed under the auspices of the DOE SciDAC program, and present Synergia simulations of the Fermilab Booster accelerator and comparisons with experiment. Our studies include investigation of coherent and incoherent tune shifts and halo formation.

CHEMISTRY

220
The following article is Open access


In the past thirty years, the use of scientific computing has become pervasive in all disciplines: collection and interpretation of most experimental data is carried out using computers, and physical models in computable form, with various degrees of complexity and sophistication, are utilized in all fields of science. However, full prediction of physical and chemical phenomena based on the basic laws of Nature, using computer simulations, is a revolution still in the making, and it involves some formidable theoretical and computational challenges. We illustrate the progress and successes obtained in recent years in predicting fundamental properties of materials in condensed phases and at the nanoscale, using ab-initio, quantum simulations. We also discuss open issues related to the validation of the approximate, first principles theories used in large scale simulations, and the resulting complex interplay between computation and experiment. Finally, we describe some applications, with focus on nanostructures and liquids, both at ambient and under extreme conditions.

233
The following article is Open access


A short review is given of newly developed fast electronic structure methods that are designed to treat molecular systems with strong electron correlations, such as diradicaloid molecules, for which standard electronic structure methods such as density functional theory are inadequate. These new local correlation methods are based on coupled cluster theory within a perfect pairing active space, containing either a linear or quadratic number of pair correlation amplitudes, to yield the perfect pairing (PP) and imperfect pairing (IP) models. This reduces the scaling of the coupled cluster iterations to no worse than cubic, relative to the sixth power dependence of the usual (untruncated) coupled cluster doubles model. A second order perturbation correction, PP(2), to treat the neglected (weaker) correlations is formulated for the PP model. To ensure minimal prefactors, in addition to favorable size-scaling, highly efficient implementations of PP, IP and PP(2) have been completed, using auxiliary basis expansions. This yields speedups of almost an order of magnitude over the best alternatives using 4-center 2-electron integrals. A short discussion of the scope of accessible chemical applications is given.

243
The following article is Open access


Multiresolution techniques in multiwavelet bases, made practical in three and higher dimensions by separated representations, have enabled significant advances in the accuracy and manner of computation of molecular electronic structure. The mathematical and numerical techniques are described in the article by Fann. This paper summarizes the major accomplishments in computational chemistry which represent the first substantial application of most of these new ideas in three and higher dimensions. These include basis set limit computation with linear scaling for Hartree-Fock and Density Functional Theory with a wide variety of functionals including hybrid and asymptotically corrected forms. Current capabilities include energies, analytic derivatives, and excitation energies from linear response theory. Direct solution in 6-D of the two-particle wave equation has also been demonstrated. These methods are written using MADNESS which provides a high level of composition using functions and operators with guarantees of speed and precision.

247
The following article is Open access


Recent results are presented for the efficient computation of matrix eigenvalue and eigenvector equations, the fitting of discrete sets of energy points using interpolating moving least squares methods, and time-dependent and time-independent calculations of molecular dynamics and cumulative reaction probabilities.

252
The following article is Open access


The Interpolative Moving Least Squares (IMLS) fitting scheme is being developed for the purpose of fitting potential energy surfaces used in chemistry. IMLS allows for automatic surface generation in which the fitting method selects the positions at which expensive electronic structure calculations determine specific values on the surface. The resulting surfaces are necessary for accurate kinetics and dynamics.
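
To illustrate the basic idea behind IMLS, here is a minimal one-dimensional sketch in Python (not the authors' implementation; the inverse-distance weight function, polynomial degree, and sample data are illustrative choices):

    import numpy as np

    def imls_eval(x_query, x_data, y_data, degree=2, eps=1e-3):
        """Evaluate a 1-D interpolative moving least squares fit at x_query.

        A low-order polynomial is fit to (x_data, y_data) by weighted least
        squares, with weights that grow sharply as a data point approaches
        the query point, so the fit passes (nearly) through the data.
        """
        d2 = (x_data - x_query) ** 2
        w = 1.0 / (d2 + eps ** 2)              # inverse-distance weights (illustrative)
        V = np.vander(x_data - x_query, degree + 1, increasing=True)
        W = np.diag(w)
        # Weighted normal equations (V^T W V) c = V^T W y
        coeffs = np.linalg.solve(V.T @ W @ V, V.T @ W @ y_data)
        return coeffs[0]                       # local fit evaluated at x_query

    # Example: sample a Morse-like curve at a few points, evaluate in between
    x_pts = np.linspace(0.8, 3.0, 8)
    y_pts = (1.0 - np.exp(-(x_pts - 1.5))) ** 2
    print(imls_eval(1.9, x_pts, y_pts))

In the automatic surface generation described above, regions where the local fits are judged least reliable are natural candidates for the next expensive electronic structure calculations.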

MATERIALS SCIENCE

257
The following article is Open access


The Cray X1 in the Center for Computational Sciences at Oak Ridge National Laboratory as well as algorithmic improvements over the past decade enable significant new science in the simulation of high-temperature "cuprate" superconductors. We describe the method of dynamic cluster approximation with quantum Monte Carlo, along with its computational requirements. We then show the unique capabilities of the X1 for supporting this method and delivering near optimal performance. This allows us to study systematically the cluster size dependence of the superconductivity in the conventional two-dimensional Hubbard model, which is commonly believed to describe high-temperature superconductors. Due to the non-locality of the d-wave superconducting order parameter, the results on small clusters show large size and geometry effects. In large enough clusters, converged results are found that display a finite temperature instability to d-wave superconductivity. The results we report here demonstrate for the first time that superconductivity is possible in a system of strongly correlated electrons without the need of a phonon mediated attractive interaction.
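
For reference, the conventional two-dimensional Hubbard model mentioned here is defined, in standard notation, by the Hamiltonian

    H = −t Σ_⟨i,j⟩,σ ( c†_iσ c_jσ + c†_jσ c_iσ ) + U Σ_i n_i↑ n_i↓ ,

where t is the nearest-neighbour hopping amplitude on the square lattice, U is the on-site Coulomb repulsion, and n_iσ = c†_iσ c_iσ is the number operator; the d-wave pairing discussed above emerges in these simulations from this purely electronic Hamiltonian.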

269
The following article is Open access


Molecular electronics, in which single organic molecules are designed to perform the functions of transistors, diodes, switches and other circuit elements used in current silicon-based microelectronics, is drawing wide interest as a potential replacement technology for conventional silicon-based lithographically etched microelectronic devices. In addition to their nanoscopic scale, the additional advantage of molecular electronics devices compared to silicon-based lithographically etched devices is the promise of being able to produce them cheaply on an industrial scale using wet chemistry methods (i.e., self-assembly from solution). The design of molecular electronics devices, and the processes to make them on an industrial scale, will require a thorough theoretical understanding of the molecular and higher level processes involved. Hence, the development of modeling techniques for molecular electronics devices is a high priority from both a basic science point of view (to understand the experimental studies in this field) and from an applied nanotechnology (manufacturing) point of view. Modeling molecular electronics devices requires computational methods at all length scales – electronic structure methods for calculating electron transport through organic molecules bonded to inorganic surfaces, molecular simulation methods for determining the structure of self-assembled films of organic molecules on inorganic surfaces, mesoscale methods to understand and predict the formation of mesoscale patterns on surfaces (including interconnect architecture), and macroscopic scale methods (including finite element methods) for simulating the behavior of molecular electronic circuit elements in a larger integrated device. Here we describe a large Department of Energy project involving six universities and one national laboratory aimed at developing integrated multiscale methods for modeling molecular electronics devices. The project is funded equally by the Office of Basic Energy Sciences and the Office of Advanced Scientific Computing Research, both within the Office of Science of the Department of Energy.

273
The following article is Open access

A brief description of the present position of computational materials sciences is presented. Dramatic increases in computing capability together with exciting new scientific frontiers have created unprecedented opportunities for the development and application of computational materials sciences. The balanced growth of the field involves a range of research and styles, from innovative single principal investigators to large interdisciplinary teams able to exploit leadership class computing.

277
The following article is Open access


Researchers at the National Renewable Energy Laboratory and their collaborators have developed over the past ∼10 years a set of algorithms for an atomistic description of the electronic structure of nanostructures, based on plane-wave pseudopotentials and configuration interaction. The present contribution describes the first step in assembling these various codes into a single, portable, integrated set of software packages. This package is part of an ongoing research project in the development stage. Components of NanoPSE include codes for atomistic nanostructure generation and passivation, valence force field model for atomic relaxation, code for potential field generation, empirical pseudopotential method solver, strained linear combination of bulk bands method solver, configuration interaction solver for excited states, selection of linear algebra methods, and several inverse band structure solvers. Although not available for general distribution at this time as it is being developed and tested, the design goal of the NanoPSE software is to provide a software context for collaboration. The software package is enabled by fcdev, an integrated collection of best practice GNU software for open source development and distribution augmented to better support FORTRAN.

283
The following article is Open access


Large-scale quantum electronic structure calculations coupled with nonequilibrium Green function theory are employed for determining quantum conductance on practical length scales. The combination of state-of-the-art quantum mechanical methods, efficient numerical algorithms, and high performance computing allows for realistic evaluation of properties at length scales that are routinely reached experimentally. Two illustrations of the method are presented. First, quantum chemical calculations using up to 10⁴ basis functions are used to investigate the amphoteric doping of carbon nanotubes by encapsulation of organic molecules. As a second example, we investigate the electron transport properties of a Si/organic molecule/Si junction using a numerically optimized basis.

BIOLOGY

287
The following article is Open access

Molecular dynamics simulations enable the study of the time evolution of molecular systems by taking many small successive time steps under atomic forces that are calculated from a parameterized set of interaction functions. These are simple functions describing bonded and non-bonded atomic interactions, so that large molecular systems can be simulated for many time steps. The simulations provide energetic and kinetic properties in the form of statistical ensemble averages. The resulting trajectories can be analyzed for a variety of geometric and kinetic properties and correlations between them.
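
As a minimal sketch of the time-stepping loop just described (illustrative Python in reduced Lennard-Jones units; not any particular production MD code):

    import numpy as np

    def lj_forces(pos, box, rc=2.5):
        """Lennard-Jones forces in reduced units, with a plain cutoff and
        minimum-image periodic boundaries (O(N^2); a sketch, not optimized)."""
        f = np.zeros_like(pos)
        n = len(pos)
        for i in range(n - 1):
            for j in range(i + 1, n):
                r = pos[i] - pos[j]
                r -= box * np.round(r / box)          # minimum image
                r2 = np.dot(r, r)
                if r2 < rc * rc:
                    inv6 = (1.0 / r2) ** 3
                    fmag = 24.0 * inv6 * (2.0 * inv6 - 1.0) / r2
                    f[i] += fmag * r
                    f[j] -= fmag * r
        return f

    def velocity_verlet(pos, vel, box, dt=0.005, steps=1000):
        """Advance positions and velocities with the velocity Verlet scheme
        (unit masses)."""
        f = lj_forces(pos, box)
        for _ in range(steps):
            vel = vel + 0.5 * dt * f                  # first half kick
            pos = (pos + dt * vel) % box              # drift, wrapped into the box
            f = lj_forces(pos, box)
            vel = vel + 0.5 * dt * f                  # second half kick
        return pos, vel

Production codes add neighbour lists, long-range electrostatics, bonded terms, thermostats and constraint algorithms, but the basic cycle, computing forces from the parameterized interaction functions and advancing positions and velocities by a small time step, is the one sketched here.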

300
The following article is Open access


Particle simulations in fields ranging from biochemistry to astrophysics require evaluation of the interactions between all pairs of particles separated by less than some fixed interaction radius. The extent to which such simulations can be parallelized has historically been limited by the time required for inter-processor communication. Recently, Snir and Shaw independently introduced two distinct methods for parallelization that achieve asymptotic and practical advantages over traditional techniques. We give an overview of these methods and show that they represent special cases of a more general class of methods. We describe other methods in this class that can confer advantages over any previously described method in terms of communication bandwidth and latency. Practically speaking, the best choice among the broad category of methods depends on such parameters as the interaction radius, the size of the simulated system, and the number of processors. We analyze the best choice among a subset of these methods across a broad range of parameters.
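
The kernel that these parallelization schemes distribute is the search for all pairs of particles separated by less than the interaction radius. A serial cell-list version of that search (a sketch, assuming a cubic periodic box at least three cutoffs wide) looks like this in Python:

    import numpy as np
    from collections import defaultdict
    from itertools import product

    def pairs_within_cutoff(pos, box, rc):
        """Return all particle pairs closer than rc in a cubic periodic box,
        using a cell list so only neighbouring cells are searched."""
        ncell = int(box // rc)
        assert ncell >= 3, "sketch assumes the box spans at least three cells"
        size = box / ncell
        cells = defaultdict(list)
        for idx, p in enumerate(pos):
            cells[tuple(np.floor(p / size).astype(int) % ncell)].append(idx)
        pairs = []
        for cell, members in cells.items():
            for offset in product((-1, 0, 1), repeat=3):   # this cell and its 26 neighbours
                nb = tuple((c + o) % ncell for c, o in zip(cell, offset))
                for i in members:
                    for j in cells.get(nb, []):
                        if j <= i:
                            continue                        # count each pair once
                        d = pos[i] - pos[j]
                        d -= box * np.round(d / box)        # minimum image
                        if float(d @ d) < rc * rc:
                            pairs.append((i, j))
        return pairs

    # Example: 200 random particles in a box of side 10 with cutoff 1.2
    rng = np.random.default_rng(0)
    print(len(pairs_within_cutoff(rng.uniform(0.0, 10.0, size=(200, 3)), 10.0, 1.2)))

The methods surveyed in the paper differ in how cells, particles, and pairwise computations are assigned to processors, which in turn determines the communication volume and latency each processor incurs.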

305
The following article is Open access


We describe a particle-based simulator called ChemCell that we are developing with the goal of modeling the protein chemistry of biological cells for phenomena where spatial effects are important. Membranes and organelle structure are represented by triangulated surfaces. Diffusing particles represent proteins, complexes, or other biomolecules of interest. Particles interact with their neighbors in accord with Monte Carlo rules to perform biochemical reactions which can represent protein complex formation and dissociation, ligand binding, etc. In this brief paper we give the motivation for such a model, describe a few of the code's features, and highlight interesting computational issues that arise in particle-based cell modeling.

CLIMATE & EARTH SCIENCE

310
The following article is Open access

Simulation Science now stands at a turning point. After the appearance of the Earth Simulator, HEC is struggling with several severe difficulties due to the physical limits of LSI technologies and the so-called latency problem. In this paper I would like to propose one clever way to overcome these difficulties from the simulation algorithm viewpoint. Nature and artificial products are usually organized into several nearly autonomous internal systems (organizations, or layers). The Earth Simulator has given us a really useful scientific tool that can deal with the entire evolution of one internal system with sufficient soundness. In order to make a leap forward in Simulation Science, therefore, it is desirable to design an innovative simulator that enables us to deal, simultaneously and as consistently as possible, with a real system that evolves cooperatively through several internal autonomous systems. Three years' experience of the Earth Simulator Project has led us to one innovative simulation algorithm that removes the technological barrier standing in front of us, which I would like to call the "Macro-Micro Interlocked Algorithm", or "Macro-Micro Multiplying Algorithm"; I present a couple of examples to validate the proposed algorithm. The first example is aurora-arc formation as a result of the mutual interaction between the macroscopic magnetosphere-ionosphere system and the microscopic field-aligned electron and ion system. The second example is local heavy rainfall resulting from the interaction between the global climate evolution and the microscopic raindrop growth process. Based on this innovative and feasible algorithm, I propose a Macro-Micro Multiplying Simulator.

317
The following article is Open access

The development of climate models has a long history, starting with the building of atmospheric models and later ocean models. The early researchers were very aware of the goal of building climate models which could integrate our knowledge of complex physical interactions between atmospheric, land-vegetation, hydrology, ocean, cryospheric processes, and sea ice. The transition from climate models to earth system models is already underway with the coupling of active biochemical cycles. Progress is limited by present computer capability, which is needed for increasingly complex and higher resolution climate model versions. It would be a mistake to make models too complex or too high resolution. Arriving at a "feasible" and useful model is the challenge for the climate model community. Some of the climate change history, scientific successes, and difficulties encountered with supercomputers will be presented.

325
The following article is Open access

We have developed finite difference codes based on the Yin-Yang grid for geodynamo and mantle convection simulations. The Yin-Yang grid is a kind of spherical overset grid composed of two identical component grids. The intrinsic simplicity of its mesh configuration enables us to develop highly optimized simulation codes on massively parallel supercomputers. The Yin-Yang geodynamo code has achieved 15.2 Tflops with 4096 processors on the Earth Simulator, 46% of the theoretical peak performance. The Yin-Yang mantle code has enabled us to carry out mantle convection simulations in realistic regimes, with a Rayleigh number of 10^7 and strongly temperature-dependent viscosity with spatial contrast up to 10^6.
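
For readers unfamiliar with the construction, the definition commonly used in the Yin-Yang literature (stated here as background, not as a detail from this abstract) is: the Yin component is the low-latitude portion of an ordinary latitude-longitude grid, π/4 − δ ≤ θ ≤ 3π/4 + δ and −3π/4 − δ ≤ φ ≤ 3π/4 + δ in colatitude θ and longitude φ, with a small overlap margin δ, and the Yang component is the identical region mapped by the Cartesian rotation (x, y, z) → (−x, z, y); the two component grids tile the sphere with a small overlap handled by interpolation.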

339
The following article is Open access

Cloud processes are very important for the global circulation of the atmosphere. It is now possible, though very expensive, to simulate the global circulation of the atmosphere using a model with resolution fine enough to explicitly represent the larger individual clouds. An impressive preliminary calculation of this type has already been performed by Japanese scientists, using the Earth Simulator. Within the next few years, such global cloud-resolving models (GCRMs) will be applied to weather prediction, and later they will be used in climate-change simulations. The tremendous advantage of GCRMs, relative to conventional lower-resolution global models, is that GCRMs can avoid many of the questionable "parameterizations" used to represent cloud effects in lower-resolution global models. Although cloud microphysics, turbulence, and radiation must still be parameterized in GCRMs, the high resolution of a GCRM simplifies these problems considerably, relative to conventional models. The United States currently has no project to develop a GCRM, although we have both the computer power and the expertise to do it. A research program aimed at development and applications of GCRMs is outlined.

343
The following article is Open access

A global atmosphere/land model with an embedded subgrid orography scheme is used to simulate the period 1977-2100 using ocean surface conditions and radiative constituent concentrations for a climate change scenario. Climate variables simulated for multiple elevation classes are mapped according to a high-resolution elevation dataset in ten regions with complex terrain. Analysis of changes in the simulated climate leads to the following conclusions. Changes in precipitation vary widely, with precipitation increasing more with increasing altitude in some regions, decreasing more with altitude in others, and changing little in still others. In some regions the sign of the precipitation change depends on surface elevation. Changes in surface air temperature are rather uniform, with at most a two-fold difference between the largest and smallest changes within a region; in most cases the warming increases with altitude. Changes in snow water are highly dependent on altitude: absolute changes usually increase with altitude, while relative changes decrease. In places where snow accumulates, an artificial upper bound on snow water considerably limits the sensitivity of snow water to climate change. The simulated impact of climate change on regional mean snow water varies widely, with little impact in regions in which the upper bound on snow water is the dominant snow water sink, moderate impact in regions with a mixture of seasonal and permanent snow, and profound impact in regions with little permanent snow.

348
The following article is Open access

As part of the Arctic Ocean Model Intercomparison Project, the LANL ice-ocean modeling team completed two 55-year, global, ice-ocean simulations forced with atmospheric reanalysis data for 1948-2002. These two simulations differ only in the parameterization used for lateral mixing of tracers (potential temperature and salinity) in the ocean, but the resulting circulation and kinetic energy of the simulated oceans are very different, particularly at high latitudes. The differences can be traced to two effects: (1) scale selectivity, in which the Laplacian form of the Gent and McWilliams (GM) parameterization damps wave energy more quickly than the biharmonic mixing formulation, and (2) grid dependence of the diffusion coefficient, which appears in the biharmonic formulation but not in GM and is particularly important at high latitudes, where the grid scale decreases dramatically on the sphere. We conclude that, in order to maintain consistent suppression of numerical noise while allowing for a more energetic circulation in regions of finer grid spacing, future global simulations using the GM parameterization should include a diffusivity scaling factor given by the square root of the grid cell area.
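
Written out, the recommendation amounts to scaling the GM diffusivity with the local cell area; in illustrative notation (the reference values are not taken from the paper), κ_GM(i,j) = κ_0 √(A(i,j)/A_ref), where A(i,j) is the grid cell area and κ_0 is the diffusivity calibrated at a reference cell area A_ref, so the effective diffusivity decreases where the mesh refines (for example at high latitudes), preserving the suppression of grid-scale noise without over-damping the circulation.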

353
The following article is Open access

An adaptive grid dynamical core for a global atmospheric climate model has been developed. Adaptations allow a smooth transition from hydrostatic to non-hydrostatic physics at fine grid resolutions. The adaptations use a parallel program library for block-wise adaptive grids on the sphere. This library also supports the use of a reduced grid, with coarser resolution in the longitudinal direction as the poles are approached. This permits a longer time step, since the CFL restriction (CFL < 1) on a regular longitude-latitude grid is most severe in the zonal direction at high latitudes. Several tests show that our modelling procedures are stable and accurate.
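
To make the time-step argument explicit (a standard relation, with symbols defined here rather than quoted from the paper): on a regular longitude-latitude grid the zonal CFL number is u Δt / (a cos φ Δλ) ≤ 1, where u is the zonal wind speed, a the Earth's radius, φ the latitude, and Δλ the longitudinal grid spacing. Because a cos φ Δλ shrinks toward the poles, the admissible Δt is dictated by the highest latitudes, and coarsening Δλ there via the reduced grid relaxes the restriction.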

ASTROPHYSICS & COSMOLOGY

358
The following article is Open access

We live in the era of the cosmological concordance model. This refers to the precise set of cosmological parameters which describe the average composition, geometry, and expansion rate of the universe we inhabit. Due to recent observational, theoretical, and computational advances, these parameters are now known to approximately 10% accuracy, and new efforts are underway to increase precision tenfold. It is found that we live in a spatially flat, dark matter-dominated universe whose rate of expansion is accelerating due to an unseen, unknown dark energy field. Baryons—the stuff of stars, galaxies, and us—account for only 4% of the total mass-energy inventory. And yet, it is through the astronomical study of baryons that we infer the rest. In this talk I will highlight the important role advanced scientific computing has played in getting us to the concordance model, and also the computational discoveries that have been made about the history of cosmic baryons using hydrodynamical cosmological simulations. I will conclude by discussing the central role that very large scale simulations of cosmological structure formation will play in deciphering the results of upcoming dark energy surveys.

370
The following article is Open access

There is a growing body of evidence that core collapse supernova explosions are inherently asymmetric. This asymmetry may originate in the first few hundred milliseconds after core collapse, when the nascent shock wave is susceptible to the spherical accretion shock instability, a dynamical instability discovered through computer simulations by the SciDAC-funded Terascale Supernova Initiative.

This work was followed up by large-scale 3D simulations enabled by the application of various high-performance computing technologies including networking and visualization. Recent 3D simulations have identified a vigorous non-axisymmetric mode of this supernova shock wave that can impart a significant amount of angular momentum to the underlying neutron star.

380
The following article is Open access

Understanding the explosion mechanism of core collapse supernovae is a problem that has plagued nuclear astrophysicists since the first computational models of this phenomenon were carried out in the 1960s. Our current theories of this violent phenomenon center on multi-dimensional effects involving radiation-hydrodynamic flows of hot, dense matter and neutrinos. Modeling these multi-dimensional radiative flows presents a computational challenge that will continue to stress high-performance computing beyond the teraflop scale to the petaflop level. In this paper we describe a few of the scientific discoveries that we have made via terascale computational simulations of supernovae under the auspices of the SciDAC-funded Terascale Supernova Initiative.

390
The following article is Open access

The computational difficulty of six-dimensional neutrino radiation hydrodynamics has spawned a variety of approximations, provoking a long history of uncertainty in the core-collapse supernova explosion mechanism. Under the auspices of the Terascale Supernova Initiative, we are honoring the physical complexity of supernovae by meeting the computational challenge head-on, undertaking the development of a new adaptive mesh refinement code for self-gravitating, six-dimensional neutrino radiation magnetohydrodynamics. This code, called GenASiS (General Astrophysical Simulation System), is designed for modularity and extensibility of the physics. Presently in use or under development are capabilities for Newtonian self-gravity, Newtonian and special relativistic magnetohydrodynamics (with 'realistic' equation of state), and special relativistic energy- and angle-dependent neutrino transport, including full treatment of the energy and angle dependence of scattering and pair interactions.

395
The following article is Open access

We describe a code development and integration effort aimed at producing a numerical tool suitable for exploring the effects of stellar rotation and magnetic fields in a neutrino-driven core-collapse supernova environment. A one-dimensional, multi-energy group, flux-limited neutrino diffusion module (MGFLD) has been integrated with ZEUS-MP, a multidimensional, parallel gas hydrodynamics and magnetohydrodynamics code. With the neutrino diffusion module, ZEUS-MP can simulate the core-collapse, bounce, and explosion of a stellar progenitor in two and three space dimensions in which multidimensional hydrodynamics are coupled to 1-D neutrino transport in a ray-by-ray approximation. This paper describes the physics capabilities of the code and the technique for implementing the serial MGFLD module for parallel execution in a multi-dimensional simulation. Because the development and debugging of the integrated code is not yet complete, we provide a current status report of the effort and identify outstanding issues currently under investigation.

400
The following article is Open access

Core-collapse supernovae are among the most energetic explosions in the Universe and the predominant source of chemical elements heavier than carbon. They drive matter to extremes of density and temperature that cannot be produced on Earth and serve as laboratories for fundamental physics. Understanding these events requires the partnership of nuclear and particle physicists with astrophysicists and computational scientists. A case in point is the study of the role that captures of electrons and neutrinos on atomic nuclei play during the collapse of the core of a massive star as it reaches the end of its life. In addition to the astrophysical modeling, the required understanding of the structure of the atomic nucleus is itself a terascale application. We will discuss recent work which shows that a realistic treatment of electron capture on these nuclei causes significant changes in the hydrodynamics of core collapse and bounce, which set the stage for the subsequent evolution of the supernova.

405
The following article is Open access

We extend a low Mach number hydrodynamics method developed for terrestrial combustion to the study of thermonuclear flames in Type Ia supernovae. We discuss the differences between 2-D and 3-D Rayleigh-Taylor unstable flame simulations and give detailed diagnostics on the turbulence, showing that the kinetic energy power spectrum obeys Bolgiano-Obukhov statistics in 2-D but Kolmogorov statistics in 3-D. Preliminary results from 3-D reacting bubble calculations are shown, and their implications for ignition are discussed.
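
For reference, the two statistics correspond to distinct kinetic energy spectra (standard scaling laws, not results quoted from the paper): E(k) ∝ k^(−5/3) for Kolmogorov turbulence and E(k) ∝ k^(−11/5) for Bolgiano-Obukhov scaling, the latter arising when buoyancy dominates the cascade, as it does in the two-dimensional Rayleigh-Taylor unstable flames.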

ENABLING SCIENTIFIC DISCOVERY

APPLIED MATHEMATICS

410
The following article is Open access

Many plasma physics simulation applications utilize an unstructured finite element method (FEM) discretization of an elliptic operator of the form −∇²u + αu = f; when α is equal to zero a "pure" Laplacian or Poisson equation results, and when α is greater than zero a Helmholtz equation is produced. Discretized equations of this form, often 2D, occur in many tokamak fusion plasma, or burning plasma, applications, from MHD to gyrokinetic codes. This report investigates the performance characteristics of basic classes of linear solvers (i.e., direct, one-level iterative, and multilevel iterative methods) on 2D unstructured FEM problems of the form −∇²u + αu = f, with both α = 0 and α ≠ 0. The purpose of this work is to provide computational physicists with guidelines on appropriate linear solvers for their problems via detailed performance analysis, in terms of both the scalability and the constants in the solution times. We show, as expected, almost perfect scalability of multilevel methods and quantify the solution costs on a common computational platform, the IBM SP Power3.
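
As a minimal illustration of this problem class (a finite-difference rather than finite-element discretization, on a structured grid purely for concreteness; the grid size, α, and right-hand side are arbitrary choices):

    # Sketch: solve -∇²u + αu = f on the unit square (Dirichlet boundaries)
    # with a 5-point finite-difference discretization and a one-level
    # iterative method (conjugate gradients).
    import numpy as np
    import scipy.sparse as sp
    import scipy.sparse.linalg as spla

    n = 64                      # interior grid points per direction (illustrative)
    h = 1.0 / (n + 1)
    alpha = 1.0                 # alpha = 0 gives the Poisson case

    # 1D second-difference operator; 2D Laplacian via Kronecker products
    T = sp.diags([-1, 2, -1], [-1, 0, 1], shape=(n, n)) / h**2
    I = sp.identity(n)
    A = (sp.kron(I, T) + sp.kron(T, I) + alpha * sp.identity(n * n)).tocsr()

    f = np.ones(n * n)          # arbitrary right-hand side
    u, info = spla.cg(A, f)
    print("converged" if info == 0 else f"cg returned info = {info}")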

425
The following article is Open access

Large-scale eigenvalue problems arise in a number of DOE applications. This paper provides an overview of recent developments in eigenvalue computation in the context of two SciDAC applications. We emphasize the importance of Krylov subspace methods and point out their limitations. We discuss the value of alternative approaches that are more amenable to the use of preconditioners, and report progress on using multi-level algebraic sub-structuring techniques to speed up eigenvalue calculations. In addition to methods for linear eigenvalue problems, we also examine new approaches to solving two types of non-linear eigenvalue problems arising from SciDAC applications.
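
For instance, a Krylov (Lanczos) computation of a few eigenpairs of a large sparse symmetric matrix might look like the following sketch (the matrix here is a generic discrete Laplacian, not one of the SciDAC application matrices):

    # Sketch: a few smallest eigenpairs of a sparse symmetric matrix via ARPACK's
    # Lanczos implementation, using shift-invert to target eigenvalues near zero.
    import scipy.sparse as sp
    import scipy.sparse.linalg as spla

    n = 200
    T = sp.diags([-1, 2, -1], [-1, 0, 1], shape=(n, n))
    I = sp.identity(n)
    A = (sp.kron(I, T) + sp.kron(T, I)).tocsc()   # 2D Laplacian, symmetric positive definite

    # shift-invert requires factoring (A - sigma*I); this is exactly where
    # preconditioning and sub-structuring ideas become important at scale
    vals, vecs = spla.eigsh(A, k=6, sigma=0.0, which='LM')
    print(vals)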

435
The following article is Open access

We formulate the problem of designing the low-loss cavity for the International Linear Collider (ILC) as an electromagnetic shape optimization problem involving a Maxwell eigenvalue problem. The objective is to maximize the stored energy of a trapped mode in the end cell while maintaining a specified frequency corresponding to the accelerating mode. A continuous adjoint method is presented for computation of the design gradient of the objective and constraint. The gradients are used within a nonlinear optimization scheme to compute the optimal shape for a simplified model of the ILC in a small multiple of the cost of solving the Maxwell eigenvalue problem.
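
The flavor of the gradient computation can be indicated by the standard sensitivity relation for a generalized eigenproblem (a textbook identity in generic notation, not the paper's continuous adjoint formulation): for K(p)x = λM(p)x with the eigenvector normalized so that xᵀMx = 1, one has ∂λ/∂p = xᵀ(∂K/∂p − λ ∂M/∂p)x. A single eigensolve plus derivatives of the discretized operators with respect to the shape parameters p therefore yields the full design gradient, which is why the optimization costs only a small multiple of one eigenvalue solve.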

446
The following article is Open access

As part of the Department of Energy's applications-oriented SciDAC project, three model problems have been proposed by the Center for Extended Magnetohydrodynamics Modeling to test the potential of numerical algorithms for the challenging magnetohydrodynamics (MHD) problems that are required for future fusion development. The first of these, anisotropic diffusion in a toroidal geometry, is considered in this note.
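
The model problem can be stated compactly (a standard form of strongly anisotropic heat conduction along the magnetic field, written in generic notation rather than quoted from the note): ∂T/∂t = ∇·[κ_∥ b(b·∇T) + κ_⊥(∇T − b(b·∇T))], where b is the unit vector along the magnetic field and the ratio κ_∥/κ_⊥ is many orders of magnitude greater than one in fusion-relevant regimes, which is what makes accurate discretization in toroidal geometry so demanding.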

456
The following article is Open access

In this paper, we highlight new multigrid solver advances in the Terascale Optimal PDE Simulations (TOPS) project in the Scientific Discovery Through Advanced Computing (SciDAC) program. We discuss two new algebraic multigrid (AMG) developments in TOPS: the adaptive smoothed aggregation method (αSA) and a coarse-grid selection algorithm based on compatible relaxation (CR). The αSA method is showing promising results in initial studies for Quantum Chromodynamics (QCD) applications. The CR method has the potential to greatly improve the applicability of AMG.
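
As a rough illustration of smoothed-aggregation AMG in practice (using the open-source PyAMG package as a stand-in; this is not the αSA code developed in TOPS, and the test matrix is a generic Poisson problem):

    # Sketch: smoothed-aggregation algebraic multigrid as a preconditioned solver.
    import numpy as np
    import pyamg

    A = pyamg.gallery.poisson((200, 200), format='csr')   # 2D Poisson test matrix
    b = np.random.rand(A.shape[0])

    ml = pyamg.smoothed_aggregation_solver(A)              # build the AMG hierarchy
    x = ml.solve(b, tol=1e-8, accel='cg')                  # AMG-preconditioned CG
    print(np.linalg.norm(b - A @ x))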

461
The following article is Open access

We describe some recent mathematical results in constructing computational methods that lead to the development of fast and accurate multiresolution numerical methods for solving problems in computational chemistry (the so-called multiresolution quantum chemistry).

Using low-separation-rank representations of functions and operators together with representations in multiwavelet bases, we have developed a multiscale solution method for integral and differential equations and for integral transforms. The Poisson equation, the Schrödinger equation, and the projector onto divergence-free functions provide important examples with a wide range of applications in computational chemistry, computational electromagnetics, and fluid dynamics.

These ideas, including adaptive representation of functions and operators in the multiwavelet basis and low-separation-rank approximation of operators and functions, have been implemented in a software package called Multiresolution Adaptive Numerical Evaluation for Scientific Simulation (MADNESS).
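
The low-separation-rank idea can be stated compactly (generic notation introduced here for illustration): a d-dimensional function or operator kernel is approximated by a short sum of products of one-dimensional factors, f(x_1, ..., x_d) ≈ Σ_{l=1..r} s_l f_l^(1)(x_1) ⋯ f_l^(d)(x_d), with the separation rank r kept small, so that the cost of applying operators grows roughly linearly rather than exponentially with dimension; combined with adaptive multiwavelet bases, this is what makes guaranteed-precision calculations practical.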

466
The following article is Open access

Automatic differentiation is a technique for transforming a program or subprogram that computes a function, including arbitrarily complex simulation codes, into one that computes the derivatives of that function. We describe the implementation and application of automatic differentiation tools. We highlight recent advances in the combinatorial algorithms and compiler technology that underlie successful implementation of automatic differentiation tools. We discuss applications of automatic differentiation in design optimization and sensitivity analysis. We also describe ongoing research in the design of language-independent source transformation infrastructures for automatic differentiation algorithms.
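
The core idea can be illustrated with a minimal forward-mode example (dual numbers in Python; a didactic sketch, not the source-transformation approach of the tools described in the paper):

    # Sketch: forward-mode automatic differentiation with dual numbers.
    import math

    class Dual:
        # carries a value and its derivative through arithmetic
        def __init__(self, val, der=0.0):
            self.val, self.der = val, der

        def __add__(self, other):
            other = other if isinstance(other, Dual) else Dual(other)
            return Dual(self.val + other.val, self.der + other.der)
        __radd__ = __add__

        def __mul__(self, other):
            other = other if isinstance(other, Dual) else Dual(other)
            # product rule
            return Dual(self.val * other.val,
                        self.der * other.val + self.val * other.der)
        __rmul__ = __mul__

    def sin(x):
        # chain rule for an elementary function
        return Dual(math.sin(x.val), math.cos(x.val) * x.der)

    def f(x):
        return x * x + 3.0 * sin(x)      # arbitrary example function

    x = Dual(1.5, 1.0)                   # seed dx/dx = 1
    y = f(x)
    print(y.val, y.der)                  # f(1.5) and f'(1.5) = 2*1.5 + 3*cos(1.5)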

471
The following article is Open access

We introduce the FronTier-Lite software package and its adaptation to the TSTT geometry and mesh entity data interface. This package is extracted from the original front tracking code for general-purpose scientific and engineering applications. The package contains a static interface library and a dynamic front propagation library, and it can be applied to a range of scientific problems. We demonstrate the application of FronTier in simulations of a fuel injection jet, fusion pellet injection, and fluid mixing problems.

476
The following article is Open access

Sparse systems of linear equations and eigen-equations arise at the heart of many large-scale, vital simulations in DOE. Examples include the Accelerator Science and Technology SciDAC (Omega3P code, electromagnetic problem) and the Center for Extended Magnetohydrodynamic Modeling SciDAC (NIMROD and M3D-C1 codes, fusion plasma simulation). The Terascale Optimal PDE Simulations (TOPS) project is providing high-performance sparse direct solvers, which have had significant impacts on these applications. Over the past several years, we have been working closely with the other SciDAC teams to solve their large, sparse matrix problems arising from discretization of the partial differential equations. Most of these systems are very ill-conditioned, resulting in extremely poor convergence (sometimes no convergence) for many iterative solvers. We have successfully deployed our direct-methods techniques in these applications, achieving significant scientific results as well as performance gains. These successes were made possible through the SciDAC model of computer scientists and application scientists working together to take full advantage of terascale computing systems and new algorithms research.
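
As a small illustration of a sparse direct solve (using SciPy's SuperLU interface as a convenient stand-in; SuperLU is in the same family as the solvers deployed in this work, but the matrix and settings here are purely illustrative):

    # Sketch: factor a sparse matrix once with sparse LU, then solve cheaply.
    import numpy as np
    import scipy.sparse as sp
    import scipy.sparse.linalg as spla

    n = 500
    # arbitrary nonsymmetric sparse test matrix, shifted to be comfortably nonsingular
    A = (sp.random(n, n, density=0.01, random_state=0) + 5.0 * sp.identity(n)).tocsc()
    b = np.ones(n)

    lu = spla.splu(A)        # factorization (fill-reducing ordering chosen internally)
    x = lu.solve(b)          # forward/back substitution; reusable for many right-hand sides
    print(np.linalg.norm(A @ x - b))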

481
The following article is Open access

We seek to improve on the conventional FFT-based algorithms for solving the Poisson equation with infinite-domain (open) boundary conditions for large problems in accelerator modeling and related areas. In particular, improvements in both accuracy and performance are possible by combining several technologies: the method of local corrections (MLC); the James algorithm; and adaptive mesh refinement (AMR).

The MLC enables the parallelization (by domain decomposition) of problems with large domains and many grid points. This improves on the FFT-based Poisson solvers typically used because it does not require the all-to-all communication pattern that parallel 3D FFT algorithms require, which tends to be a performance bottleneck on current (and foreseeable) parallel computers. In initial tests, good scalability up to 1000 processors has been demonstrated for our new MLC solver. An essential component of our approach is a new version of the James algorithm for infinite-domain boundary conditions in three dimensions. By using a simplified version of the fast multipole method in the boundary-to-boundary potential calculation, we improve on the performance of the Hockney algorithm typically used, reducing the number of grid points by a factor of 8 and the CPU costs by a factor of 3. This is particularly important for large problems where computer memory limits are a consideration.

The MLC allows for the use of adaptive mesh refinement, which reduces the number of grid points and increases the accuracy in the Poisson solution. This improves on the uniform grid methods typically used in PIC codes, particularly in beam problems where the halo is large. Also, the number of particles per cell can be controlled more closely with adaptivity than with a uniform grid.

Using AMR with particles is more complicated than using a uniform grid: it affects how particles are deposited on the non-uniform grid, how particles are reassigned when the adaptive grid changes, and how load balance is maintained between processors as grids and particles move. New algorithms and software are being developed to solve these problems efficiently.

We are using the Chombo AMR software framework as the basis for this work.
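
For orientation, the doubled-domain FFT approach that this work improves upon can be sketched as follows (an illustrative free-space Poisson solve; grid size, source, and the handling of the Green's function singularity are arbitrary choices made for the example):

    # Sketch: FFT-based open-boundary Poisson solve on a uniform grid, in the spirit
    # of the Hockney algorithm (domain doubled in each direction to avoid periodic images).
    import numpy as np

    n = 32
    h = 1.0 / n
    rho = np.zeros((n, n, n))
    rho[n // 2, n // 2, n // 2] = 1.0 / h**3       # unit charge as a grid delta

    m = 2 * n                                       # doubled domain
    idx = np.arange(m)
    d = np.minimum(idx, m - idx) * h                # separation along one axis
    dx, dy, dz = np.meshgrid(d, d, d, indexing='ij')
    r = np.sqrt(dx**2 + dy**2 + dz**2)
    G = np.zeros_like(r)
    G[r > 0] = 1.0 / (4.0 * np.pi * r[r > 0])       # free-space Green's function
    G[0, 0, 0] = 1.0 / (4.0 * np.pi * 0.5 * h)      # crude regularization of the r = 0 cell

    rho_pad = np.zeros((m, m, m))
    rho_pad[:n, :n, :n] = rho

    phi = np.fft.irfftn(np.fft.rfftn(G) * np.fft.rfftn(rho_pad), s=(m, m, m))
    phi = phi[:n, :n, :n] * h**3                    # potential on the original grid
    # far from the source, phi should approach 1/(4*pi*r)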

486
The following article is Open access

Software components for representing and evaluating geometry (TSTTG/CGM) and finite element mesh (TSTTM/MOAB), and a higher-level component for relations between the two (TSTTR/LASSO), have been combined with electromagnetic modelling and optimization techniques, to form a SciDAC shape optimization application. The TSTT data model described in this paper allows components involved in the shape optimization application to be coupled at a variety of levels, from coarse black-box coupling (e.g. to generate a model accelerator cavity using TSTTG) to very fine-grained coupling (e.g. smoothing individual mesh elements based in part on geometric surface normals at mesh vertices). Despite this flexibility, the TSTT data model uses only four fundamental data types (entities, sets, tags, and the interface object itself). We elaborate on the design and implementation of effective components in the context of this application, and show how our simple but flexible data model facilitates these efforts.
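
To give a feel for that data model, here is a hypothetical, much-simplified Python rendering of the four fundamental types (the real TSTT interfaces are language-independent APIs, so every name below is illustrative rather than taken from the specification):

    # Sketch: a toy mesh data model built from entities, entity sets, tags,
    # and an interface object that owns them.
    from dataclasses import dataclass, field

    @dataclass(frozen=True)
    class Entity:
        # a single mesh object: vertex, edge, face, or region
        handle: int
        dimension: int          # 0 = vertex, 1 = edge, 2 = face, 3 = region

    @dataclass
    class EntitySet:
        # an arbitrary grouping of entities (e.g. a boundary or a material block)
        members: set = field(default_factory=set)

    class Interface:
        # the interface object owns entities, sets, and named tags
        def __init__(self):
            self.entities, self.sets, self.tags = {}, {}, {}

        def create_entity(self, handle, dimension):
            self.entities[handle] = Entity(handle, dimension)
            return self.entities[handle]

        def create_set(self, name):
            self.sets[name] = EntitySet()
            return self.sets[name]

        def set_tag(self, tag, handle, value):
            self.tags.setdefault(tag, {})[handle] = value

        def get_tag(self, tag, handle):
            return self.tags[tag][handle]

    # attach a surface normal to a vertex and group it into a boundary set,
    # roughly the kind of information a mesh-smoothing step would consume
    mesh = Interface()
    v = mesh.create_entity(handle=0, dimension=0)
    mesh.create_set("cavity_wall").members.add(v.handle)
    mesh.set_tag("surface_normal", v.handle, (0.0, 0.0, 1.0))
    print(mesh.get_tag("surface_normal", 0))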

COMPUTER SCIENCE

491
The following article is Open access

The SciDAC program of the Department of Energy has brought together tremendous scientific expertise and computing resources to realize the promise of terascale computing in attempting to answer some of the most important basic science questions. Scientific visualization is an indispensable path to gleaning insight from the massive data produced by terascale simulations. Unless the visualization challenges presented by terascale simulations are adequately addressed, the value of conducting these immense and costly simulations will not be fully realized. In this paper, we introduce several key visualization technologies that address the critical needs of many SciDAC scientists, in application areas ranging from accelerator simulations, earthquake modeling, plasma physics, and supernova modeling to turbulent combustion simulations.

501
The following article is Open access

The data access needs of computational science applications continue to grow, both in terms of the I/O rates necessary to match compute capabilities and in terms of the features required of I/O systems. In particular, wide-area access to data and the movement of data between systems have become priorities.

Key achievements in I/O for computational science have enabled many applications to use I/O resources effectively. However, growing application requirements challenge I/O system developers to create solutions that will make I/O systems easier to use, improve performance, and increase manageability. In this work we outline the recent achievements and current status of I/O systems for computational science. We then enumerate key challenges for I/O systems in the near future and discuss ongoing efforts that address these challenges.

510
The following article is Open access

Fusion energy science, like other science areas in DOE, is becoming increasingly data intensive and network distributed. We discuss data management techniques that are essential for scientists making discoveries from their simulations and experiments, with special focus on the techniques and support that Fusion Simulation Project (FSP) scientists may need. However, the discussion applies to a broader audience, since most of the fusion SciDACs and FSP proposals include a strong data management component. Simulations on ultrascale computing platforms imply an ability to efficiently integrate and network heterogeneous components (computation, storage, networks, codes, etc.), and to move large amounts of data over large distances. We discuss the workflow categories needed to support such research, as well as the automation and other aspects that can allow an FSP scientist to focus on the science and spend less time tending information technology.

521
The following article is Open access

The Optimized Sparse Kernel Interface (OSKI) is a collection of low-level primitives that provide automatically tuned computational kernels on sparse matrices, for use by solver libraries and applications. These kernels include sparse matrix-vector multiply and sparse triangular solve, among others. The primary aim of this interface is to hide the complex decision-making process needed to tune the performance of a kernel implementation for a particular user's sparse matrix and machine, while also exposing the steps and potentially non-trivial costs of tuning at run time. This paper provides an overview of OSKI, which is based on our research on automatically tuned sparse kernels for modern cache-based superscalar machines.
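
The kind of run-time decision OSKI automates can be sketched in a few lines (a toy illustration using SciPy's sparse formats rather than OSKI's actual C API; the format choices and timing loop stand in for OSKI's far more sophisticated heuristics and specialized kernels):

    # Sketch: pick the faster sparse storage format for matrix-vector multiply at run time.
    import time
    import numpy as np
    import scipy.sparse as sp

    A = sp.random(20000, 20000, density=5e-4, format='csr', random_state=0)
    x = np.random.rand(A.shape[1])

    def bench(mat, reps=50):
        t0 = time.perf_counter()
        for _ in range(reps):
            mat @ x
        return time.perf_counter() - t0

    candidates = {'csr': A, 'bsr': A.tobsr(blocksize=(1, 1)), 'csc': A.tocsc()}
    timings = {name: bench(mat) for name, mat in candidates.items()}
    best = min(timings, key=timings.get)
    print(timings, '-> using', best)
    A_tuned = candidates[best]      # subsequent multiplies use the tuned choice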

531
The following article is Open access

The absence of an adequate distributed storage infrastructure for data buffering has become a significant impediment to the flow of work in the wide-area, data-intensive collaborations that are increasingly characteristic of leading-edge research in several fields. One solution to this problem, pioneered under DOE's SciDAC program, is Logistical Networking, which provides a framework for a globally scalable, maximally interoperable storage network based on the Internet Backplane Protocol (IBP). This paper provides a brief overview of the Logistical Networking (LN) architecture, the middleware developed to exploit its value, and a few of the applications that research communities have made of it.

536
The following article is Open access

Recent advances in both computational hardware and multidisciplinary science have given rise to an unprecedented level of complexity in scientific simulation software. This paper describes an ongoing grass roots effort aimed at addressing complexity in high-performance computing through the use of Component-Based Software Engineering (CBSE). Highlights of the benefits and accomplishments of the Common Component Architecture (CCA) Forum and SciDAC ISIC are given, followed by an illustrative example of how the CCA has been applied to drive scientific discovery in quantum chemistry. Thrusts for future research are also described briefly.

541
The following article is Open access

Large-scale science computations and experiments require unprecedented network capabilities in the form of large bandwidth and dynamically stable connections to support data transfers, interactive visualizations, and monitoring and steering operations. A number of component technologies dealing with the infrastructure, provisioning, transport, and application mappings must be developed and/or optimized to achieve these capabilities. We present a brief account of the following technologies that contribute toward achieving these network capabilities: (a) the DOE UltraScienceNet and NSF CHEETAH network testbeds, which provide on-demand and scheduled dedicated network connections; (b) experimental results on transport protocols that achieve close to 100% utilization on dedicated 1 Gbps wide-area channels; (c) a scheme for optimally mapping a visualization pipeline onto a network to minimize end-to-end delays; and (d) interconnect configurations and protocols that provide multiple Gbps flows from the Cray X1 to external hosts.

546
The following article is Open access

The growth in computing resources at scientific computing centers has created new challenges for system software. These multi-teraflop systems often exceed the capabilities of the system software and require new approaches to accommodate their large processor counts. The costs associated with development and maintenance of this software are also significant impediments, compounded by a lack of interoperability due to site-specific enhancements. The Scalable System Software project seeks to address these issues through a component-based approach to system software development. An overview of this design and the benefits of such an approach will be discussed in this paper.

551
The following article is Open access

The performance of the Eulerian gyrokinetic-Maxwell solver code GYRO is analyzed on five high performance computing systems. First, a manual approach is taken, using custom scripts to analyze the output of embedded wallclock timers, floating point operation counts collected using hardware performance counters, and traces of user and communication events collected using the profiling interface to Message Passing Interface (MPI) libraries. Parts of the analysis are then repeated or extended using a number of sophisticated performance analysis tools: IPM, KOJAK, SvPablo, TAU, and the PMaC modeling tool suite. The paper briefly discusses what has been discovered via this manual analysis process, what performance analyses are inconvenient or infeasible to attempt manually, and to what extent the tools show promise in accelerating or significantly extending the manual performance analyses.

556
The following article is Open access

FastBit is a software tool for searching large read-only datasets. It organizes user data in a column-oriented structure that is efficient for on-line analytical processing (OLAP), and utilizes compressed bitmap indices to further speed up query processing. Analyses have proven the compressed bitmap index used in FastBit to be theoretically optimal for one-dimensional queries. Compared with other optimal indexing methods, bitmap indices are superior because they can be efficiently combined to answer multi-dimensional queries, whereas other optimal methods cannot. In this paper, we first describe the searching capability of FastBit and then briefly highlight two applications that make extensive use of it, namely Grid Collector and DEX.
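
The idea behind bitmap indices can be illustrated in a few lines (an uncompressed toy example with invented attribute names; FastBit's actual indices add word-aligned compression and binning):

    # Sketch: a toy bitmap index answering a multi-dimensional query with bitwise ANDs.
    import numpy as np

    # a small "read-only dataset" with two attributes
    energy = np.array([1, 3, 2, 3, 1, 2, 3, 1])
    flag   = np.array([0, 1, 1, 0, 1, 0, 1, 1])

    # one bitmap (boolean column) per distinct attribute value
    energy_index = {int(v): energy == v for v in np.unique(energy)}
    flag_index   = {int(v): flag == v for v in np.unique(flag)}

    # query: energy == 3 AND flag == 1, answered by combining precomputed bitmaps
    hits = energy_index[3] & flag_index[1]
    print(np.nonzero(hits)[0])      # row ids satisfying both conditions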

SCIENCE GRIDS AND COLLABORATORIES

561
The following article is Open access

Active Thermochemical Tables (ATcT) are a good example of a significant breakthrough in chemical science that is directly enabled by the US DOE SciDAC initiative. ATcT is a new paradigm for obtaining accurate, reliable, and internally consistent thermochemistry and overcoming the limitations that are intrinsic to the traditional sequential approach to thermochemistry. The availability of high-quality, consistent thermochemical values is critical in many areas of chemistry, including the development of realistic predictive models of complex chemical environments such as combustion or the atmosphere, and the development and improvement of sophisticated high-fidelity electronic structure computational treatments. As opposed to the traditional sequential evolution of thermochemical values for the chemical species of interest, ATcT utilizes the Thermochemical Network (TN) approach. This approach explicitly exposes the maze of inherent interdependencies normally ignored by the conventional treatment, and allows, inter alia, a statistical analysis of the individual measurements that define the TN. The end result is the extraction of the best possible thermochemistry, based on optimal use of all the currently available knowledge, hence making conventional tabulations of thermochemical values obsolete. Moreover, ATcT offers a number of additional features that are neither present nor possible in the traditional approach. With ATcT, new knowledge can be painlessly propagated through all affected thermochemical values. ATcT also allows hypothesis testing and evaluation, as well as discovery of weak links in the TN. The latter provides pointers to new experimental or theoretical determinations that can most efficiently improve the underlying thermochemical body of knowledge.
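
The statistical idea behind a thermochemical network can be conveyed by a toy example (hypothetical species and measurement values, used only to illustrate the simultaneous weighted least-squares adjustment; the actual ATcT analysis is considerably more sophisticated):

    # Sketch: solve an over-determined network of reaction-enthalpy "measurements"
    # for internally consistent species enthalpies by weighted least squares.
    import numpy as np

    # unknowns: enthalpies of formation of species A, B, C
    # each row i encodes a measurement: sum_j M[i, j] * H[j] = d[i] +/- sigma[i]
    M = np.array([[ 1.0,  0.0,  0.0],    # H(A) measured directly
                  [-1.0,  1.0,  0.0],    # reaction A -> B
                  [ 0.0, -1.0,  1.0],    # reaction B -> C
                  [-1.0,  0.0,  1.0]])   # reaction A -> C (redundant, enforces consistency)
    d     = np.array([10.0, 5.2, 3.1, 8.0])
    sigma = np.array([0.5, 0.2, 0.2, 0.4])

    # weight each equation by 1/sigma and solve in one simultaneous adjustment
    w = 1.0 / sigma
    H, *_ = np.linalg.lstsq(M * w[:, None], d * w, rcond=None)
    print(H)      # internally consistent H(A), H(B), H(C)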

571
The following article is Open access

A particularly demanding and important challenge that we face as we attempt to construct the distributed computing machinery required to support SciDAC goals is the efficient, high-performance, reliable, secure, and policy-aware management of large-scale data movement. This problem is fundamental to diverse application domains including experimental physics (high energy physics, nuclear physics, light sources), simulation science (climate, computational chemistry, fusion, astrophysics), and large-scale collaboration. In each case, highly distributed user communities require high-speed access to valuable data, whether for visualization or analysis. The quantities of data involved (terabytes to petabytes), the scale of the demand (hundreds or thousands of users, data-intensive analyses, real-time constraints), and the complexity of the infrastructure that must be managed (networks, tertiary storage systems, network caches, computers, visualization systems) make the problem extremely challenging. Data management tools developed under the auspices of the SciDAC Data Grid Middleware project have become the de facto standard for data management in projects worldwide. Day in and day out, these tools provide the "plumbing" that allows scientists to do more science on an unprecedented scale in production environments.

576
The following article is Open access

The heaviest known fermion, the top quark, was discovered at Fermilab in the first run of the Tevatron in 1995. Beyond establishing its existence, however, one needs to measure its properties precisely in order to verify or falsify the predictions of the Standard Model. With the top quark's extremely high mass and short lifetime, such measurements probe as-yet-unexplored regions of the theory and bring us closer to answering open fundamental questions about our universe of elementary particles, such as why three families of quarks and leptons exist and why their masses differ so dramatically.

To perform these measurements, hundreds of millions of recorded proton-antiproton collisions must be reconstructed and filtered to extract the few top quarks produced. Simulated background and signal events with full detector response need to be generated and reconstructed to validate and understand the results. Since the start of the second run of the Tevatron, the DØ collaboration has brought Grid computing to its aid for the production of simulated events. Data processing on the Grid has recently been added as well, enabling us to effectively triple the amount of data available with the highest-quality reconstruction methods.

We will present recent top quark results that DØ obtained from these improved data and explain how they benefited from the availability of computing resources on the Grid.

586
The following article is Open access

Fusion energy research aims to develop an economically and environmentally sustainable energy system. The tokamak, a doughnut shaped plasma confined by magnetic fields generated by currents flowing in external coils and the plasma, is a leading concept. Advanced Tokamak (AT) research in the DIII-D tokamak seeks to provide a scientific basis for steady-state high performance operation. This necessitates replacing the inherently pulsed inductive method of driving plasma current. Our approach emphasizes high pressure to maximize fusion gain while maximizing the self-driven bootstrap current, along with external current profile control. This requires integrated, simultaneous control of many characteristics of the plasma with a diverse set of techniques. This has already resulted in noninductive conditions being maintained at high pressure on current relaxation timescales. A high degree of physical understanding is facilitated by a closely coupled integrated modelling effort. Simulations are used both to plan and interpret experiments, making possible continued development of the models themselves. An ultimate objective is the capability to predict behaviour in future AT experiments. Analysis of experimental results relies on use of the TRANSP code via the FusionGrid, and our use of the FusionGrid will increase as additional analysis and simulation tools are made available.

591
The following article is Open access

ESnet provides authentication services and trust federation support for SciDAC projects, collaboratories, and other distributed computing applications. The ESnet ATF team operates the DOEGrids Certificate Authority, available to all DOE Office of Science programs, plus several custom CAs, including one for the National Fusion Collaboratory and one for NERSC. The secure hardware and software environment developed to support CAs is suitable for supporting additional custom authentication and authorization applications that your program might require. Seamless, secure interoperation across organizational and international boundaries is vital to collaborative science. We are fostering the development of international PKI federations by founding the TAGPMA, the American regional PMA, and the worldwide IGTF Policy Management Authority (PMA), as well as participating in European and Asian regional PMAs. We are investigating and prototyping distributed authentication technology that will allow us to support the "roaming scientist" (distributed wireless via eduroam), as well as more secure authentication methods (one-time password tokens).

596
The following article is Open access

Computational scientists often develop large models and codes intended to be used by larger user communities or for repetitive tasks such as parametric studies. Lowering the barrier of entry for access to these codes is often a technical and sociological challenge. Portals help bridge the gap because they are well-known interfaces enabling access to a large variety of resources, services, applications, and tools for private, public, and commercial entities, while hiding the complexities of the underlying software systems from the user. This paper presents an overview of the current state of the art in grid portals, based on a component approach that utilizes portlet frameworks and the most recent Grid standards, such as the Web Services Resource Framework, together with a summary of current DOE portal efforts.

601
The following article is Open access

SciDAC has invested heavily in climate change research. We offer a candid opinion as to the impact of the DOE laboratories' SciDAC projects on the upcoming Fourth Assessment Report of the Intergovernmental Panel on Climate Change.

CLOSING REMARKS

606
The following article is Open access

Closing remarks and a review of the SciDAC 2005 conference by the Director of the Scientific Discovery through Advanced Computing Research Program of the US Department of Energy.