Abstract
The growing power of parallel supercomputers gives scientists the ability to simulate more complex problems at higher fidelity, leading to many high-impact scientific advances. To make full use of the vast amounts of data generated by these simulations, scientists also need scalable solutions for studying their data at different extents and levels of abstraction. As we move into peta- and exa-scale computing, simply dumping as much raw simulation data as storage capacity allows, and deferring analysis and visualization to a post-processing step, is no longer a viable approach. A common practice is to use a separate parallel computer to prepare data for subsequent analysis and visualization. A naive realization of this strategy not only limits the amount of data that can be saved, but also turns I/O into a performance bottleneck when using a large parallel system. We conjecture that the most plausible solution to the peta- and exa-scale data problem is to reduce or transform the data in situ, as it is being generated, so that the amount of data that must be transferred over the network is kept to a minimum. In this paper, we discuss different approaches to in-situ processing and visualization, as well as the results of our preliminary study using large-scale simulation codes on massively parallel supercomputers.