Abstract
DMTCP (Distributed MultiThreaded CheckPointing) is a mature checkpoint–restart package. It operates in user space without kernel privilege, and adapts to application-specific requirements through plugins. While DMTCP has been able to checkpoint Python and IPython 'from the outside' for many years, a Python module has recently been created to support DMTCP. IPython support is included through a new DMTCP plugin. A checkpoint can be requested interactively within a Python session or under the control of a specific Python program. Further, the Python program can execute specific Python code prior to checkpoint, upon resuming (within the original process) and upon restarting (from a checkpoint image). Applications of DMTCP are demonstrated for: (i) Python-based graphics using virtual network computing (VNC), (ii) a fast/slow technique to use multiple hosts or cores to check one Cython (Behnel S et al 2011 Comput. Sci. Eng. 13 31–39) computation in parallel, and (iii) a reversible debugger, FReD, with a novel reverse-expression watchpoint feature for locating the cause of a bug.
1. Introduction
DMTCP (Distributed MultiThreaded CheckPointing) [1] is a mature user space checkpoint–restart package. Checkpoint–restart is the ability to store the current state of a running application to a checkpoint image and later restart the application from the checkpoint image. In addition to the obvious use in recovering from a system failure, some other use cases are:
- Save/restore workspace: interactive languages such as R [25] and MATLAB [18] frequently include their own 'save/restore workspace' commands.
- 'Undump' capability: programs that would otherwise have long startup times often create a custom 'dump/undump' facility. The software is then built, dumped after startup and re-built to package a 'checkpoint' along with an undump routine.
- Applications with CPU-intensive front-end and interactive analysis of results at back-end: run on high performance hosts or clusters and restart all processes on a single laptop
- Debugging: the checkpoint image is the 'ultimate bug report' for long-running processes; one uses gdb or another debugger to attach to a restarted process. For distributed processes, all processes are restarted on a single host. Reversibility can be added to an existing debugger by checkpointing the entire debugging session (along with the debugger).
One can view checkpoint–restart as a generalization of the serialization and deserialization provided by Python's built-in 'pickle' module [24]. Instead of saving an object to a file, one saves the entire Python session to a file. Checkpointing graphics in Python is also supported, by checkpointing a virtual network computing (VNC) session with Python running inside that session.
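The contrast with pickle can be made concrete with a small sketch. The example below serializes a single dictionary (the name `workspace` is ours, chosen only for illustration) and restores it. Anything not explicitly handed to `pickle.dumps`, such as open files, threads or the call stack, is not captured, which is the gap that checkpointing the whole session is meant to close.

```python
import pickle

# A hypothetical "workspace": only the objects we name explicitly are saved.
workspace = {"step": 42, "values": [1.5, 2.5, 4.0]}

# Serialize to bytes (pickle.dump would write to a file object instead).
blob = pickle.dumps(workspace)

# Deserialize into a new object with equal contents.
restored = pickle.loads(blob)

print(restored == workspace)  # same contents
print(restored is workspace)  # but a distinct object
```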
DMTCP is available as a Linux package for many popular Linux distributions. DMTCP can checkpoint Python or IPython [21] from the outside, i.e. by treating Python as a black box. To enable checkpointing, the Python interpreter is launched in the following manner:
$ dmtcp_checkpoint python
$ dmtcp_command --checkpoint
The command 'dmtcp_command' can be used at any point to create a checkpoint of the entire session.
However, most Python programmers will prefer to request a checkpoint interactively within a Python session or else programmatically from inside a Python program.
DMTCP is made accessible to Python programmers as a Python module. Hence, a checkpoint is executed as 'import dmtcp; dmtcp.checkpoint()'. This Python module provides this and other functions to support the features of DMTCP. The module for DMTCP functions equally well in IPython.
This DMTCP module implements a generalization of a saveWorkspace function, which additionally supports graphics and the distributed processes of IPython. In addition, three novel uses of DMTCP for helping debug Python are discussed.
- (1) Fast/slow computation: Cython [6] provides both traditional interpreted functions and compiled C functions. Interpreted functions are slow, but correct. Compiled functions are fast, but users sometimes declare incorrect C types, causing the compiled function to silently return a wrong answer. The idea of fast/slow computation is to run the compiled version on one computer node, while creating checkpoint images at regular intervals. Separate computer nodes are used to check each interval between checkpoints in interpreted mode. (See below for a brief description of Cython.)
- (2) FReD: a fast reversible debugger that works closely with the Python debugger (pdb) [23], as well as other Python debuggers.
- (3) Reverse expression watchpoint: this is a novel feature within the FReD reversible debugger. Assume a bug occurred in the past, and that it is associated with the point in time when a certain expression changed. The reverse expression watchpoint brings the user back to a pdb session at the step before the bug occurred.
Brief overview of IPython and Cython. The 'I' in IPython stands for 'interactive'. IPython [21] includes many shell-like features in Python. It also provides interactive support for parallel jobs, a notebook interface for publishing on the web, non-blocking graphics, and other features.
Cython [5, 6] is a superset of Python that includes an incremental compiler and a foreign function interface. The Cython interpreter is compatible with the standard Python interpreter, while also including the ability to inter-mix interpreted Python functions, compiled Python functions (compiled via Cython's Python-to-C translator), and C/C++ functions. The foreign function interface allows the three types of functions to call each other. The Python-to-C translator requires type declarations for correct translation, and so can lead to subtle bugs when used by inexperienced users.
Rest of the paper. Section 2 describes a Python module for the integration of DMTCP with Python. Section 3 describes checkpointing Python-based graphics using VNC. Section 4 describes a technique for checking Cython with multiple CPython instances. We discuss a reversible debugger for Python in section 5. Related work on checkpointing is presented in section 6, and the conclusion is presented in section 7. There are two appendices: appendix A gives background on DMTCP, and appendix B describes installing and using DMTCP and FReD.
2. DMTCP-Python integration through a Python module
A Python module, 'dmtcp.py', has been created to support checkpointing both from within an interactive Python/IPython session and programmatically from within a Python program. DMTCP has been able to asynchronously generate checkpoints of a Python session for many years. However, most users prefer the more fine-grained control of a Python programmatic interface to DMTCP. This allows one to avoid checkpointing in the middle of a communication with an external server or other atomic transaction.
2.1. A Python module to support DMTCP
Some of the features of 'dmtcp.py' are best illustrated through an example. Here, a checkpoint request is made from within the application.
import dmtcp
...
# Request a checkpoint if running under checkpoint control
dmtcp.checkpoint()
# Checkpoint image has been created
It is also easy to add pre- and post-checkpoint processing actions.
import dmtcp
...
def my_ckpt():
    # Pre-processing (my_pre_ckpt_hook is a user-defined function)
    my_pre_ckpt_hook()
    ...
    # Create checkpoint
    dmtcp.checkpoint()
    # Checkpoint image has been created
    ...
    if dmtcp.isResume():
        # The process is resuming from a checkpoint
        my_resume_hook()
        ...
    else:
        # The process is restarting from a previous checkpoint
        my_restart_hook()
        ...
    return
The function my_ckpt can be defined in the application by the user and can be called from within the user application at any point.
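The pre/resume/restart pattern can also be packaged as a reusable helper. The sketch below is a minimal illustration, not part of the DMTCP module: `checkpoint_with_hooks` and `FakeCheckpointer` are hypothetical names, and the fake object only mimics the two calls used in the text (`checkpoint()` and `isResume()`) so that the example runs without DMTCP installed. Under DMTCP, one would presumably pass the `dmtcp` module itself as the first argument.

```python
def checkpoint_with_hooks(ckpt, pre_hook, resume_hook, restart_hook):
    """Run pre_hook, request a checkpoint, then dispatch on how we came back.

    `ckpt` is any object offering checkpoint() and isResume(); using the
    dmtcp module here is an assumption of this sketch.
    """
    pre_hook()
    ckpt.checkpoint()
    if ckpt.isResume():
        resume_hook()      # same process continues after the checkpoint
        return "resume"
    restart_hook()         # process was recreated from a checkpoint image
    return "restart"


class FakeCheckpointer:
    """Stand-in for the dmtcp module so the sketch runs without DMTCP."""

    def __init__(self, resumed=True):
        self.resumed = resumed
        self.requests = 0

    def checkpoint(self):
        self.requests += 1  # a real checkpoint would write an image here

    def isResume(self):
        return self.resumed


events = []
outcome = checkpoint_with_hooks(
    FakeCheckpointer(resumed=True),
    pre_hook=lambda: events.append("pre"),
    resume_hook=lambda: events.append("resume"),
    restart_hook=lambda: events.append("restart"),
)
print(outcome, events)
```

Passing the checkpointer as an argument keeps the hook logic testable in isolation, which is useful because the restart branch is otherwise reached only after an actual restart.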
2.2. Extending the DMTCP module for managing sessions
These core checkpoint–restart services are further extended to provide the user with the concept of multiple sessions. A checkpointed Python session is given a unique session id to distinguish it from other sessions. When running interactively, the user can view the list of available checkpointed sessions. The current session can be replaced by any of the existing session using the session identifier.
The application can programmatically revert to an earlier session as shown in the following example:
import dmtcp
...
sessionId1 = dmtcp.checkpoint()
...
sessionId2 = dmtcp.checkpoint()
...
if some_condition:  # placeholder: the application-specific test is elided in the source
    dmtcp.restore(sessionId2)
else:
    dmtcp.restore(sessionId1)
Note that only the session id is used to restore a previous session. It is also possible to enhance the DMTCP module to pass extra arguments to the restore function. Those extra arguments can be made available to the dmtcp.isRestart() path. The application can thus take a different branch instead of following the same route as before.
2.3. Save-restore for IPython sessions
To checkpoint an IPython session, one must consider the configuration files. The configuration files are typically stored in the user's home directory. During restart, if the configuration files are missing, the restarted computation may fail to continue. Thus, DMTCP must checkpoint and restore all the files required for proper restoration of an IPython session.
Attempting to restore all configuration files during restart poses yet another problem: the existing configuration files might have newer contents. Overwriting these newer files with copies from the checkpoint time may result in the loss of important changes.
To avoid overwriting the existing configuration files, the files related to an IPython session are restored in a temporary directory. Whenever the IPython shell attempts to open a file in the original configuration directory, the filepath is updated to point to the temporary directory. Thus, the files in the original configuration directory are never modified. Further, the translation from original to temporary path is transparent to the IPython shell.
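The path translation described above can be sketched in a few lines. This is only an illustration of the mapping, under the assumption that every path inside the original configuration directory is redirected to the same relative location in the temporary directory; the function name `translate_path` and the directory names in the usage example are hypothetical, and DMTCP performs the real translation inside a plugin rather than in Python.

```python
import os.path


def translate_path(path, original_dir, temp_dir):
    """Redirect `path` into temp_dir if it lies inside original_dir."""
    path = os.path.abspath(path)
    original_dir = os.path.abspath(original_dir)
    temp_dir = os.path.abspath(temp_dir)
    # commonpath avoids treating /home/u/.ipython2 as inside /home/u/.ipython
    if os.path.commonpath([path, original_dir]) != original_dir:
        return path  # outside the configuration directory: leave untouched
    relative = os.path.relpath(path, original_dir)
    return os.path.normpath(os.path.join(temp_dir, relative))


# Hypothetical directories, chosen only for illustration.
print(translate_path("/home/u/.ipython/profile_default/history.sqlite",
                     "/home/u/.ipython", "/tmp/restore"))
print(translate_path("/etc/hosts", "/home/u/.ipython", "/tmp/restore"))
```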
2.4. Save-restore for parallel IPython sessions
DMTCP is capable of checkpointing distributed computations with processes running on multiple nodes. It automatically checkpoints and restores various kinds of inter-process communication mechanisms such as shared memory, message queues, pseudo-ttys, pipes and network sockets.
An IPython session involving a distributed computation running on a cluster is checkpointed as a single unit. DMTCP allows restarting the distributed processes in a different configuration than the original. For example, all the processes can be restarted on a single computer for debugging purposes. In another example, the computation may be restarted on a different cluster altogether.
3. Checkpointing Python-based graphics
Python is popular for scientific visualizations. It is possible to checkpoint a Python session with active graphics windows by using VNC. DMTCP supports checkpoint–restart of the VNC server. A recent work by Nafchi et al [14] details how to checkpoint hardware-accelerated 3D graphics using DMTCP. Here we focus on the VNC approach.
The DMTCP module can automatically start a VNC server. The process environment is modified to allow the Python interpreter to communicate with the VNC server instead of the X-window server. For visualization, a VNC client can be launched automatically to display the graphical window. During checkpoint, the VNC server is checkpointed as part of the computation, while the VNC client is not. During restart, the Python session and the VNC server are restored from their checkpoint images, and a fresh VNC client is launched. This VNC client communicates with the restored server and displays the graphics to the end user.
import dmtcp
...
# Start VNC server
dmtcp.startGraphics()
...
# Start VNC viewer
dmtcp.showGraphics()
# generate graphics (will be shown in the VNC viewer)
To understand the algorithm behind the code, we recall some VNC concepts. X-window supports multiple virtual screens. A VNC server creates a new virtual screen. The graphics contained in the VNC server are independent of any X-window screen. The VNC server process persists as a daemon. A VNC viewer displays a specified virtual screen in a window in a console. When Python generates graphics, the graphics are sent to the virtual screen specified by the environment variable $DISPLAY.
The command dmtcp.startGraphics() creates a new X-window screen by creating a new VNC server and sets the $DISPLAY environment variable to the new virtual screen. All Python graphics are now sent to this new virtual screen. The additional screen is invisible to the Python user until the Python command dmtcp.showGraphics() is given. The Python command dmtcp.showGraphics() operates by invoking a VNC viewer.
At the time of checkpoint, the VNC server process is checkpointed along with the Python interpreter, while the VNC viewer is not checkpointed.
On restart, the VNC server detects the stale connection to the old VNC viewer. The VNC server perceives this as the VNC viewer process having died. The DMTCP module then launches a new VNC viewer to connect to the VNC server.
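The role of $DISPLAY can be illustrated without starting a VNC server. The following sketch only shows the redirection mechanism: a child process is given a copy of the environment in which DISPLAY names a different virtual screen. The helper names and the screen number ':7' are assumptions made for the example, and nothing here reflects how dmtcp.startGraphics() is actually implemented.

```python
import os
import subprocess
import sys


def env_for_display(display, base_env=None):
    """Return a copy of the environment with DISPLAY pointing at `display`."""
    env = dict(os.environ if base_env is None else base_env)
    env["DISPLAY"] = display
    return env


def display_seen_by_child(display):
    """Launch a child Python process and report the DISPLAY it inherits."""
    result = subprocess.run(
        [sys.executable, "-c", "import os; print(os.environ['DISPLAY'])"],
        env=env_for_display(display),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


# ':7' is an arbitrary screen number; no X or VNC server needs to exist,
# because the child only reads the variable and never opens the display.
print(display_seen_by_child(":7"))
```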
4. Checking Cython with multiple CPython instances
A common problem for compiled versions of Python such as Cython [6] is how to check whether the compiled computation is faithful to the interpreted computation. Runtime errors can occur if the compiled code assumes a particular C type, and the computation violates that assumption for a particular input. Thus, one has to choose between speed of computation and a guarantee that the compiled computation is faithful to the interpreted computation.
A typical scenario might be a case in which the compiled Cython version ran for hours and produced an unexpected answer. One wishes to also check the answer in a matter of hours, but pure Python (CPython) would take much longer.
Informally, the solution is known as a fast/slow technique. There is one fast process (Cython), whose correctness is checked by multiple slow processes (CPython). The core idea is to run the compiled code, while creating checkpoint images at regular intervals. A compiled computation interval is checked by copying the two corresponding checkpoints (at the beginning and end of the interval) to a separate computer node for checking. The computation is restarted from the first checkpoint image, on the checking node, but when the computation is first restarted, the variables for all user Python functions are set to the interpreted function object. The interval of computation is then re-executed in interpreted mode until the end of the computation interval. The results at the end of that interval can then be compared to the results at the end of the same interval in compiled mode.
Figure 1 illustrates the above idea. A similar idea has been used by [11] for distributed speculative parallelization.
Figure 1. Fast Cython with slow CPython 'checking' nodes.
Note that in order to compare the results at the end of a computation interval, it is important that the interpreted version on the checker node stop exactly at the end of the interval, in order to compare with the results from the checkpoint at the end of the same interval. The simplest way to do this is to add a counter to a frequently called function of the end-user code. The counter is incremented each time the function is called. When the counter reaches a pre-arranged multiple (for example, after every million calls), the compiled version can generate a checkpoint and write to a file the values of variables indicating the state of the computation. The interpreted version writes to a file the values of variables indicating its own state of the computation.
import sys
import dmtcp

mycounter = 0

def freq_called_user_fnc():
    global mycounter
    mycounter += 1
    if mycounter % 1000000 == 0:
        # if running as Cython (compiled functions have the type of a built-in):
        if type(freq_called_user_fnc) == type(len):
            # write curr. program state to cython.log
            dmtcp.checkpoint()
            if dmtcp.isRestart():
                # On restart from ckpt image, switch to pure Python.
                pass
        else:  # else running as pure Python
            # write curr. program state to purePython.log
            sys.exit(0)
    ...
    # original body of freq_called_user_fnc
    return
The above code block illustrates the principles. One compares cython.log and purePython.log to determine if the compiled code was faithful to the interpreted code. If the Cython code consists of direct C calls between functions, then it will also be necessary to modify the functions of the C code generated by Cython, to force them to call the pure Python functions on restart after a checkpoint.
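The checking scheme itself can be simulated in a single process. In the sketch below, in-memory snapshots stand in for checkpoint images, and two toy step functions stand in for the compiled and interpreted code; all names are hypothetical. The 'fast' step deliberately truncates to 16 bits to mimic an incorrectly declared C type, while the 'slow' step uses Python's arbitrary-precision integers.

```python
def run_with_snapshots(step, state, n_steps, interval):
    """Run `step` n_steps times, recording the state every `interval` steps."""
    snapshots = [state]
    for i in range(1, n_steps + 1):
        state = step(state)
        if i % interval == 0:
            snapshots.append(state)
    return snapshots


def first_bad_interval(snapshots, reference_step, interval):
    """Re-run each interval with the trusted step; return the first mismatch."""
    for k in range(len(snapshots) - 1):
        state = snapshots[k]          # "restart" from the earlier snapshot
        for _ in range(interval):
            state = reference_step(state)
        if state != snapshots[k + 1]:
            return k
    return None


def slow_step(x):
    return x * 3 + 1                  # arbitrary-precision Python integers


def fast_step(x):
    return (x * 3 + 1) & 0xFFFF       # mimics a too-narrow 16-bit C type


snaps = run_with_snapshots(fast_step, 1, n_steps=20, interval=5)
print(first_bad_interval(snaps, slow_step, interval=5))  # prints 1
```

Because each interval is checked starting from the fast run's own snapshot, the intervals are independent of one another, which is what allows them to be farmed out to separate checking nodes. The states here are immutable integers; mutable states would need to be copied when a snapshot is taken.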
5. Reversible debugging with FReD
While debugging a program, the programmer often oversteps and has to restart the debugging session. For example, if the programmer steps over a function f() (by issuing the next command inside the debugger) only to determine that the bug is in f() itself, he or she is left with no choice but to restart from the beginning.
Reversible debugging is the capability to run an application 'backwards' in time inside a debugger. If the programmer detects that the problem is in function f(), then instead of restarting from the beginning, the programmer can issue a reverse-next command, which takes the debugger back to the previous step. He or she can then issue a step command to step into the function in order to find the problem.
FReD [2, 3] is a reversible debugger based on checkpoint–restart. FReD is implemented as a set of Python scripts and uses DMTCP to create checkpoints during the debugging session. FReD also keeps track of the debugging history. Figure 2 shows the architecture of FReD.
Figure 2. Fast reversible debugger.
5.1. A simple UNDO command
The UNDO command reverses the effect of a previous debugger command such as next, continue or finish. This is the most basic of reversible debugging commands.
The functionality of the UNDO command for debugging Python is trivially implemented. A checkpoint is taken at the beginning of the debugging session, and a list of all debugging commands issued since the checkpoint is recorded.
To execute the UNDO command, the debugging session is restarted from the checkpoint image, and the debugging commands are automatically re-executed from the list excluding the last command. This takes the process back to before the debugger command was issued.
In longer debugging sessions, checkpoints are taken at frequent intervals to reduce the time spent in replaying the debugging history.
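The checkpoint-and-replay idea behind UNDO can be modeled in a few lines. The sketch below is a toy model and not FReD's implementation: `copy.deepcopy` stands in for a checkpoint image, debugger commands are modeled as functions that mutate a state dictionary, and the class and function names are hypothetical.

```python
import copy


class ReplaySession:
    """Toy model: one checkpoint plus a replayable command history."""

    def __init__(self, state):
        self.state = state
        self.checkpoint = copy.deepcopy(state)  # taken at session start
        self.history = []

    def execute(self, command):
        """Apply a command (a function mutating the state) and log it."""
        command(self.state)
        self.history.append(command)

    def undo(self):
        """Restart from the checkpoint and replay all but the last command."""
        if not self.history:
            return
        replay = self.history[:-1]
        self.state = copy.deepcopy(self.checkpoint)
        self.history = []
        for command in replay:
            self.execute(command)


def next_line(state):
    state["line"] += 1  # stand-in for the debugger's 'next' command


session = ReplaySession({"line": 1})
for _ in range(3):
    session.execute(next_line)
session.undo()
print(session.state["line"], len(session.history))  # prints: 3 2
```

The cost of `undo` grows with the length of the history, which is why taking further checkpoints during a long session shortens the replay.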
5.2. More complex reverse commands
Figure 3 shows some typical debugging commands being executed in forward as well as backward direction in time.
Figure 3. Reverse commands.
Suppose that the debugging history appears as [next,next] i.e. the user issued two next commands. Further, the second next command stepped over a function f(). Suppose further that FReD takes checkpoints before each of these commands. In this situation, the implementation of a reverse-next command is trivial: one restarts from the last checkpoint image. However, if the command issued were reverse-step, simply restarting from the previous checkpoint would not suffice.
In this last case, the desired behavior is to take the debugger to the last statement of the function f(). In such a situation one needs to decompose the last command into a series of commands. At the end of this decomposition, the last command in the history is a step. At this point, the history may appear as: [next,step,next, ...,next,step]. The process is then restarted from the last checkpoint and the debugging history is executed excluding the last step command. Decomposing a command into a series of commands terminating with step is non-trivial, and an algorithm for that decomposition is presented in [28].
5.2.1. Debugging Python code with FReD
FReD defines a set of commands for interacting with the reversible debugger. These include high-level UNDO and reverse commands as well as low-level commands that allow the user to interact with the checkpointer itself.
FReD intercepts and logs the debugging commands typed by the user. In addition to logging the commands, it also executes the FReD-specific commands (e.g., reverse commands). Commands that are not FReD-specific are forwarded to the underlying debugger.
The reverse commands are only available after a checkpoint has been taken by FReD. Thus, in a typical debugging session, the user stops the program at the very beginning (i.e. at the main routine) and asks for a checkpoint using the fred-checkpoint command. From this point onwards, normal debugging commands may be used along with the reverse commands. The following listing shows a typical debugging session with FReD:
# fredapp.py python -mpdb a.py
(Pdb) break main
(Pdb) run
(Pdb) fred-checkpoint
(Pdb) break 6
(Pdb) continue
(Pdb) fred-history
[break 6, continue]
(Pdb) fred-reverse-next
(Pdb) fred-history
[break 7, next, next, next, next, next, next, next,
next, next, next, step, next, next, next, where]
Note that once the reverse-next command is executed in the above debugging session, the command history inside pdb is modified to reflect the decomposed continue command. This shows the path taken by the debugger to reach the current state in the Python program.
5.3. Reverse expression watchpoints
The reverse expression watchpoint [3] automatically finds the location of the fault for a given expression in the history of the program execution. It brings the user directly to a statement (one that is not a function call) at which the expression is correct, but executing the statement will cause the expression to become incorrect.
Figure 4 provides a simple example. Assume that a bug occurs whenever a linked list has a length longer than one million. So the expression linked_list.len() <= 1000000 is assumed to be true throughout. Assume that it is too expensive to frequently compute the length of the linked list, since this would require O(n^2) time in what would otherwise be an O(n) time algorithm. (A more sophisticated example might consider a bug in an otherwise duplicate-free linked list or an otherwise cycle-free graph. But the current example is chosen for ease of illustrating the ideas.)
Figure 4. Reverse expression watchpoint.
If the length of the linked list is less than or equal to one million, we will call the expression 'good'. If the length of the linked list is greater than one million, we will call the expression 'bad'. A 'bug' is defined as a transition from 'good' to 'bad'. There may be more than one such transition or bug over the process lifetime. Our goal is simply to find any one occurrence of the bug.
The core of a reverse expression watchpoint is a binary search. In figure 4, assume a checkpoint was taken near the beginning of the time interval. So, we can revert to any point in the illustrated time interval by restarting from the checkpoint image and re-executing the history of debugging commands until the desired point in time.
Since the expression is 'good' at the beginning of figure 4 and it is 'bad' at the end of that figure, a buggy statement must exist—a statement exhibiting the transition from 'good' to 'bad'. A standard binary search algorithm converges to a case in which the current statement is 'good' and the next statement transitions from 'good' to 'bad'. By the earlier definition of a 'bug', FReD has found a statement with a bug. This represents success.
If implemented naively, this binary search requires that some statements may need to be re-executed up to log N times, once per phase of a binary search over N statements. However, FReD can also create intermediate checkpoints. In the worst case, one can form a checkpoint at each phase of the binary search. In that case, no particular sub-interval over the time period needs to be executed more than twice.
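The binary search at the core of the reverse expression watchpoint can be sketched as follows. This is the naive variant, with no intermediate checkpoints: every probe replays from the start. The function `replay(i)` is a hypothetical stand-in for 'restart from the checkpoint image and re-execute i steps of the history', and the toy program and predicate are ours.

```python
def reverse_watch(replay, is_good, n_steps):
    """Return a step i where the state is good after i steps, bad after i + 1.

    replay(i) must return the program state after i steps from the
    checkpoint; the state must be good at step 0 and bad at step n_steps.
    """
    lo, hi = 0, n_steps  # invariant: good at lo, bad at hi
    if not is_good(replay(lo)) or is_good(replay(hi)):
        raise ValueError("need a good state at the start and a bad one at the end")
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_good(replay(mid)):
            lo = mid
        else:
            hi = mid
    return lo


def replay(i):
    # Toy "program": a counter that is incremented once per step.
    return {"num": i}


step = reverse_watch(replay, lambda s: s["num"] < 5, n_steps=10)
print(step, replay(step)["num"], replay(step + 1)["num"])  # prints: 4 4 5
```

The loop invariant (good at `lo`, bad at `hi`) guarantees that a good-to-bad transition is found even if the expression changes value several times in the interval, although not necessarily the first such transition.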
A typical use of reverse-expression-watchpoint is shown in the following listing:
# ./fredapp.py python -mpdb ./autocount.py
import sys, time
(Pdb) break 21
Breakpoint 1 at /home/kapil/fred/autocount.py:21
(Pdb) continue
/home/kapil/fred/autocount.py(21)
# Required for fred-reverse-watch
(Pdb) fred-checkpoint
(Pdb) break 28
Breakpoint 2 at /home/kapil/fred/autocount.py:28
(Pdb) continue
... ...
/home/kapil/fred/autocount.py(28) ()
(Pdb) print num
10
(Pdb) fred-reverse-watch num < 5
(Pdb) print num
4
(Pdb) next
(Pdb) print num
5
The user puts a breakpoint toward the end of the program to determine the value of the variable num. The current value of num is '10' and the user wants to get to the point in time where the expression 'num < 5' turned 'false' for the first time. The command fred-reverse-watch takes the program to the point where the current value of num is '4' and executing the 'next' command would change the value to '5'.
6. Related work
Checkpoint–restart has a long history, with early applications concentrating on high performance computing. Transparent checkpointing (sometimes called system-level checkpointing) refers to checkpointing packages that do not require modifications to their target applications. Transparent checkpoint–restart has existed at least since 1990 [16] and was popularized in the libckpt system [22]. Another early checkpointing package, ckpt [29], is still in use in the Vanilla Universe of the Condor high-throughput computing system. However, that package is restricted to single-threaded programs. BLCR [12] is implemented through a Linux kernel module, which can checkpoint process trees. CryoPid [19] uses the Linux ptrace system call (similar to the strategy of the GDB debugger) to place the target process under the control of a 'superior' process. CRIU [9] also uses the 'ptrace' system call to transparently checkpoint processes running inside a Linux container. Linux containers [17] are lightweight virtual machines with private namespaces for kernel objects. The Zap system [20] implements a kernel module providing a thin virtualization layer to support pods, groups of processes with a consistent virtualized view, although Zap does not appear to currently be in wide use.
In contrast to transparent checkpointing, application-level checkpointing places some burden on the programmer to add to each application some information for checkpointing that particular application. Due to the difficulty of distributed checkpointing, the earliest examples used the application-level approach. Application-level checkpointing for distributed programs dates back at least to 1997 [4]. Bronevetsky et al produced a non-blocking application-level checkpointing design for the special case of Message Passing Interface (MPI) [7]. Even today, most transparent checkpointing packages that support distributed checkpointing do so for the special case of MPI programs. They modify the MPI internals to 'tear down' any network connections just before checkpointing, then use a single-host checkpointing package for the actual checkpoint, and then rebuild the network connections.
Several earlier packages used a kernel-level approach (kernel modules or a modified operating system kernel) to support native transparent checkpointing [12, 13, 15, 20, 27], although those packages are no longer in common use. DMTCP appears to be the first package to natively support transparent distributed checkpointing in user-space [1]. Similarly, DMTCP appears to be the first package to natively support transparent distributed checkpointing over the high performance InfiniBand network, which is now increasingly becoming a standard for computer clusters [8].
While native distributed checkpointing (no dependence on MPI) is a strict prerequisite for the work described here, the ability to extend the checkpointing functionality using plugins is an important practical prerequisite. Without that, the sheer magnitude of maintaining a large monolithic software package would threaten maintainability. Plugins (see section
7. Conclusion
DMTCP is a widely used standalone checkpoint–restart package. We have shown that it can be closely integrated with Python. Specifically, we have demonstrated parallel sessions with IPython, alternating interpreted and compiled execution modes, graphics, and a Python debugger enhanced with reversibility. The implementation can be extended by end users to augment the capabilities of Python beyond the simple example of checkpoint–restart.
Acknowledgments
This work was partially supported by grants from the National Science Foundation (OCI-0960978) and the Intel Corporation.
Appendix A.: Background of DMTCP
DMTCP [1] is a transparent checkpoint–restart package with its roots going back eight years [26]. It works completely in user space and does not require any changes to the application or the operating system. DMTCP can be used to checkpoint a variety of user applications including Python.
DMTCP automatically tracks all local and remote child processes and their relationships.
As seen in figure A1, a computation running under DMTCP consists of a centralized coordinator process and several user processes. The user processes may be local or distributed. User processes may communicate with each other using sockets, shared-memory, pseudo-terminals, etc. Further, each user process has a checkpoint thread that communicates with the coordinator.
Figure A1. Architecture of DMTCP.
A.1. DMTCP plugins
DMTCP plugins are used to keep DMTCP modular. There is a separate plugin for each operating system resource. Examples of plugins are pid plugin, socket plugin, and file plugin. Plugins are responsible for checkpointing and restoring the state of their corresponding resources.
The execution environment can change between checkpoint and restart. For example, the computation might be restarted on a different computer which has different file mount points, a different network address, etc. Plugins handle such changes in the execution environment by virtualizing these aspects. DMTCP provides a short tutorial for writers of third-party plugins [10].
A.2. DMTCP coordinator
DMTCP uses a stateless centralized process, the DMTCP coordinator, to synchronize checkpoint and restart between distributed processes. The user interacts with the coordinator through the console to initiate checkpoint, check the status of the computation, kill the computation, etc. It is also possible to run the coordinator as a daemon process, in which case the user may communicate with the coordinator using the command 'dmtcp_command'.
A.3. Checkpoint thread
The checkpoint thread waits for a checkpoint request from the DMTCP coordinator. On receiving the checkpoint request, the checkpoint thread quiesces the user threads and creates the checkpoint image. To quiesce user threads, it installs a signal handler for a dedicated POSIX signal (by default, SIGUSR2). Once the checkpoint image has been created, the user threads are allowed to resume executing application code. Similarly, during restart, once the process memory has been restored, the user threads can resume executing application code.
A.4. Checkpoint
On receiving the checkpoint request from the coordinator, the checkpoint thread sends the checkpoint signal to all the user threads of the process. This quiesces the user threads by forcing them to block inside a signal handler defined by DMTCP. The checkpoint image is created by writing all of user-space memory to a checkpoint image file. Each process has its own checkpoint image. Prior to checkpoint, each plugin will have copied into user-space memory any kernel state associated with its concerns. Examples of such concerns include network sockets, files, and pseudo-terminals. Once the checkpoint image has been created, the checkpoint thread 'un-quiesces' the user threads and they resume executing application code.
At the time of checkpoint, all of user-space memory is written to a checkpoint image file. The user threads are then allowed to resume execution. Note that user-space memory includes all of the run-time libraries (libc, libpthread, etc), which are also saved in the checkpoint image.
In some cases, state outside the kernel must be saved. For example, in handling network sockets, data in flight must be saved. This is done by draining the network data by sending a special cookie through the 'send' end of each socket in one phase. In a second phase, after a global barrier, data is read from the 'receive' end of each socket until the special cookie is received. The in-flight data has now been copied into user-space memory, and so will be included in the checkpoint image. On restart, the network buffers are refilled by sending the in-flight data back to the peer process, which then sends the data back into the network.
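The drain-and-refill protocol can be simulated within one process using a connected socket pair. The sketch below is an illustration under simplifying assumptions: the cookie value and function names are hypothetical, both ends live in the same process, and the global barrier between the two phases is implicit in the sequential code. DMTCP's actual implementation lives in its socket plugin and operates across processes.

```python
import socket

COOKIE = b"<<DRAIN-COOKIE>>"  # hypothetical marker; DMTCP's real cookie differs


def drain(sender, receiver):
    """Send the cookie, then read from the peer until the cookie arrives."""
    sender.sendall(COOKIE)                # phase 1: mark the end of the stream
    buffered = b""
    while not buffered.endswith(COOKIE):  # phase 2: pull data out of the kernel
        chunk = receiver.recv(4096)
        if not chunk:
            raise ConnectionError("peer closed before the cookie arrived")
        buffered += chunk
    return buffered[: -len(COOKIE)]       # in-flight data, now in user space


def refill(sender, receiver, in_flight):
    """After restart: hand the data back to the sender, which re-sends it."""
    if not in_flight:
        return
    receiver.sendall(in_flight)           # receiver -> peer (reverse direction)
    returned = b""
    while len(returned) < len(in_flight):
        chunk = sender.recv(4096)
        if not chunk:
            raise ConnectionError("peer closed during refill")
        returned += chunk
    sender.sendall(returned)              # peer -> network, original direction


a, b = socket.socketpair()
a.sendall(b"hello, world")                # still in flight at checkpoint time
saved = drain(a, b)                       # would be stored in the image
refill(a, b, saved)                       # would run after restart
redelivered = b.recv(4096)
print(saved, redelivered)
a.close()
b.close()
```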
A.5. Restart
As the first step of the restart phase, all memory areas of the process are restored. Next, the user threads are recreated. The plugins then receive the restart notification and restore their underlying resources, translation tables, etc. Finally, the checkpoint thread 'un-quiesces' the user threads and the user threads resume executing application code.
Appendix B.: Installing and using DMTCP/FReD
DMTCP: In order to use FReD, one needs to download and build DMTCP as follows:
# tar xvf dmtcp-x.y.tar.gz
# cd dmtcp-x.y
# ./configure
# make
# make install (optional)
FReD: Finally, FReD can be downloaded and built as follows:
# git clone git@github.com:fred-dbg/fred.git
# cd fred
# (cd record-replay; ./configure --with-dmtcp-root=DMTCP_ROOT; make)
(where DMTCP_ROOT is the root directory of DMTCP)
# make check (optional)