Restore Socket after checkpoint


 
Old 09-24-2012
Hello,

I have checkpointed a client/server application written in C with BLCR (Berkeley Lab Checkpoint/Restart). After a failure, I'd like to restart the server (server.blcr) and the client (client.blcr), but I need to recreate the sockets between the new client and the new server. Do you have any idea how to do this, please?

Thank you so much.
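
One possible direction, sketched here under assumptions rather than offered as a confirmed BLCR recipe: as far as I understand, BLCR does not save or restore open sockets, so after a restart the client has to treat its old descriptor as dead and connect again. The helper below is plain POSIX C; the name reconnect_socket, its parameters, and the retry policy are all hypothetical and not part of BLCR. Where it gets called from (for example, a restart callback registered through BLCR's library) is left open.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <netdb.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of BLCR): drop a socket that went stale
 * across a checkpoint/restart and open a fresh TCP connection.
 * Returns the new descriptor, or -1 if every attempt failed.
 */
int reconnect_socket(int stale_fd, const char *host, const char *port,
                     int max_tries, unsigned int delay_sec)
{
    assert(host != NULL && port != NULL);

    if (stale_fd >= 0)
        close(stale_fd);             /* the old connection no longer exists */

    struct addrinfo hints;
    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;     /* IPv4 or IPv6 */
    hints.ai_socktype = SOCK_STREAM; /* TCP */

    for (int attempt = 0; attempt < max_tries; attempt++) {
        struct addrinfo *res = NULL;

        if (getaddrinfo(host, port, &hints, &res) == 0) {
            /* Try each resolved address until one accepts the connection. */
            for (struct addrinfo *ai = res; ai != NULL; ai = ai->ai_next) {
                int fd = socket(ai->ai_family, ai->ai_socktype,
                                ai->ai_protocol);
                if (fd < 0)
                    continue;
                if (connect(fd, ai->ai_addr, ai->ai_addrlen) == 0) {
                    freeaddrinfo(res);
                    return fd;       /* connected */
                }
                close(fd);
            }
            freeaddrinfo(res);
        }

        if (attempt + 1 < max_tries)
            sleep(delay_sec);        /* the server may still be restarting */
    }
    return -1;
}
```

On the server side the mirror image would apply: the restarted server creates a new listening socket (socket, bind, listen) and accepts the client's new connection. Any application-level session state that lived on the old connection has to be re-established by the two programs themselves.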
 
OPAL_CRS(7)							     Open MPI							       OPAL_CRS(7)

NAME
OPAL_CRS - Open PAL MCA Checkpoint/Restart Service (CRS): Overview of Open PAL's CRS framework, and selected modules. Open MPI 1.4.5.

DESCRIPTION
Open PAL can involuntarily checkpoint and restart sequential programs. Doing so requires that Open PAL was compiled with thread support and that the back-end checkpointing systems are available at run-time.

Phases of Checkpoint / Restart
Open PAL defines three phases for checkpoint / restart support in a process:

Checkpoint
    When the checkpoint request arrives, the process is notified of the request before the checkpoint is taken.

Continue
    After a checkpoint has successfully completed, the process that was checkpointed is notified of its successful continuation of execution.

Restart
    After a checkpoint has successfully completed, a new / restarted process is notified of its successful restart.

The Continue and Restart phases are identical except for the process in which they are invoked. The Continue phase is invoked in the same process in which the Checkpoint phase was invoked. The Restart phase is only invoked in newly restarted processes.

GENERAL PROCESS REQUIREMENTS
In order for a process to use the Open PAL CRS components it must adhere to a few programmatic requirements.

First, the program must call OPAL_INIT early in its execution. This should only be called once, and it is not possible to checkpoint the process without it first having called this function.

The program must call OPAL_FINALIZE before termination. This does a significant amount of cleanup. If it is not called, then it is very likely that remnants are left in the filesystem.

To checkpoint and restart a process you must use the Open PAL tools to do so. Using the backend checkpointer's checkpoint and restart tools will lead to undefined behavior. To checkpoint a process use opal_checkpoint (opal_checkpoint(1)). To restart a process use opal_restart (opal_restart(1)).

AVAILABLE COMPONENTS
Open PAL ships with two CRS components: self and blcr.

The following MCA parameters apply to all components:

crs_base_verbose
    Set the verbosity level for all components. Default is 0, or silent except on error.

crs_base_snapshot_dir
    The directory to store the checkpoint snapshots. Default is /tmp.

self CRS Component
The self component invokes user-defined functions to save and restore checkpoints. It is simply a mechanism for user-defined functions to be invoked at Open PAL's Checkpoint, Continue, and Restart phases. Hence, the only data that is saved during the checkpoint is what is written in the user's checkpoint function. No library state is saved at all.

As such, the model for the self component is slightly different than for other components. Specifically, the Restart function is not invoked in the same process image of the process that was checkpointed. The Restart phase is invoked during OPAL_INIT of the new instance of the application (i.e., it starts over from main()).

The self component has the following MCA parameters:

crs_self_prefix
    Specify a string prefix for the name of the checkpoint, continue, and restart functions that Open PAL will invoke during the respective stages. That is, specifying "-mca crs_self_prefix foo" means that Open PAL expects to find three functions at run-time:

        int foo_checkpoint()
        int foo_continue()
        int foo_restart()

    By default, the prefix is set to "opal_crs_self_user".

crs_self_priority
    Set the self component's default priority.

crs_self_verbose
    Set the verbosity level. Default is 0, or silent except on error.

crs_self_do_restart
    This is mostly internally used. A general user should never need to set this value. This is set to non-0 when the new process should invoke the restart callback in OPAL_INIT. Default is 0, or normal execution.

blcr CRS Component
The Berkeley Lab Checkpoint/Restart (BLCR) single-process checkpointer is a software system developed at Lawrence Berkeley National Laboratory. See the project website for more details:

    http://ftg.lbl.gov/CheckpointRestart/CheckpointRestart.shtml

The blcr component has the following MCA parameters:

crs_blcr_priority
    Set the blcr component's default priority.

crs_blcr_verbose
    Set the verbosity level. Default is 0, or silent except on error.

none CRS Component
The none component simply selects no CRS component. All of the CRS function calls return immediately with OPAL_SUCCESS.

This component is the last component to be selected by default. This means that if another component is available, and the none component was not explicitly requested, then OPAL will attempt to activate all of the available components before falling back to this component.

SEE ALSO
opal_checkpoint(1), opal_restart(1)

1.4.5                                Feb 10, 2012                                OPAL_CRS(7)
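
As an illustration of the self component described under AVAILABLE COMPONENTS, the following is a minimal sketch of the three functions Open PAL would look for when run with "-mca crs_self_prefix foo". Only the three function names come from the text above. The saved variable, the snapshot file path, and the convention that 0 means success and -1 means failure are assumptions made for this example; the manual page does not specify the return values.

```c
#include <stdio.h>

/* Hypothetical application state and snapshot path, for illustration only. */
static long iteration = 0;
static const char *state_path = "/tmp/foo_app_state.txt";

/* Invoked before the checkpoint is taken: write out whatever must survive.
 * Returning 0 for success is an assumption, not documented behavior. */
int foo_checkpoint(void)
{
    FILE *fp = fopen(state_path, "w");
    if (fp == NULL)
        return -1;
    int ok = fprintf(fp, "%ld\n", iteration) > 0;
    if (fclose(fp) != 0)
        ok = 0;
    return ok ? 0 : -1;
}

/* Invoked in the original process after the checkpoint: nothing to reload. */
int foo_continue(void)
{
    return 0;
}

/* Invoked in the new process during OPAL_INIT: reload the saved state. */
int foo_restart(void)
{
    FILE *fp = fopen(state_path, "r");
    if (fp == NULL)
        return -1;
    int ok = fscanf(fp, "%ld", &iteration) == 1;
    fclose(fp);
    return ok ? 0 : -1;
}
```

Because the self component saves no library state, anything the program needs after a restart, including network connections, would have to be rebuilt explicitly, and the restart function is a natural place to do that.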