process checkpointing


 
# 1  
Old 08-05-2011

How is process checkpointing carried out? I want the detailed steps to carry out process checkpointing, with a description of each file involved in it, such as the core dump file, what the ELF header is, etc.
# 2  
Old 08-05-2011
One firm doing checkpointing wrote a wrapper for libc so they could know about every write, read, open, close, etc. Usually, it is easier for apps to checkpoint themselves at convenient points. Some processes mini-batch, and each batch is committed very quickly, so that work is done and everything else is undone. For instance, insert rather than update, and create new files rather than update files in place.

Good luck!
# 3  
Old 08-06-2011
Actually I want the steps, or you could say a script, to carry out process checkpointing. I just know what process checkpointing is but don't know how to implement it. Can you please explain the methodology for process checkpointing? I mean, how is it carried out?
# 4  
Old 08-09-2011
Suppose you are processing an input file, updating a database and making a report. At some point, you record how many input records are processed and how many report lines or bytes are written, and commit your updates in the database. Now you can restart, seeking to the output file offset saved at the last checkpoint and writing from there, and processing input starting at the saved record count. Your only vulnerability is the time from the start of writing the checkpoint to the commit. Discard incomplete restart point records, as the commit has not happened. File writes need to be flushed to disk with fsync() at commit time. In fact, the safest place to put the checkpoint is in the db, so that it does not exist for the future until the commit. If your process has 2 or more databases, there is a 2 phase commit, prepare and commit, to minimize the window in time where one db is committed and the second is not.
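
As a minimal sketch of the restart scheme above, assuming Python with its bundled sqlite3 module standing in for the database: the count of processed input records is kept in the database and committed in the same transaction as the updates. The names run_batch, ledger, checkpoint and fail_at are made up for illustration, and the report file is left out.

```python
import sqlite3

def run_batch(db_path, records, batch=2, fail_at=None):
    """Load records, committing data and progress together every `batch` rows.

    Returns the record offset this run started from.
    """
    con = sqlite3.connect(db_path)
    try:
        con.execute("CREATE TABLE IF NOT EXISTS ledger (key TEXT, amount INTEGER)")
        con.execute("CREATE TABLE IF NOT EXISTS checkpoint (done INTEGER)")
        if con.execute("SELECT COUNT(*) FROM checkpoint").fetchone()[0] == 0:
            con.execute("INSERT INTO checkpoint VALUES (0)")
        con.commit()

        # Restart point: how many input records are already committed.
        done = con.execute("SELECT done FROM checkpoint").fetchone()[0]
        for i in range(done, len(records)):
            if i == fail_at:
                raise RuntimeError("simulated crash")  # hypothetical failure
            con.execute("INSERT INTO ledger VALUES (?, ?)", records[i])
            if (i + 1) % batch == 0 or i + 1 == len(records):
                # The checkpoint rides in the same transaction as the rows,
                # so it only becomes visible if the commit succeeds.
                con.execute("UPDATE checkpoint SET done = ?", (i + 1,))
                con.commit()
        return done
    except Exception:
        con.rollback()  # discard work done since the last checkpoint
        raise
    finally:
        con.close()
```

If a run fails partway through, the rows inserted since the last commit are rolled back, and the next call picks up from the stored count instead of from record zero.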

A lot of the value of an rdbms or a middleware is maintaining transactional boundaries (checkpoints).

Some actions are not repeatable, and to have checkpoints, sometimes a process needs to be redesigned so that all actions can be rolled back to the start of the transaction.

Sometimes you can design the process so that repeated actions due to a restart do not damage the outputs. For instance, for each input record, a row is inserted with a timestamp. If you rerun after inserting for half a file, there are extra rows with older timestamps but no new information. A cleanup process can detect them and remove them. Or, the loader can determine that the last insert for the key had identical information, and discard records that change nothing without doing an insert. In batch, you might do a first phase to find all differences, and a second phase to apply them. If a rerun of an interrupted second phase is necessary, or the whole file is rerun by accident, there is less or no phase 2 data, as the file is already partly or completely applied.
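
The two-phase batch idea can be sketched like this, assuming plain Python dicts stand in for the database table and the input file. The function and variable names are invented for the example.

```python
def find_differences(current, incoming):
    """Phase 1: keep only the records that would actually change something."""
    return {key: value for key, value in incoming.items()
            if key not in current or current[key] != value}

def apply_differences(current, differences):
    """Phase 2: apply the changes; returns the number of rows written."""
    for key, value in differences.items():
        current[key] = value
    return len(differences)

table = {"a": 1, "b": 2}           # stands in for the database table
feed = {"a": 1, "b": 5, "c": 7}    # stands in for the input file

first_run = apply_differences(table, find_differences(table, feed))
rerun = apply_differences(table, find_differences(table, feed))
```

Here first_run is 2 (the rows for "b" and "c") and rerun is 0, because the second pass finds nothing left to apply.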

The transaction, or not-yet-checkpointed, data in an RDBMS can be handled in two ways. As data is modified, new pages are created but old pages are preserved unchanged. At the commit, the database installs the new pages and discards the old pages. While a session is updating pages, it may be the only session which sees the version on the new pages, or it may share that view (nutty idea). A query or session transaction running long on the old pages may be given temporary 'ownership' of them so they are not discarded until a commit or rollback there. Rollback is implicit on exit or disconnect. A process may get killed for owning too many uncommitted pages while doing a query when space runs low or limits are hit, because some other process is committing pages and pushing ownership of the old pages onto it.

Obviously, two processes cannot update the same value in the database at the same time, so the second will wait on a lock. If process A locks x and then blocks wanting y while process B locks y and then blocks wanting x, one process will be killed to remove the deadlock. It helps if everyone locks resources in alphabetical order or the like, so there is no deadlock. For bottlenecks, a process or rdbms schema redesign might be necessary.
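
The lock-ordering rule can be illustrated outside a database too. This is only a sketch, assuming Python threads and in-memory locks in place of sessions and row locks; move, worker and run_demo are hypothetical names.

```python
import threading

locks = {"x": threading.Lock(), "y": threading.Lock()}
values = {"x": 100, "y": 100}

def move(src, dst, amount):
    """Move amount from src to dst, always taking locks in sorted name order."""
    if src == dst:
        return  # nothing to do, and avoids taking the same lock twice
    first, second = sorted((src, dst))
    with locks[first], locks[second]:
        values[src] -= amount
        values[dst] += amount

def worker(src, dst, times):
    for _ in range(times):
        move(src, dst, 1)

def run_demo(times=1000):
    """Two threads move in opposite directions; the ordering prevents deadlock."""
    a = threading.Thread(target=worker, args=("x", "y", times))
    b = threading.Thread(target=worker, args=("y", "x", times))
    a.start()
    b.start()
    a.join()
    b.join()
    return dict(values)
```

If each thread locked its own source first instead, one could hold x while waiting for y and the other hold y while waiting for x, which is exactly the deadlock described above.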

Locks can be more common if the rdbms does not handle multi-generational data as I described, but only supports one new and one current page set. The first user to modify the page locks out all others. A user querying the current page locks out any process desiring changes until it finishes, so the current page stays put. Some databases lock at page granularity, others at row granularity, but too many row or page locks are promoted to a full table lock. Some databases lock out just changes to the table, others lock all access to the table.
# 5  
Old 08-09-2011
Actually I want the steps to carry out process checkpointing in Ubuntu, and not anything related to RDBMS checkpointing. Process checkpointing is something related to the core dump file and ELF header, with parent and child processes. So can you please explain it in relation to core dump files and headers?
# 6  
Old 08-09-2011
If a program is really bad, or unlucky, it raises a signal to stop processing, and some signals also generate core files. Then it is too late for a checkpoint. Checkpoints are put in other files. A child might signal its parent as it passes through each checkpoint. A parent might signal all children demanding a checkpoint, or one process might signal the whole process group.
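
A rough sketch of the "signal demanding a checkpoint" idea, assuming Python on a Unix system and SIGUSR1 as the agreed signal (the post does not name one). The handler only sets a flag, and the process writes its restart point to a separate file at the next safe spot. All names, and the stop_after parameter used to simulate dying, are hypothetical.

```python
import os
import signal

checkpoint_requested = False

def request_checkpoint(signum, frame):
    """Signal handler: only set a flag; the real work happens at a safe point."""
    global checkpoint_requested
    checkpoint_requested = True

signal.signal(signal.SIGUSR1, request_checkpoint)

def save_checkpoint(path, next_unit):
    """Write the restart point to a separate file and force it to disk."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        f.write(str(next_unit))
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, path)  # atomic rename, so readers never see a partial file

def load_checkpoint(path):
    """Return the saved restart point, or 0 if no checkpoint exists yet."""
    try:
        with open(path) as f:
            return int(f.read())
    except FileNotFoundError:
        return 0

def run(path, total, stop_after=None):
    """Process units [checkpoint, total); returns the units handled this run."""
    global checkpoint_requested
    handled = []
    for unit in range(load_checkpoint(path), total):
        handled.append(unit)  # placeholder for the real work
        if checkpoint_requested:
            checkpoint_requested = False
            save_checkpoint(path, unit + 1)
        if unit == stop_after:
            break  # hypothetical: simulate the process dying here
    return handled
```

A parent would trigger it with os.kill(child_pid, signal.SIGUSR1), or with os.killpg() for a whole process group. Any units handled after the last checkpoint are repeated on restart, so they need to be safe to redo.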

There are papers and projects out there, but no obvious winner:

unix OR linux OR gnu OR sourceforge process checkpoint - Google Search

Esp: https://lists.linux-foundation.org/p...on-CR-0001.pdf

Last edited by DGPickett; 08-09-2011 at 04:21 PM..
 