Trace / Debug Howto?
Post 302907040 by bbq on Wednesday 25th of June 2014 01:21:59 PM
Tricky

I was more interested in finding some online tutorials on how to trace into a hanging job effectively...

But since you ask:

The script is running jobs over ssh on remote servers.

A file-based messaging service is used for communication between the remote servers and the master.

The script seems to hang at a point where it is waiting for a msg containing the word COMPLETED.

This might be as easy as checking for a corrupt msg file the next time it hangs. But I am really hoping for some more generic tips-and-tricks type answers.
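
One generic trick that fits a case like this is bash's xtrace with a more informative prompt, written to its own log so it can be tailed while the job hangs. This is only a sketch under assumptions: `trace_on`, `trace_off` and the default log path are made-up names that are not part of the script, and it needs bash 4.1 or later for `BASH_XTRACEFD` and the `{TRACE_FD}` redirection.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (trace_on / trace_off); not part of the original script.

# Prefix every traced command with source file, line number and function name.
PS4='+ ${BASH_SOURCE##*/}:${LINENO}:${FUNCNAME[0]:-main}: '

# Start tracing into a dedicated log file instead of stderr.
# $1 = log file (the default path is an assumption for illustration).
trace_on()
{
    local LOG="${1:-/tmp/trace.$$.log}"
    exec {TRACE_FD}>>"${LOG}"     # let bash pick a free file descriptor
    BASH_XTRACEFD=${TRACE_FD}     # xtrace output now goes to that descriptor
    set -x
}

# Stop tracing; unsetting BASH_XTRACEFD also closes the descriptor.
trace_off()
{
    set +x
    unset BASH_XTRACEFD
}
```

Calling `trace_on /tmp/master.trace` near the top of the master script and running `tail -f` on that file should show the last command started before the hang, since xtrace prints each command before running it.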

Code:
m_process_msg_queue()
{
    local TMPFILE1="$(m_get_tmp_file ${FUNCNAME})" LINE CTL_FILE STATUS PIPE PID JOB_FILE TO=${C_MSG_PIPE_TO}
    [[ -f ${TMPFILE1} ]] || m_fail 1 "Error: Failed to create tmp file (${FUNCNAME})" 

    m_check_file -frw "${M_MSG_QUEUE}" s || m_fail 1 "Error: Msg queue not found (${FUNCNAME})" 

    while read -r LINE
    do
        #======================================
        # Split the line and get the control file
        #   and the status field
        #======================================
        CTL_FILE="$(printf '%s\n' "${LINE}" | cut -d"|" -f1)"
        STATUS="$(printf '%s\n' "${LINE}" | cut -d"|" -f2 | awk '{print $NF}' FS="Status=" )"
        
        [[ (-n ${STATUS}) && (-n ${CTL_FILE}) ]] || 
            m_fail 1 "Error: Failed to parse msg ctl file (${FUNCNAME})" 
        PIPE=${CTL_FILE##*/}

        JOB_FILE="$(sed -n '/^JobFile:/p' "${CTL_FILE}" | cut -d":" -f 2)"
        [[ -n ${JOB_FILE} ]] || m_fail 1 "Error: Failed to retrieve job ctl file (${FUNCNAME})" 
        m_check_file -frw "${JOB_FILE}" s || m_fail 1 "Error: job ctl validation failure (${FUNCNAME})" 

        PID="$(sed -n '/^Pid:/p' "${CTL_FILE}" | cut -d":" -f2)"
        [[ ${PID} =~ ^[[:digit:]]+$ ]] || m_fail 1 "Error: PID validation (${FUNCNAME})" 
        m_write_job_field ${C_JOB_PID} "${PID}" "${JOB_FILE}"

        case ${STATUS} in
            "COMPLETED")
                #======================================
                # Completed. Just close the pipe.
                #======================================
                m_close_pipe "${FUNCNAME}" "${PIPE}" "${TO}" "${PID}"
                ;;
            "FAILED")
                #======================================
                # Log the error in the run log
                #======================================
                m_log_msg "Non FATAL error in (${CTL_FILE})"
                m_close_pipe "${FUNCNAME}" "${PIPE}" "${TO}" "${PID}"
                ;;
                
            "FATAL")
                #======================================
                # Remote job flags a fatal error
                # Don't launch any more jobs
                # Wait for all other jobs to complete
                # Only then throw fatal error in master
                #======================================
                m_log_msg "FATAL error in (${CTL_FILE})"
                M_HALT_ON_ERROR="true"
                m_close_pipe "${FUNCNAME}" "${PIPE}" "${TO}" "${PID}"
                ;;
            "MANUAL")
                #======================================
                # Manual intervention requested
                # Inform the user
                #======================================
                m_log_msg "Manual request in (${CTL_FILE})"
                M_HALT_ON_ERROR="true"
                m_close_pipe "${FUNCNAME}" "${PIPE}" "${TO}" "${PID}"
                ;;
            *)
                m_log_msg "Unrecognised request (${STATUS}) in (${CTL_FILE})"
                M_HALT_ON_ERROR="true"
                m_close_pipe "${FUNCNAME}" "${PIPE}" "${TO}" "${PID}"
                ;;
        esac

        m_write_job_field ${C_JOB_FINISH} "$(date)" "${JOB_FILE}"

    done < "${M_MSG_QUEUE}"

}
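
For the "corrupt msg file" theory, the parsing above can be run on its own against a queue file without touching pipes or job files. The sketch below is a guess at how that could look: `check_msg_line` is a made-up name, it calls none of the m_* helpers, and the line format (`<ctl file>|...Status=<STATUS>`) is assumed from the `cut`/`awk` calls in the function.

```shell
#!/usr/bin/env bash
# Hypothetical stand-alone check; mirrors the parsing in m_process_msg_queue.
# Assumed line format: <ctl file>|<text>Status=<STATUS>

check_msg_line()
{
    local LINE="$1" CTL_FILE STATUS

    # Same split as the original: field 1 is the control file,
    # the text after "Status=" in field 2 is the status.
    CTL_FILE="$(printf '%s\n' "${LINE}" | cut -d"|" -f1)"
    STATUS="$(printf '%s\n' "${LINE}" | cut -d"|" -f2 | awk '{print $NF}' FS="Status=")"

    if [[ -z ${CTL_FILE} || -z ${STATUS} ]]; then
        echo "UNPARSABLE: ${LINE}"
        return 1
    fi

    # Only the four statuses handled explicitly in the case statement are accepted.
    case ${STATUS} in
        COMPLETED|FAILED|FATAL|MANUAL)
            echo "OK: ${CTL_FILE} -> ${STATUS}"
            ;;
        *)
            echo "UNKNOWN STATUS (${STATUS}): ${LINE}"
            return 1
            ;;
    esac
}
```

Running it over the queue the next time the job hangs, for example `while read -r LINE; do check_msg_line "${LINE}"; done < "${M_MSG_QUEUE}"`, would list any line that never yields COMPLETED.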

 
