

War Stories Tell your work related tech stories and share experiences here. Share your successes and failures and other "war stories" in this forum.

Interesting script issue clubbed with crontab.



Tags
cron, issue, migration, script, unix

    #1  
10-21-2017   -   Original Discussion by RavinderSingh13

Hello All,

Finally I am posting an issue, and its solution, which I faced last week. Let me explain it under headings.

Issue's background: It was a nice Tuesday for me. I went to the office as usual and started checking emails and the work assigned to me. Suddenly a gentleman reached out to me at my desk (in a horrified and terrified condition). He is from another team and had heard my name from one of our common friends. As soon as he reached my seat, after a warm, quick and simple handshake, he immediately started telling me that he was facing a PROD issue. OK, since I heard the word "PRODUCTION", my ears started working more actively.

Now he was explaining things to me like a new OP in the UNIX & LINUX forums, e.g. "the script was working before a server migration and is not working after migration to a new server" (that was the only main sentence I got from him).

I took him to a free meeting room and requested him to explain things there (he explained the same thing again, but this time he showed me the script and its output from the previous server). I took a deep breath and started understanding their workflow (which gives a person a very high-level view of what is happening with their scripts). After understanding their workflow I started looking at their old server, but it seems the old server was already under a decommission request, so a few things were already gone from there (some access, mount points etc.). I will let you know the crispy part of it later.

At this point I knew what their output should look like, so I started my journey of fixing it.

Journey of fixing script: First of all I asked them if all is well in the QA environment, and the answer was "YES". Then I verified the QA server myself, and yes, everything was working. Initially I decided not to compare scripts, as there are many of them. Since that person had mentioned a server migration, the first thing that came to my mind was: do they still have the same filesystem names and user names that they were using on the previous server? After checking the servers, it seemed they had changed.

Now I used a few find commands, e.g. find -type f -exec grep -l "old_user" {} \+ and find -type f -exec grep -l "old_path" {} \+. Guess what, the results were shocking: these guys had not changed the paths in the new place. Since we were already in an issue, their people allowed me to make changes (with a backup of everything, of course). So I changed the values successfully in all of the places; since there were many scripts (related to each other), I made this change everywhere.
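The search-and-replace step above could be sketched roughly like this. It is only an illustration: /old/mount, /new/mount, the .ksh suffix and the two demo files are made-up names, since the post does not give the real ones. Every file gets a .bak copy before it is touched.

```shell
#!/bin/sh
# Sketch: find every script that still references the old server's path,
# keep a backup copy, then rewrite the reference.
# /old/mount, /new/mount and the demo files are hypothetical.

workdir=$(mktemp -d)
printf 'cd /old/mount/app\n' > "$workdir/load.ksh"
printf 'echo "nothing to change"\n' > "$workdir/clean.ksh"

old_path=/old/mount
new_path=/new/mount

# -F treats the pattern as a fixed string; -l prints only the names of matching files.
find "$workdir" -type f -name '*.ksh' -exec grep -lF "$old_path" {} + |
while IFS= read -r file; do
    cp -p "$file" "$file.bak"                             # backup first
    sed "s|$old_path|$new_path|g" "$file.bak" > "$file"   # then rewrite from the backup
done

changed=$(cat "$workdir/load.ksh")
echo "$changed"
```

Writing the new content from the .bak copy back into the original file keeps the original file's permissions, which matters for scripts that cron has to execute.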

The time had come to run the script manually, and guess what, the fix passed with flying colors. I was on "seventh heaven". Since the script does many different tasks at DB and content mgmt. level, it took almost 2 hours to complete. Once they verified that all is well, I requested them to schedule it (whichever way they want, by Jenkins or by cron etc.). They told me they had checked the old prod server and the crontab entries were gone (maybe an effect of their decomm request, the crispy part mentioned above). So I asked them to check in the QA environment (at this point I had really lost faith that their QA things are in sync with PROD, though I still thought to give it a shot). Yes, the entries were there, so I got to know that they want it to run every day at a specific time. Then I put in a simple cron entry, e.g. 12 12 * * * /actual/path/of/script.ksh.

I asked them to check (I intentionally set up the cron job to run after 30 mins; I thought I could go for lunch and after that see if all is well).

After lunch I got the news that the script didn't kick off. I was surprised, as the crontab entry was perfect, and while checking the logs I found that cron had kicked it off but there were NO other logs. Then I added set -x at the start of the script and scheduled it again, because their script was NOT logging anything at all and, in particular, had no error handling. When cron picked it up the next time, again NO errors showed up.
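One way to make a silent cron job leave evidence behind is to send both output streams to a log before tracing starts. The following is a minimal sketch under assumptions, not what was done in this case: the log location is a temp file here, and the failing ls only stands in for a real step.

```shell
#!/bin/sh
# Sketch: make a cron-driven script leave evidence behind.
# The log path is a stand-in; a real job would use a fixed, absolute location.
# The redirection could also sit on the crontab line itself, for example
# (hypothetical log file):
#   12 12 * * * /actual/path/of/script.ksh >> /var/tmp/script.log 2>&1
log=$(mktemp)

(
    exec >>"$log" 2>&1            # from here on, stdout and stderr go to the log
    echo "job started in $(pwd) with PATH=$PATH"
    set -x                        # trace each command into the log as well
    ls /no/such/dir || echo "step failed with status $?"
)

cat "$log"
```

Logging the working directory and PATH on the first line is cheap, and it is exactly the information that turned out to matter later in this story.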

I was sure something fishy was going on again, possibly related to references to the OLD server etc. So I started comparing the QA and PROD scripts, and I was in huge shock when I saw they were like 70 to 80% different (though their logic seems to be the same, and trust me, their QA script was much better than PROD). By this point I knew that I was on my own.

I started reading the very first script, which was calling 5 to 6 more scripts (meanwhile I asked that person to take back the decomm request for that server so that we could get some more information from it). While checking the scripts I saw that there were many relative paths (most of the time NO absolute paths were used). I suspected this could be the culprit, so I put in multiple pwd commands, especially wherever their custom jars (Java code archives) were getting called.

Believe it or not, I was shocked for almost 2 mins when I saw the results: their pwd value was NOT changing at all. They claimed that on the OLD server these relative paths worked; I believe that there, when control came back from the jar's run, the working directory was somehow getting set, but on this server that was not happening under cron. That is how I came to know that cron never loads the full environment from the user's DOT profile. I was HAPPY that I had found it. As soon as I found it, I changed all the relative paths from ../../bla/bla/bla to actual/path/bla/bla. It took some time because there were many paths. I also proposed that they add some logging to the script and, most important, create a variable file, so that in future they need not change any script(s) for paths etc. (which they may be working on now, I guess).
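The effect is easy to reproduce outside cron. The sketch below is an illustration with made-up file names: it runs the same tiny script twice from an unrelated directory, once with a relative path and once after anchoring on the script's own location. Anchoring is a different technique from the hard-coded absolute paths used in the fix above; it is shown only because it is a common alternative.

```shell
#!/bin/sh
# Demo layout (all names are made up): <base>/app/bin holds scripts,
# <base>/app/data holds the input file.
base=$(mktemp -d)
mkdir -p "$base/app/bin" "$base/app/data"
echo "payload" > "$base/app/data/input.txt"

# Variant 1: relies on the caller's working directory.
cat > "$base/app/bin/relative.sh" <<'EOF'
cat ../data/input.txt
EOF

# Variant 2: works out its own directory first, then builds paths from it.
cat > "$base/app/bin/anchored.sh" <<'EOF'
script_dir=$(cd "$(dirname "$0")" && pwd) || exit 1
cat "$script_dir/../data/input.txt"
EOF

# Start both from an unrelated directory, much as cron would.
if rel_out=$(cd / && sh "$base/app/bin/relative.sh" 2>/dev/null); then
    rel_result=worked
else
    rel_result=failed
fi
abs_out=$(cd / && sh "$base/app/bin/anchored.sh")

echo "relative path: $rel_result"
echo "anchored path: $abs_out"
```

A variable file, as proposed above, would go one step further: the scripts would read one shared file that defines the base directories, so a future migration touches a single place.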

The good time had come to run the script again from crontab. I set it up to run after 2 mins; the script ran successfully and things were going well.

Learnings: There were a lot of learning points out of this episode:
  • The BEST one for me: our QA and PROD environments should always be in sync.
  • NEVER EVER decomm. a server without confirming that things are going well on the new one.
  • Be very careful with relative paths, as in cron they can be tricky.
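A cheap way to catch this class of problem before scheduling a job is to run it once in something close to cron's environment. The helper below is a rough sketch: cron_like and APP_HOME are hypothetical names, and the variables it sets only approximate what a cron daemon provides.

```shell
#!/bin/sh
# cron_like: run a command string with a nearly empty environment,
# starting in $HOME (or / if that is not usable).
cron_like() {
    (
        cd "${HOME:-/}" 2>/dev/null || cd /
        env -i HOME="${HOME:-/}" PATH=/usr/bin:/bin SHELL=/bin/sh /bin/sh -c "$1"
    )
}

# Something a login profile might export (hypothetical name and value).
APP_HOME=/opt/app
export APP_HOME

interactive_view=$(sh -c 'echo "${APP_HOME:-unset}"')
cron_view=$(cron_like 'echo "${APP_HOME:-unset}"')

echo "interactive shell sees APP_HOME=$interactive_view"
echo "cron-like shell sees APP_HOME=$cron_view"
```

If a script passes when started as cron_like '/actual/path/of/script.ksh', it is far less likely to depend on a profile or on the directory it was launched from.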

Thought to share this with you folks. Would like to know your views/comments (if any). Keep learning and keep sharing knowledge.


Thanks,
R. Singh
    #2  
10-22-2017   -   Reply by Peasant
I stopped using relative paths with find a long time ago (unless doing interactive work).

This approach is much safer and easier to maintain:

Code:
cd $ABSOLUTE_PATH_DIR && find . <further options and operands>   || exit 1
cd -

This way, if the directory does not exist or permission is denied, an error is printed on stderr.
You will see this error in local mail when running from crontab (if stderr has not been redirected in the crontab line).

Do not use trailing slashes with directory variables.
Example of things going haywire when doing that:

Code:
set -x
#DIR=/home/user # someone decides to comment out this line, or makes a mistake
# more lines of code, hundreds of them
cd $DIR/ && find . -type f .. || exit 1 # with DIR unset this expands into: cd / && find . -type f .. # enough said
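As a possible extra safety net on top of dropping the trailing slash, the standard ${VAR:?message} expansion refuses to continue when the variable is unset or empty. A small sketch, reusing the DIR name from the example above:

```shell
#!/bin/sh
# DIR is left unset on purpose, to mimic the commented-out assignment above.
unset DIR

# Unguarded: the expansion quietly collapses to the root directory.
unguarded="$DIR/"
echo "unguarded target: $unguarded"

# Guarded: ${DIR:?...} aborts the (sub)shell when DIR is unset or empty,
# so neither cd nor find ever runs.
guarded=ran
( cd "${DIR:?DIR is not set}" && find . -type f ) 2>/dev/null || guarded=refused
echo "guarded command: $guarded"
```

In a real script the error message would normally be left on stderr rather than discarded, so that it ends up in cron's mail.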

Decommission -> create a virtual machine out of it, if you can (and it is not some obscure OS/hardware).

Good things to do before decommission are (if you cannot virtualize it in lab):
  1. Copy all crontab and at entries and the related scripts used in those.
  2. Issue a mount, output into a file and copy it somewhere safe.
  3. Issue a share, output into a file (if NFS or other network file systems are exported on the box).
  4. FC and LAN topology should be written down (WWNs, LAN port configuration etc.)
  5. /etc/passwd, /etc/shadow, /etc/group, /etc/hosts files (perhaps more, depending on the box) should be copied.

All of the above is done in a couple of minutes or less, and is golden for post mortem analysis.
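Part of that checklist could be scripted along these lines. This is a sketch only: the snapshot directory is a temp dir here (in real use it would live somewhere off the box), capture is a hypothetical helper, and the share and FC/LAN steps are left out because the right commands depend on the OS.

```shell
#!/bin/sh
# Sketch of a pre-decommission snapshot.
snap=$(mktemp -d)
mkdir -p "$snap/etc"

# capture NAME CMD...: store the command's output, and say so if the command fails.
capture() {
    name=$1
    shift
    "$@" > "$snap/$name.txt" 2>&1 ||
        echo "warning: '$*' failed, see $snap/$name.txt" >&2
}

capture crontab crontab -l      # current user's crontab; repeat per user as needed
capture atq     atq             # pending at jobs
capture mount   mount           # mounted filesystems

# Account and host files. /etc/shadow is normally readable by root only.
for f in /etc/passwd /etc/shadow /etc/group /etc/hosts; do
    cp -p "$f" "$snap/etc/" 2>/dev/null || echo "warning: could not copy $f" >&2
done

ls -R "$snap"
```

The scripts referenced by the crontab and at entries still need to be copied by hand, since only a person who knows the box can say which ones matter.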


Hope that helps
Regards
Peasant.

    #3  
11-03-2017   -   Reply by bakunin
Quote:
Originally Posted by RavinderSingh13
find -type f -exec grep -l "old_user" {} \+
A trick i have learned here (and i am ashamed i can't remember from who) is to always add a second file to grep when using it this way.

grep, when called with a single file, will not show the file name where it found something:


Code:
find /some/where -type f -exec grep "bla foo" {} \;
whatever bla foo something
another hit bla foo
....

But write it like this:

Code:
find /some/where -type f -exec grep "bla foo" /dev/null {} \;

and grep will put the name of the file at the beginning of each matching line.
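The difference can be checked on two throwaway files (the file names and contents below are made up). Note that the command quoted from the first post used grep -l, which prints file names anyway; the trick matters when you want the matching lines as well.

```shell
#!/bin/sh
# Two demo files, only one of which contains the search string.
demo=$(mktemp -d)
echo "whatever bla foo something" > "$demo/a.txt"
echo "no match here" > "$demo/b.txt"

# One file operand per grep call: only the matching line is printed.
plain=$(find "$demo" -type f -exec grep "bla foo" {} \;)

# Two file operands (/dev/null plus the real file): grep prefixes the file name.
named=$(find "$demo" -type f -exec grep "bla foo" /dev/null {} \;)

echo "without /dev/null: $plain"
echo "with /dev/null:    $named"
```

GNU and BSD grep also accept -H to force the file name, but the /dev/null form works with any grep.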

bakunin