
Issues in reading file using 'awk'

    #1  
Old 03-16-2013
emily emily is offline
Registered User
 
Join Date: Sep 2012
Last Activity: 25 October 2014, 5:04 PM EDT
Location: Switzerland
Posts: 130
Thanks: 38
Thanked 1 Time in 1 Post

Dear all,
I am using the following function in a script to assign the variable "jobNo" a value taken from the file $SAMPLE"_status.log" [1] (generated by the crab command on the first line, shown in red in the original post).

Code:
  
   crab ntuplize_crab -status -c $SAMPLE >& $SAMPLE"_status.log" &  
   echo $SAMPLE"_status.log" "====="  
   jobNo=$(awk '/Jobs with Wrapper/ && $NF != 0{s=1}   /List of jobs/ && s{if(p){p=p","$NF}else{p=$NF};s=""}END{print p}' $SAMPLE"_status.log" )
    #sleep 200                                                                                                                                        

    echo $jobNo "====="
    echo $jobNo "====="

The file name is printed correctly on the screen, and I checked that the file's content, shown in [1], is fine.
However, running the script gives me the following output:

Code:
 
qcd120_status.log =====
=====
=====

The first line (blue in the original post) is the file name and is fine. But when I try to print jobNo, only "=====" is printed.
When I run the same command in the terminal, it gives me the jobNo I want, which should be the following:

Code:
 57,331,333,336,348,2-3,11,28,45,49,67-68,80,82,87,102-104,107-108,111-112,114,117-118,123-125,127-132,134,139,148-157,159-161,169-172,174,179-180,182-185,200,202,204,208-210,219,226,236,238,245,251,253-257,262,265,271,280,288,308,330,353,355,375,377,381,385,387

Please help, I am completely stuck.

[1]

Code:
 crab:  ExitCodes Summary
 >>>>>>>>> 309 Jobs with Wrapper Exit Code : 0
         List of jobs: 1,4-10,12-27,29-44,46-48,50-56,58-66,69-79,81,83-86,88-101,105-106,109-110,113,115-116,119-122,126,133,135-138,140-147,158,162\
-168,173,175-178,181,186-199,201,203,205-207,211-218,220-225,227-235,237,239-244,246-250,252,258-261,263-264,266-270,272-279,281-287,289-307,309-329,\
332,334-335,337-347,349-352,354,356-374,376,378-380,382-384,386,388-401
        See https://twiki.cern.ch/twiki/bin/view/CMS/JobExitCodes for Exit Code meaning

crab:  ExitCodes Summary
 >>>>>>>>> 4 Jobs with Wrapper Exit Code : 8028
         List of jobs: 57,331,333,336
        See https://twiki.cern.ch/twiki/bin/view/CMS/JobExitCodes for Exit Code meaning

crab:  ExitCodes Summary
 >>>>>>>>> 1 Jobs with Wrapper Exit Code : 8021
         List of jobs: 348
        See https://twiki.cern.ch/twiki/bin/view/CMS/JobExitCodes for Exit Code meaning

crab:  ExitCodes Summary
 >>>>>>>>> 87 Jobs with Wrapper Exit Code : 60307
         List of jobs: 2-3,11,28,45,49,67-68,80,82,87,102-104,107-108,111-112,114,117-118,123-125,127-132,134,139,148-157,159-161,169-172,174,179-180\
,182-185,200,202,204,208-210,219,226,236,238,245,251,253-257,262,265,271,280,288,308,330,353,355,375,377,381,385,387
        See https://twiki.cern.ch/twiki/bin/view/CMS/JobExitCodes for Exit Code meaning

crab:   401 Total Jobs

    #2  
Old 03-16-2013
Scrutinizer Scrutinizer is online now Forum Staff  
Moderator
 
Join Date: Nov 2008
Last Activity: 26 October 2014, 5:26 AM EDT
Location: Amsterdam
Posts: 9,552
Thanks: 286
Thanked 2,428 Times in 2,175 Posts
Hi,
  • Is qcd120_status.log the actual name of the log that you posted under [1]?
  • Does the log file contain \ at the end of some of the lines or did you put those there yourself?
    #3  
Old 03-16-2013
emily emily is offline
Hi,
Thanks for the reply.
Yes, it is the actual name of the log that I posted at [1].
Umm, I guess the backslash is the symbol marking the start of the next line, because the content was too long to fit on a single line.

Besides, running the command in the terminal gives the proper jobNo from the same qcd120_status.log file.

emily

    #4  
Old 03-16-2013
Don Cragun Don Cragun is offline Forum Staff  
Moderator
 
Join Date: Jul 2012
Last Activity: 26 October 2014, 4:40 AM EDT
Location: San Jose, CA, USA
Posts: 4,904
Thanks: 182
Thanked 1,646 Times in 1,397 Posts
I am guessing that we have a small language barrier in this discussion.

The output that you said you were getting when you run the command manually has a leading space that the awk script you showed us would not produce.
The output that you said you were getting when you run the command manually also contains text from the continuation line in your log file that your awk script does not handle.

When I run the awk script you provided with the input data you provided, the output produced is:

Code:
57,331,333,336,348,2-3,11,28,45,49,67-68,80,82,87,102-104,107-108,111-112,114,117-118,123-125,127-132,134,139,148-157,159-161,169-172,174,179-180\

not:

Code:
 57,331,333,336,348,2-3,11,28,45,49,67-68,80,82,87,102-104,107-108,111-112,114,117-118,123-125,127-132,134,139,148-157,159-161,169-172,174,179-180,182-185,200,202,204,208-210,219,226,236,238,245,251,253-257,262,265,271,280,288,308,330,353,355,375,377,381,385,387

But, we are using awk after the command that produced the input file is long gone. Since your script is running crab to produce the file being read by awk asynchronously in the background while awk is running in the foreground, there is a good chance that awk will hit end of file before crab writes anything into the file. If this happens, obviously jobNo will be set to an empty string.

If you get rid of the ampersand ( & ) at the end of the crab command line and if crab does not split long "List of jobs" lines with backslashes ( \ ) and follow them with the continuation lines that you showed in your 1st message in this thread, everything should work as you expect it to work.
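
If the log really does contain backslash continuation lines, one way to cope is to join them before matching. The following is only a minimal sketch, assuming a continuation is marked by a trailing backslash exactly as in the excerpt in the first post. The short sample log is made up for illustration, and the `$NF + 0` comparison is my own addition to force a numeric test after `$0` is rebuilt:

```shell
# Made-up sample log with one continued "List of jobs" line.
log=$(mktemp)
cat > "$log" <<'EOF'
 >>>>>>>>> 2 Jobs with Wrapper Exit Code : 0
         List of jobs: 1,4-10
 >>>>>>>>> 3 Jobs with Wrapper Exit Code : 60307
         List of jobs: 2-3,11\
,28
EOF

jobNo=$(awk '
    # A trailing backslash means the record continues on the next line:
    # strip it, remember the text so far, and read on.
    /\\$/ { sub(/\\$/, ""); buf = buf $0; next }
    # Otherwise prepend anything buffered and process the complete record.
    { $0 = buf $0; buf = "" }
    /Jobs with Wrapper/ && $NF + 0 != 0 { s = 1 }
    /List of jobs/ && s { p = (p ? p "," : "") $NF; s = "" }
    END { print p }
' "$log")

echo "$jobNo"
rm -f "$log"
```

If crab never writes the backslashes and they only come from copying wrapped terminal output, the original awk script is sufficient and this extra step is not needed.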
    #5  
Old 03-16-2013
emily emily is offline
Hi Dan,
Thanks for looking into it, but I am confused too. Let me rephrase my problem.

When I run this command manually here is the response:

Code:
 

[emily04@cmslpc38 pythia]$ jobNo=$(awk '/Jobs with Wrapper/ && $NF != 0{s=1}   /List of jobs/ && s{if(p){p=p","$NF}else{p=$NF};s=""}END{print p}' qcd120_status.log )
[emily04@cmslpc38 pythia]$ echo $jobNo
57,331,333,336,348,2-3,11,28,45,49,67-68,80,82,87,102-104,107-108,111-112,114,117-118,123-125,127-132,134,139,148-157,159-161,169-172,174,179-180,182-185,200,202,204,208-210,219,226,236,238,245,251,253-257,262,265,271,280,288,308,330,353,355,375,377,381,385,387
[pooja04@cmslpc38 pythia]$

This is what I want from the script too.

But the script gives me nothing for the jobNo variable. Instead, its output is:

Code:
 
---------Will Resubmit the Jobs--------------
qcd120_status.log =====
=====
=====


And again, the function is defined as follows in the script:

Code:
ResubmitJobs() {
 crab ntuplize_crab -status -c $SAMPLE >& $SAMPLE"_status.log" &
  echo "---------Will Resubmit the Jobs--------------"
                                       
    echo $SAMPLE"_status.log" "====="
    
    jobNo=$(awk '/Jobs with Wrapper/ && $NF != 0{s=1}   /List of jobs/ && s{if(p){p=p","$NF}else{p=$NF};s=""}END{print p}' $SAMPLE"_status.log" )
    #sleep 200                                                                                                                                        

    echo $jobNo "====="
    echo $jobNo "====="

I hope it is easier to understand now.

greetings,
emily
    #6  
Old 03-16-2013
Don Cragun Don Cragun is offline Forum Staff  
Hi Emily.

I will assume you meant "Don" rather than "Dan".

Yes, I understand the output you want.
Quote:
Originally Posted by emily View Post
Code:
ResubmitJobs() {
 crab ntuplize_crab -status -c $SAMPLE >& $SAMPLE"_status.log" &   <--- Remove this ampersand!
  echo "---------Will Resubmit the Jobs--------------"
                                       
    echo $SAMPLE"_status.log" "====="
    
    jobNo=$(awk '/Jobs with Wrapper/ && $NF != 0{s=1}   /List of jobs/ && s{if(p){p=p","$NF}else{p=$NF};s=""}END{print p}' $SAMPLE"_status.log" )
    #sleep 200                                                                                                                                        

    echo $jobNo "====="
    echo $jobNo "====="

Yes, I understand. And, as I said before, if you remove the ampersand marked above (magenta in the original post), you will get the output you want. Your problem is that awk is processing $SAMPLE"_status.log" before the crab command writes any data into it. You are running crab and awk concurrently instead of letting crab complete before letting awk read the data that crab will eventually produce.
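
The timing problem can be reproduced without crab. This is just a sketch: `slow_writer` is a hypothetical stand-in for the crab status command (it sleeps, then writes one line), and the awk pattern is simplified. It reads the log once immediately after starting the writer in the background and once after `wait`:

```shell
# slow_writer is a hypothetical stand-in for a slow command such as crab:
# it pauses, then writes a single line to standard output.
slow_writer() {
    sleep 2
    echo "List of jobs: 57,331"
}

log=$(mktemp)

# Started in the background: the shell moves on immediately,
# so awk reads the log while it is still empty.
slow_writer > "$log" 2>&1 &
early=$(awk '/List of jobs/ { print $NF }' "$log")

# wait blocks until the background job has finished,
# so this second read sees the complete file.
wait
late=$(awk '/List of jobs/ { print $NF }' "$log")

echo "read without waiting: '$early'"
echo "read after waiting:   '$late'"
rm -f "$log"
```

Dropping the ampersand, as suggested in this thread, is the simplest fix. Keeping it and calling `wait` before the awk line is an alternative only if the script has other work to do while the command runs.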
    #7  
Old 03-16-2013
emily emily is offline
Hi Don,
Yup, it is working. Thanks, Don.

May I ask another question?
I want the script to look for all directories within a particular directory and, for each one,
run the crab task and get the jobNo.
At present, these are the directories where I want to perform these operations:

Code:
drwxr-xr-x 6 emily04 us_cms   2048 Mar 15 11:42 qcd800
-rw-r--r-- 1 emily04 us_cms   9739 Mar 15 11:42 VgAnalyzerKitDemoMC52X_AOD.pyc
drwxr-xr-x 6 emily04 us_cms   2048 Mar 15 11:43 qcdEm40
drwxr-xr-x 6 emily04 us_cms   2048 Mar 15 11:44 GJet20To40
drwxr-xr-x 6 emily04 us_cms   2048 Mar 15 11:46 qcd1000
drwxr-xr-x 6 emily04 us_cms   2048 Mar 15 11:47 qcd120
drwxr-xr-x 6 emily04 us_cms   2048 Mar 15 11:48 GJet40ToInf
drwxr-xr-x 6 emily04 us_cms   2048 Mar 15 11:49 qcd1400
drwxr-xr-x 6 emily04 us_cms   2048 Mar 15 11:51 qcd50
drwxr-xr-x 6 emily04 us_cms   2048 Mar 15 11:52 qcd1800
drwxr-xr-x 6 emily04 us_cms   2048 Mar 15 11:53 qcd30
drwxr-xr-x 6 emily04 us_cms   2048 Mar 15 11:54 qcd80
drwxr-xr-x 6 emily04 us_cms   2048 Mar 15 11:55 qcdEm30To40

However, I am afraid the directory names do not always follow a particular pattern. For example, another set of directories
I have is the following:

Code:
 
drwxr-xr-x 6 emily4 us_cms  2048 Mar 15 10:32 tt
drwxr-xr-x 6 emily04 us_cms  2048 Mar 15 10:33 zgamma
drwxr-xr-x 6 emily04 us_cms  2048 Mar 15 10:34 DiPhoJet
drwxr-xr-x 6 emily04 us_cms  2048 Mar 15 10:35 DYJets50

Can I define some kind of array holding the directory names and loop over them?

Thanks in advance.
emily
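
When the names follow no pattern, one option is to skip the hand-written list and let the shell find every subdirectory. A minimal sketch, assuming the sample directories sit directly under one top-level directory; the temporary directory and the names in it are made up for the demonstration, and a real script would call its own per-sample function where the comment indicates:

```shell
# Build a throwaway layout: three sample directories plus one plain file.
top=$(mktemp -d)
mkdir "$top/qcd120" "$top/tt" "$top/zgamma"
touch "$top/VgAnalyzerKitDemoMC52X_AOD.pyc"

found=""
# The trailing slash in the glob makes it match directories only.
for dir in "$top"/*/; do
    [ -d "$dir" ] || continue        # glob matched nothing: skip
    name=$(basename "$dir")          # strip the path and trailing slash
    found="$found $name"
    # A real script would do its per-sample work here, e.g. SAMPLE=$name
done

echo "directories:$found"
rm -rf "$top"
```

The plain file is ignored because it does not match the `*/` pattern.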

---------- Post updated at 11:34 AM ---------- Previous update was at 10:01 AM ----------

Hi again,
I managed to run the commands over an array. Thanks, all,
for your kind help.

This is what I did:

Code:
GREP=("qcd30" "QCD50")   # one array holding both names (bash syntax)
for file in "${GREP[@]}"
do
      crab ntuplize_crab -getoutput -c "$file"   # pass the loop variable
done

While doing this, it occurred to me to ask whether the different GREP[] entries can be processed in parallel.
Is that doable?

greetings,
emily

Last edited by emily; 03-16-2013 at 11:11 AM..
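
The thread was closed without a reply to this last question. As a hedged sketch of one common approach: start each task in the background, give each its own log file, and call `wait` before reading any result, which is the same rule that fixed the original problem. `run_task` is a hypothetical placeholder for the real per-sample command, and the sample names are illustrative only:

```shell
# run_task is a hypothetical placeholder for the real per-sample command.
run_task() {
    sleep 1
    echo "done: $1"
}

out=$(mktemp -d)

# Positional parameters hold the list here; in bash an array works the same way.
set -- qcd30 qcd50 zgamma

# Start every task in the background, each writing to its own log.
for s in "$@"; do
    run_task "$s" > "$out/$s.log" 2>&1 &
done

# Do not read any log until all background tasks have finished.
wait

results=$(for s in "$@"; do cat "$out/$s.log"; done)
echo "$results"
rm -rf "$out"
```

Whether the real command can safely run several instances at once is a separate question that this sketch does not answer.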