Issues in reading file using 'awk'


 
# 1  
Old 03-16-2013

Dear all,
I am using the following function from a script to assign the variable "jobNo" a value from the file $SAMPLE"_status.log" [1] (generated by the crab command in the first line of the code):
Code:
  
   crab ntuplize_crab -status -c $SAMPLE >& $SAMPLE"_status.log" &  
   echo $SAMPLE"_status.log" "====="  
   jobNo=$(awk '/Jobs with Wrapper/ && $NF != 0{s=1}   /List of jobs/ && s{if(p){p=p","$NF}else{p=$NF};s=""}END{print p}' $SAMPLE"_status.log" )
    #sleep 200                                                                                                                                        

    echo $jobNo "====="
    echo $jobNo "====="

The name of the file is printed correctly on the screen, and I have checked that its content, shown in [1], is fine.
However, running the script gives me the following output:
Code:
 
qcd120_status.log =====
=====
=====

The first line is the file name and is fine. But when I try to print jobNo, it prints only the "=====".
When I run the same awk command in the terminal, it gives me the proper jobNo that I want, which should be the following:
Code:
 57,331,333,336,348,2-3,11,28,45,49,67-68,80,82,87,102-104,107-108,111-112,114,117-118,123-125,127-132,134,139,148-157,159-161,169-172,174,179-180,182-185,200,202,204,208-210,219,226,236,238,245,251,253-257,262,265,271,280,288,308,330,353,355,375,377,381,385,387

Please help, I am completely stuck.

[1]
Code:
 crab:  ExitCodes Summary
 >>>>>>>>> 309 Jobs with Wrapper Exit Code : 0
         List of jobs: 1,4-10,12-27,29-44,46-48,50-56,58-66,69-79,81,83-86,88-101,105-106,109-110,113,115-116,119-122,126,133,135-138,140-147,158,162\
-168,173,175-178,181,186-199,201,203,205-207,211-218,220-225,227-235,237,239-244,246-250,252,258-261,263-264,266-270,272-279,281-287,289-307,309-329,\
332,334-335,337-347,349-352,354,356-374,376,378-380,382-384,386,388-401
        See https://twiki.cern.ch/twiki/bin/view/CMS/JobExitCodes for Exit Code meaning

crab:  ExitCodes Summary
 >>>>>>>>> 4 Jobs with Wrapper Exit Code : 8028
         List of jobs: 57,331,333,336
        See https://twiki.cern.ch/twiki/bin/view/CMS/JobExitCodes for Exit Code meaning

crab:  ExitCodes Summary
 >>>>>>>>> 1 Jobs with Wrapper Exit Code : 8021
         List of jobs: 348
        See https://twiki.cern.ch/twiki/bin/view/CMS/JobExitCodes for Exit Code meaning

crab:  ExitCodes Summary
 >>>>>>>>> 87 Jobs with Wrapper Exit Code : 60307
         List of jobs: 2-3,11,28,45,49,67-68,80,82,87,102-104,107-108,111-112,114,117-118,123-125,127-132,134,139,148-157,159-161,169-172,174,179-180\
,182-185,200,202,204,208-210,219,226,236,238,245,251,253-257,262,265,271,280,288,308,330,353,355,375,377,381,385,387
        See https://twiki.cern.ch/twiki/bin/view/CMS/JobExitCodes for Exit Code meaning

crab:   401 Total Jobs

# 2  
Old 03-16-2013
Hi,
  • Is qcd120_status.log the actual name of the log that you posted under [1]?
  • Does the log file contain \ at the end of some of the lines or did you put those there yourself?
# 3  
Old 03-16-2013
Hi,
Thanks for the reply.
Yes, it is the actual name of the log that I posted at [1].
Hmm, I guess it is the symbol marking the start of the next line, because the content did not fit on a single line.

Besides, running the command in the terminal gives the proper 'jobNo' for the same qcd120_status.log file.

emily

# 4  
Old 03-16-2013
Quote:
Originally Posted by emily
Hi,
Thanks for the reply.
Yes, it is the actual name of the log that I posted at [1].
Hmm, I guess it is the symbol marking the start of the next line, because the content did not fit on a single line.

Besides, running the command in the terminal gives the proper 'jobNo' for the same qcd120_status.log file.

emily
I am guessing that we have a small language barrier in this discussion.

The output that you said you were getting when you run the command manually has a leading space that the awk script you showed us would not produce.
The output that you said you were getting when you run the command manually also contains text from the continuation line in your log file that your awk script does not handle.

When I run the awk script you provided with the input data you provided, the output produced is:
Code:
57,331,333,336,348,2-3,11,28,45,49,67-68,80,82,87,102-104,107-108,111-112,114,117-118,123-125,127-132,134,139,148-157,159-161,169-172,174,179-180\

not:
Code:
 57,331,333,336,348,2-3,11,28,45,49,67-68,80,82,87,102-104,107-108,111-112,114,117-118,123-125,127-132,134,139,148-157,159-161,169-172,174,179-180,182-185,200,202,204,208-210,219,226,236,238,245,251,253-257,262,265,271,280,288,308,330,353,355,375,377,381,385,387

But, we are using awk after the command that produced the input file is long gone. Since your script is running crab to produce the file being read by awk asynchronously in the background while awk is running in the foreground, there is a good chance that awk will hit end of file before crab writes anything into the file. If this happens, obviously jobNo will be set to an empty string.

If you get rid of the ampersand (&) at the end of the crab command line and if crab does not split long "List of jobs" lines with backslashes (\) and follow them with the continuation lines that you showed in your 1st message in this thread, everything should work as you expect it to work.
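If the log really does contain the backslash-terminated lines shown in [1], the awk script would need to join each such line with the one that follows it before looking at $NF. The following is only a sketch under that assumption: the shortened sample data and the temporary file are made up for illustration, and the joining loop is one possible approach, not something crab or the original script provides.

```shell
# Sketch only: assumes wrapped "List of jobs" lines end in a backslash,
# as in the sample log [1]. The data below is a shortened, made-up sample.
log=$(mktemp)     # temporary stand-in for ${SAMPLE}_status.log
cat > "$log" <<'EOF'
 >>>>>>>>> 309 Jobs with Wrapper Exit Code : 0
         List of jobs: 1,4-10\
,12-27
 >>>>>>>>> 4 Jobs with Wrapper Exit Code : 8028
         List of jobs: 57,331\
,333,336
 >>>>>>>>> 1 Jobs with Wrapper Exit Code : 8021
         List of jobs: 348
EOF

jobNo=$(awk '
    # Join a line that ends in "\" with the line(s) that follow it.
    { while (/\\$/) {
          sub(/\\$/, "")
          if ((getline nxt) > 0) $0 = $0 nxt; else break
      } }
    # Remember whether the current block has a non-zero exit code.
    /Jobs with Wrapper/ { s = ($NF != 0) }
    # Append the job list of a failed block to the comma-separated result.
    /List of jobs/ && s { p = (p ? p "," : "") $NF; s = 0 }
    END { print p }
' "$log")
rm -f "$log"
echo "$jobNo"
```

The rest of the logic is the same as in the original one-liner; only the joining step at the top is new.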
# 5  
Old 03-16-2013
Hi Dan,
Thanks for looking into it, but I am confused too. Let me rephrase my problem.

When I run this command manually, here is the response:
Code:
 

[emily04@cmslpc38 pythia]$ jobNo=$(awk '/Jobs with Wrapper/ && $NF != 0{s=1}   /List of jobs/ && s{if(p){p=p","$NF}else{p=$NF};s=""}END{print p}' qcd120_status.log )
[emily04@cmslpc38 pythia]$ echo $jobNo
57,331,333,336,348,2-3,11,28,45,49,67-68,80,82,87,102-104,107-108,111-112,114,117-118,123-125,127-132,134,139,148-157,159-161,169-172,174,179-180,182-185,200,202,204,208-210,219,226,236,238,245,251,253-257,262,265,271,280,288,308,330,353,355,375,377,381,385,387
[pooja04@cmslpc38 pythia]$

This is what I want from the script too.

But for me, the script gives nothing for the jobNo variable. What it gives me as output instead is:
Code:
 
---------Will Resubmit the Jobs--------------
qcd120_status.log =====
=====
=====


And again, the function is defined as follows in the script:
Code:
ResubmitJobs() {
 crab ntuplize_crab -status -c $SAMPLE >& $SAMPLE"_status.log" &
  echo "---------Will Resubmit the Jobs--------------"
                                       
    echo $SAMPLE"_status.log" "====="
    
    jobNo=$(awk '/Jobs with Wrapper/ && $NF != 0{s=1}   /List of jobs/ && s{if(p){p=p","$NF}else{p=$NF};s=""}END{print p}' $SAMPLE"_status.log" )
    #sleep 200                                                                                                                                        

    echo $jobNo "====="
    echo $jobNo "====="

I hope it is easier to understand now.

greetings,
emily
# 6  
Old 03-16-2013
Quote:
Originally Posted by emily
Hi Dan,
Thanks for looking into it, but I am confused too. Let me rephrase my problem.

When I run this command manually, here is the response:
Code:
 

[emily04@cmslpc38 pythia]$ jobNo=$(awk '/Jobs with Wrapper/ && $NF != 0{s=1}   /List of jobs/ && s{if(p){p=p","$NF}else{p=$NF};s=""}END{print p}' qcd120_status.log )
[emily04@cmslpc38 pythia]$ echo $jobNo
57,331,333,336,348,2-3,11,28,45,49,67-68,80,82,87,102-104,107-108,111-112,114,117-118,123-125,127-132,134,139,148-157,159-161,169-172,174,179-180,182-185,200,202,204,208-210,219,226,236,238,245,251,253-257,262,265,271,280,288,308,330,353,355,375,377,381,385,387
[pooja04@cmslpc38 pythia]$

This is what I want from the script too.
Hi Emily.

I will assume you meant "Don" rather than "Dan".

Yes, I understand the output you want.
Quote:
Originally Posted by emily
But for me, the script gives nothing for the jobNo variable. What it gives me as output instead is:
Code:
 
---------Will Resubmit the Jobs--------------
qcd120_status.log =====
=====
=====


And again, the function is defined as follows in the script:
Code:
ResubmitJobs() {
 crab ntuplize_crab -status -c $SAMPLE >& $SAMPLE"_status.log" &   <--- Remove this ampersand!
  echo "---------Will Resubmit the Jobs--------------"
                                       
    echo $SAMPLE"_status.log" "====="
    
    jobNo=$(awk '/Jobs with Wrapper/ && $NF != 0{s=1}   /List of jobs/ && s{if(p){p=p","$NF}else{p=$NF};s=""}END{print p}' $SAMPLE"_status.log" )
    #sleep 200                                                                                                                                        

    echo $jobNo "====="
    echo $jobNo "====="

I hope it is easier to understand now.

greetings,
emily
Yes, I understand. And, as I said before, if you remove the ampersand marked above, you will get the output you want. Your problem is that awk is processing $SAMPLE"_status.log" before the crab command writes any data into it. You are running crab and awk concurrently instead of letting crab complete before letting awk read the data that crab will eventually produce.
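The ordering problem described above can be reproduced without crab. In the sketch below, fake_crab is a hypothetical stand-in for "crab ntuplize_crab -status -c $SAMPLE" that takes a moment before writing its output; the temporary directory is also only there so the example can run anywhere. It assumes nothing about crab beyond what is shown in this thread.

```shell
# Sketch only: fake_crab is a hypothetical stand-in for the real crab command.
fake_crab() {
    sleep 1
    echo " >>>>>>>>> 4 Jobs with Wrapper Exit Code : 8028"
    echo "         List of jobs: 57,331,333,336"
}

SAMPLE=qcd120
dir=$(mktemp -d)
log="$dir/${SAMPLE}_status.log"

# No trailing "&": the shell waits for the command to finish before going on.
# ("> file 2>&1" has the same effect as bash's ">& file".)
fake_crab > "$log" 2>&1

# Alternative: keep the "&" but wait for the job before reading the file.
#   fake_crab > "$log" 2>&1 &
#   wait $!

jobNo=$(awk '/Jobs with Wrapper/ && $NF != 0 {s=1}
             /List of jobs/ && s {p = (p ? p "," : "") $NF; s=""}
             END {print p}' "$log")
rm -rf "$dir"
echo "$jobNo ====="
```

With the "&" left in and no wait, awk would normally reach end of file while fake_crab is still sleeping, and jobNo would be empty, which matches the symptom reported in the first post.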
# 7  
Old 03-16-2013
Hi Don,
Yup, it is working. Thanks, Don!

May I ask another question?
I want the script to look for all directories within a particular directory, perform the
crab task in each one, and get the jobNo.
At present, I have the following directories where I want to perform these operations:
Code:
drwxr-xr-x 6 emily04 us_cms   2048 Mar 15 11:42 qcd800
-rw-r--r-- 1 emily04 us_cms   9739 Mar 15 11:42 VgAnalyzerKitDemoMC52X_AOD.pyc
drwxr-xr-x 6 emily04 us_cms   2048 Mar 15 11:43 qcdEm40
drwxr-xr-x 6 emily04 us_cms   2048 Mar 15 11:44 GJet20To40
drwxr-xr-x 6 emily04 us_cms   2048 Mar 15 11:46 qcd1000
drwxr-xr-x 6 emily04 us_cms   2048 Mar 15 11:47 qcd120
drwxr-xr-x 6 emily04 us_cms   2048 Mar 15 11:48 GJet40ToInf
drwxr-xr-x 6 emily04 us_cms   2048 Mar 15 11:49 qcd1400
drwxr-xr-x 6 emily04 us_cms   2048 Mar 15 11:51 qcd50
drwxr-xr-x 6 emily04 us_cms   2048 Mar 15 11:52 qcd1800
drwxr-xr-x 6 emily04 us_cms   2048 Mar 15 11:53 qcd30
drwxr-xr-x 6 emily04 us_cms   2048 Mar 15 11:54 qcd80
drwxr-xr-x 6 emily04 us_cms   2048 Mar 15 11:55 qcdEm30To40

But I am afraid that sometimes they do not follow any particular pattern. For example, another set of directories that
I have is the following:
Code:
 
drwxr-xr-x 6 emily4 us_cms  2048 Mar 15 10:32 tt
drwxr-xr-x 6 emily04 us_cms  2048 Mar 15 10:33 zgamma
drwxr-xr-x 6 emily04 us_cms  2048 Mar 15 10:34 DiPhoJet
drwxr-xr-x 6 emily04 us_cms  2048 Mar 15 10:35 DYJets50

Can I define some kind of 'array' declaring the directory names and 'loop' over them?

Thanks in advance.
emily
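Both approaches asked about here, an explicit list of names and a scan of every sub-directory, can be sketched in plain shell. This is only an illustration: run_status is a hypothetical placeholder for the crab and awk steps, and the directories are created in a temporary location so the example is self-contained.

```shell
# Sketch only: run_status is a hypothetical placeholder for the crab + awk steps.
run_status() {
    echo "would process sample: $1"
}

# Made-up working area so the example can run anywhere.
work=$(mktemp -d)
mkdir "$work/qcd120" "$work/tt" "$work/zgamma"
touch "$work/VgAnalyzerKitDemoMC52X_AOD.pyc"   # a plain file; must be skipped

# Option 1: an explicit list of sample names (no naming pattern needed).
for SAMPLE in qcd120 tt zgamma; do
    run_status "$SAMPLE"
done

# Option 2: every sub-directory of $work; the trailing "/" in the glob
# matches directories only, so the .pyc file is ignored.
found=""
for dir in "$work"/*/; do
    [ -d "$dir" ] || continue        # guards against an unmatched glob
    SAMPLE=$(basename "$dir")
    run_status "$SAMPLE"
    found="$found $SAMPLE"
done
rm -rf "$work"
echo "directories seen:$found"
```

In bash the explicit list could equally be stored in an array and iterated with "${samples[@]}".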

---------- Post updated at 11:34 AM ---------- Previous update was at 10:01 AM ----------

Hi again,
I managed to run the commands over an array. Thanks all
for your kind help.

What I did is the following:
Code:
# Bash array of sample (directory) names
GREP=("qcd30" "QCD50")
for file in "${GREP[@]}"
do
      # use the loop variable here, not the undefined $FileNameIndx
      crab ntuplize_crab -getoutput -c "$file"
done

But while doing this, it occurred to me: can I run the different GREP[] entries in parallel?
Is it doable?

greetings,
emily

Last edited by emily; 03-16-2013 at 12:11 PM..
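At the shell level, running the loop bodies in parallel is possible by starting each one in the background and then waiting for all of them. The sketch below shows only that mechanism: fetch_output is a hypothetical stand-in for "crab ntuplize_crab -getoutput -c <sample>", and whether crab itself can safely run several instances at once is not something this thread establishes.

```shell
# Sketch only: fetch_output is a hypothetical stand-in for the real crab call.
fetch_output() {
    sleep 1
    echo "done: $1"
}

out=$(mktemp -d)     # one log file per sample, so outputs do not interleave
for SAMPLE in qcd30 qcd50; do
    fetch_output "$SAMPLE" > "$out/$SAMPLE.log" 2>&1 &   # start in background
done
wait                 # block until every background job has finished

# Only after "wait" is it safe to read the files.
results=$(cat "$out/qcd30.log" "$out/qcd50.log")
rm -rf "$out"
echo "$results"
```

The "wait" is the important part: reading the log files before it returns would reintroduce the empty-output problem discussed earlier in the thread.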