Help in awk to read the common txt


 
# 1  
Old 11-16-2012

Dear all,
I have a small script that mostly works but seems to have a bug.
It is supposed to find the common text in each file and then print the given number of lines to the output file.
It works for most of the text but fails to add up the values of some of the variables.
Could somebody please take a look and reply?

Thanks in advance.

The script is as follows:

Code:
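#!/bin/bash
date
FileName=DataFileName
LINES4LOG=38    # number of lines to read after each marker line
TEXT=Procss     # marker text that starts each block

MakeFinalLog() {
    echo '=====Making Final Log ====='
    # On a marker line, arm the counter p and skip to the next line.
    awk '{if($0~text){p=lines;next}}

p>0{
       split($0,arr,"=")                        # arr[1] is the label before "="
       if(!h[arr[1]"HDR"])h[arr[1]"HDR"]=arr[1]
       a[arr[1]]=a[arr[1]]" "$NF                # append this value to the label
       if(j<lines) b[++j]=arr[1]                # remember label order (first lines entries only)
       sum[arr[1]]+=$NF                         # running total per label
       p--
}
END{
 for(i=1;i<=j;i++)
     print h[b[i]"HDR"]"="a[b[i]]" "sum[b[i]]
}' lines=$LINES4LOG text="$TEXT" *.list > "${FileName}_log.txt"
    echo '++++++ Done ++++++  '"${FileName}_log.txt"
}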

if [ "$1" = "output" ]; then
    SetEnv
    MakeFinalLog
    exit 0
fi

And here is the output file (DataFileName_log.txt). As you can see, the values on the lines marked in red (the colon-separated lines starting at "Spike") are not being added up, and I am not sure why.

Code:
Sample Count  = 0 0 0
nPU weighted   = 441063 441530 882593
Pass vtx trk   = 442356 442358 884714
GenLevel       = 0 0 0
Pass   HLT     = 442356 442358 884714
   eff = inf 0.00493042 0.812471 inf 0.00483545 0.804582 1.62682
          = === === === === 0
 Pass Spike cut       = 649724 648483 1298207
 Pass Eta cut         = 623956 622832 1246788
 Pass Pt cut          = 185356 184933 370289
Pass EleID     = 60081 59976 120057
Passed DiLep   = 2181 2139 4320
   eff = inf 0.00493042 0.812471 inf 0.00483545 0.804582 1.62682
Pass Z mass    = 1865 1813 3678
 Fail GenEle match = 117 115 232
Passed Zs      = 1772 1721 3493
   eff = inf 0.00493042 0.812471 inf 0.00483545 0.804582 1.62682
Passed ZsW     = 8.10845e+06 8.03766e+06 16146110
Passed ZsW2    = 0 0 0
Sum LepID      = 7.99145e+07 7.76145e+07 157529000
Sum Pileup     = 7.95332e+07 7.88391e+07 158372300
          = === === === === 0
Spike                                      : 2089= 2089 2089
pt                                          : 1281= 1281 1281
pt, eta                                     : 1169= 1169 1169
dR                                          : 1080= 1080 1080
EleVeto                                     : 1029= 1029 1029
EleVeto, HoE                                : 752= 752 752
EleVeto, HoE, sie                           : 197= 197 197
EleVeto, HoE, sie, ChargdHad                : 10= 10 10
EleVeto, HoE, sie, ChargdHad, NeuHad        : 9= 9 9 18
EleVeto, HoE, sie, ChargdHad, NeuHad, phopf : 8= 8 8 16
----------------------------------------------------------= ---------------------------------------------------------- ---------------------------------------------------------- 0
PhoID pass events = 8 8 16
Passed events  = 7 8 15
  eff = 0.00395034 0.00464846 0.0085988
Passed PU evt  = 7.40527 6.24361 13.6489
Passed Yields  = 0.75497 0.636537 1.39151

Here are the two list files that I'm trying to read:
File1.list

Code:
Procss: TTBb_0
Sample Count  = 0
nPU weighted   = 441063
Pass vtx trk   = 442356
GenLevel       = 0
Pass   HLT     = 442356
   eff = inf
          =====Electron Cut Efficiency ===
 Pass Spike cut       = 649724
 Pass Eta cut         = 623956
 Pass Pt cut          = 185356
Pass EleID     = 60081
Passed DiLep   = 2181
   eff = 0.00493042
Pass Z mass    = 1865
 Fail GenEle match = 117
Passed Zs      = 1772
   eff = 0.812471
Passed ZsW     = 8.10845e+06
Passed ZsW2    = 0
Sum LepID      = 7.99145e+07
Sum Pileup     = 7.95332e+07
          =====Pho Cut Efficiency ===
Spike                                      : 2089
pt                                          : 1281
pt, eta                                     : 1169
dR                                          : 1080
EleVeto                                     : 1029
EleVeto, HoE                                : 752
EleVeto, HoE, sie                           : 197
EleVeto, HoE, sie, ChargdHad                : 10
EleVeto, HoE, sie, ChargdHad, NeuHad        : 9
EleVeto, HoE, sie, ChargdHad, NeuHad, phopf : 8
----------------------------------------------------------
PhoID pass events = 8
Passed events  = 7
  eff = 0.00395034
Passed PU evt  = 7.40527
Passed Yields  = 0.75497

File2.list

Code:
Procss: TTBb_0
Sample Count  = 0
nPU weighted   = 441530
Pass vtx trk   = 442358
GenLevel       = 0
Pass   HLT     = 442358
   eff = inf
          =====Electron Cut Efficiency ===
 Pass Spike cut       = 648483
 Pass Eta cut         = 622832
 Pass Pt cut          = 184933
Pass EleID     = 59976
Passed DiLep   = 2139
   eff = 0.00483545
Pass Z mass    = 1813
 Fail GenEle match = 115
Passed Zs      = 1721
   eff = 0.804582
Passed ZsW     = 8.03766e+06
Passed ZsW2    = 0
Sum LepID      = 7.76145e+07
Sum Pileup     = 7.88391e+07
          =====Pho Cut Efficiency ===
Spike                                      : 2077
pt                                          : 1304
pt, eta                                     : 1198
dR                                          : 1095
EleVeto                                     : 1038
EleVeto, HoE                                : 729
EleVeto, HoE, sie                           : 199
EleVeto, HoE, sie, ChargdHad                : 12
EleVeto, HoE, sie, ChargdHad, NeuHad        : 9
EleVeto, HoE, sie, ChargdHad, NeuHad, phopf : 8
----------------------------------------------------------
PhoID pass events = 8
Passed events  = 8
  eff = 0.00464846
Passed PU evt  = 6.24361
Passed Yields  = 0.636537

# 2  
Old 11-16-2012
1st: Note that you're splitting lines on "=" characters, and that sum[] accumulates the numeric value of the last field of the corresponding lines from each file.

2nd: Note that the lines you've marked in red (plus the two lines following them) have no "=" characters.
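To make the two notes above concrete, here is a minimal sketch (the `show_split` function name and the two sample lines are only for illustration) that runs the same `split($0,arr,"=")` call on one "=" line and one ":" line:

```shell
# Feed one "=" line and one ":" line through the same split() call.
show_split() {
    printf '%s\n' 'Passed events  = 7' 'Spike   : 2089' |
    awk '{
        n = split($0, arr, "=")     # same call as in the posted script
        printf "pieces=%d key=[%s] last=[%s]\n", n, arr[1], $NF
    }'
}
show_split
```

For the ":" line, split() finds no separator, so arr[1] is the whole line including the value. "Spike : 2089" from one file and "Spike : 2077" from the other therefore become two different keys and are never summed together. Only the keys recorded in b[] are printed, and with the posted data that array fills up while the first file is being read, which would explain why each red line shows only the first file's value. The two following lines have identical values in both files, so their keys collide and happen to add up.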

If you change the ":" characters on those lines in both input files to "=" characters, you get:
Code:
Sample Count  = 0 0 0
nPU weighted   = 441063 441530 882593
Pass vtx trk   = 442356 442358 884714
GenLevel       = 0 0 0
Pass   HLT     = 442356 442358 884714
   eff = inf 0.00493042 0.812471 inf 0.00483545 0.804582 inf
          = === === === === 0
 Pass Spike cut       = 649724 648483 1298207
 Pass Eta cut         = 623956 622832 1246788
 Pass Pt cut          = 185356 184933 370289
Pass EleID     = 60081 59976 120057
Passed DiLep   = 2181 2139 4320
   eff = inf 0.00493042 0.812471 inf 0.00483545 0.804582 inf
Pass Z mass    = 1865 1813 3678
 Fail GenEle match = 117 115 232
Passed Zs      = 1772 1721 3493
   eff = inf 0.00493042 0.812471 inf 0.00483545 0.804582 inf
Passed ZsW     = 8.10845e+06 8.03766e+06 16146110
Passed ZsW2    = 0 0 0
Sum LepID      = 7.99145e+07 7.76145e+07 157529000
Sum Pileup     = 7.95332e+07 7.88391e+07 158372300
          = === === === === 0
Spike                                      = 2089 2077 4166
pt                                          = 1281 1304 2585
pt, eta                                     = 1169 1198 2367
dR                                          = 1080 1095 2175
EleVeto                                     = 1029 1038 2067
EleVeto, HoE                                = 752 729 1481
EleVeto, HoE, sie                           = 197 199 396
EleVeto, HoE, sie, ChargdHad                = 10 12 22
EleVeto, HoE, sie, ChargdHad, NeuHad        = 9 9 18
EleVeto, HoE, sie, ChargdHad, NeuHad, phopf = 8 8 16
----------------------------------------------------------= ---------------------------------------------------------- ---------------------------------------------------------- 0
PhoID pass events = 8 8 16
Passed events  = 7 8 15
  eff = 0.00395034 0.00464846 0.0085988
Passed PU evt  = 7.40527 6.24361 13.6489

which I am guessing is closer to what you expected.
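If editing the input files is not convenient, a possible alternative is to let awk split on either character. The following is only a sketch under assumptions: `merge_lists` is a hypothetical function name, the two tiny files stand in for File1.list and File2.list, and the label order is tracked by first appearance rather than with the `j<lines` test of the posted script.

```shell
#!/bin/bash
# Hypothetical demo data: two tiny stand-ins for File1.list and File2.list.
workdir=$(mktemp -d)
printf '%s\n' 'Procss: TTBb_0' 'Pass EleID = 60081' 'Spike : 2089' > "$workdir/File1.list"
printf '%s\n' 'Procss: TTBb_0' 'Pass EleID = 59976' 'Spike : 2077' > "$workdir/File2.list"

# $1 = lines to read after each marker, $2 = marker text, rest = input files.
# The body runs in a subshell so its variables do not leak into the caller.
merge_lists() (
    lines=$1; text=$2; shift 2
    awk -v lines="$lines" -v text="$text" '
    $0 ~ text { p = lines; next }               # marker line: arm the counter, skip it
    p > 0 {
        split($0, arr, /[=:]/)                  # label ends at the first "=" or ":"
        key = arr[1]
        if (!(key in vals)) order[++n] = key    # remember first-seen order
        vals[key] = vals[key] " " $NF           # collect the per-file values
        sum[key] += $NF                         # running total per label
        p--
    }
    END {
        for (i = 1; i <= n; i++)
            print order[i] "=" vals[order[i]] " " sum[order[i]]
    }' "$@"
)

merge_lists 2 Procss "$workdir"/*.list
rm -rf "$workdir"
```

With the demo data this prints "Pass EleID = 60081 59976 120057" and "Spike = 2089 2077 4166". Labels that repeat within one file (such as "eff") still share a single key, as in the original script, so that part of the output is not addressed here.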
# 3  
Old 11-18-2012
Hi Don,
Sorry for the late reply; I was busy with another task.

And a big thanks for the reply, it worked pretty well. But I am surprised: I had 48 files in total, and it did not give me the information from all 48 files.

Do you see any obvious reason for that? Please let me know.

Greetings
emily
# 4  
Old 11-18-2012
Some details about what it did give you might help:
  1. Did it give you all of the data you wanted for a subset of the files?
  2. Did it give you all of the data you wanted from some lines from all files, but not for other lines?
  3. What is the output on your system from the command getconf LINE_MAX?
  4. What is the output from the command ls -l *.list?
  5. What is the output from the following command?
    Code:
    awk 'FNR==1{if(m)printf("%6d\t%s\n",m,f)
            sm+=m+1;f=FILENAME;m=length($0);n++}
    length($0)>m{m=length($0)}
    END{printf("%6d\t%s\n%6d\tTotal for %d files\n",++m,f,sm+m,n)}' *.list
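One further check that may help with question 1, offered only as a sketch and not as part of the questions above: the original script reads data from a file only after it has seen the marker text, so a file without that marker contributes nothing. The helper below (`count_marker_files` is a hypothetical name, and the demo files are made up) counts how many of the given files contain the marker.

```shell
#!/bin/bash
# $1 = marker text, remaining arguments = files to check.
# The body runs in a subshell so its variables do not leak into the caller.
count_marker_files() (
    marker=$1; shift
    awk -v text="$marker" '
    FNR == 1 { total++ }                        # a new (non-empty) file starts
    $0 ~ text && !seen[FILENAME]++ { hit++ }    # first marker line in this file
    END { printf "%d of %d files contain the marker\n", hit, total }
    ' "$@"
)

# Made-up demo: the second file spells the marker differently.
demo=$(mktemp -d)
printf '%s\n' 'Procss: TTBb_0' 'Passed events  = 7' > "$demo/a.list"
printf '%s\n' 'Process: TTBb_0' 'Passed events  = 8' > "$demo/b.list"
count_marker_files Procss "$demo"/*.list
rm -rf "$demo"
```

The demo prints "1 of 2 files contain the marker". In the real directory the call would presumably be `count_marker_files Procss *.list`; empty files are not counted in the total because awk never sees a first line for them.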
