Using awk to count the number of records based on conditions


 
# 1  
Old 06-05-2009

Hi,

I have files named with a date and time stamp, such as 200906051400, 200906051500, 200906051600 and so on, so 24 files are generated every day.

I need to run a few checks on these 24 files daily.

The files contain data like this:
Code:
200906050016370   0   1244141195225298lessrv3       BSNLSERVICE1                  BSNLSERVICE1                  2128                                                        LOCATIONMANAGER          SLIR                 919443200299   MSISDN  ASC   919443200299   0   SUCCESS                                           1244141195225298less      919443200299        124414      79.301938811.6885305NORMAL    DELAY_                                                                                                                                                                                                                                                                                      WGS84  
200906050016440   0   1244141197503299lessrv3       BSNLSERVICE1                  BSNLSERVICE1                  2139                                                        LOCATIONMANAGER          SLIR                 919449838266   MSISDN  ASC   919449838266   0   SUCCESS                                           1244141197503299less      919449838266        124414      74.739722013.3302837NORMAL    DELAY_                                                                                                                                                                                                                                                                                      WGS84  
200906050017070   0   1244141224604306lessrv3       BSNLSERVICE1                  BSNLSERVICE1                  2128                                                        LOCATIONMANAGER          SLIR                 919448010097   MSISDN  ASC   919448010097   1   SYSTEM FAILURE                                    1244141224604306less      919448010097        124414                          NORMAL    DELAY_                                                                                                                                                                                                                                                                                      WGS84  
200906050017110   0   1244141227460308lessrv3       BSNLSERVICE1                  BSNLSERVICE1                  2128                                                        LOCATIONMANAGER          SLIR                 919449838266   MSISDN  ASC   919448010098   1   SYSTEM FAILURE                                    1244141227460308less      919449838266   124414                          NORMAL    DELAY_                                                                                                                                                                                                                                                                                      WGS84  

20090605140148        1204702370366140lessrv3                                     RTMS                          0                                                           TRACKING                 tlrep                                                            0   SUCCESS                                                                                         1                                                                                                                                                                                                                                                                                                                                     WGS84  
200906051402100   0   1195202147789210lessrv3       RTMS                          RTMS                                                                                      LOCATIONMANAGER          SLIR                 919446001620   MSISDN  ASC   919446001620   526 INACTIVE SUBSCRIBER                               1195202147789210less                          124419                          NORMAL    DELAY_                                                                                                                                                                                                                                                                                      WGS84  
200906051402100   0   1195202147789210lessrv3       RTMS                          RTMS                                                                                      LOCATIONMANAGER          SLIR                 919446001618   MSISDN  ASC   919446001618   526 INACTIVE SUBSCRIBER                               1195202147789210less                          124419                          NORMAL    DELAY_                                                                                                                                                                                                                                                                                      WGS84  
200906051402100   0   1195202147789210lessrv3       RTMS                          RTMS                                                                                      LOCATIONMANAGER          SLIR                 919446001617   MSISDN  ASC   919446001617   526 INACTIVE SUBSCRIBER                               1195202147789210less                          124419                          NORMAL    DELAY_                                                                                                                                                                                                                                                                                      WGS84


I need a script that does the following:

1. Filter the records on $4 and $6 (i.e. $4 == BSNLSERVICE1 and $6 == 2128) and count the matching records for the day (20090605*).

OUTPUT REQUIRED:
BSNLSERVICE1 2128 == 3

2. Filter the records on $4 and $6 (i.e. $4 == BSNLSERVICE1 and $6 == 2128), count the matching records for the day (20090605*) and group them by $14 (i.e. SUCCESS, FAILURE).

OUTPUT REQUIRED:
BSNLSERVICE1 2128 SUCCESS == 1
BSNLSERVICE1 2128 SYSTEM FAILURE == 2

3. Filter the records on $4 and $6 (i.e. $4 == BSNLSERVICE1 and $6 == 2128), group them by $9 (e.g. 919448010098, 919446001618) and count the records for the day (20090605*) for each distinct $9.

OUTPUT REQUIRED:
919449838266 2
919448010097 1

The output should cover only $4 == BSNLSERVICE1 and $6 == 2128; other records (such as $4 == RTMS) are not required.

Please help me out.

Last edited by aemunathan; 06-05-2009 at 09:33 AM..
# 2  
Old 06-05-2009
You can try something like this:
Code:
awk '$4=="BSNLSERVICE1"&&$6=="2128" { count++ } END { print "BSNLSERVICE1-->2128-->" count }'  file_name.txt

The remaining two are almost the same; only a small modification to the script is needed.
# 3  
Old 06-05-2009
Try this:

Code:
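awk -v day="20090605" -v serv="BSNLSERVICE1" -v val="2128" '
# "^" anchors the day at the start of $1 so it cannot match later in the stamp
$1 ~ "^" day && $4==serv && $6==val {
  s1++;a[$14]++;b[$9]++
}
END{
  # total, then counts per status, then counts per distinct $9
  print serv,val, "=" s1 "\n"
  print serv,val, a["SUCCESS"]
  # $14 holds only the first word of "SYSTEM FAILURE"
  print serv,val, a["SYSTEM"] "\n"
  for(i in b){print i, b[i]}
}' file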

Regards
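
One caveat when grouping on $14: awk splits on whitespace, so a two-word status such as SYSTEM FAILURE is counted under its first word only. A minimal sketch of one workaround follows. It assumes, from the sample rows only, that the status text always ends just before a field ending in "less"; the trimmed rows and the names rows and status_counts are made up for illustration.

```shell
# Trimmed copies of three sample rows, same field order as the real data.
rows='200906050016370 0 1244141195225298lessrv3 BSNLSERVICE1 BSNLSERVICE1 2128 LOCATIONMANAGER SLIR 919443200299 MSISDN ASC 919443200299 0 SUCCESS 1244141195225298less
200906050017070 0 1244141224604306lessrv3 BSNLSERVICE1 BSNLSERVICE1 2128 LOCATIONMANAGER SLIR 919448010097 MSISDN ASC 919448010097 1 SYSTEM FAILURE 1244141224604306less
200906050017110 0 1244141227460308lessrv3 BSNLSERVICE1 BSNLSERVICE1 2128 LOCATIONMANAGER SLIR 919449838266 MSISDN ASC 919448010098 1 SYSTEM FAILURE 1244141227460308less'

status_counts=$(printf '%s\n' "$rows" |
  awk -v day="20090605" -v serv="BSNLSERVICE1" -v val="2128" '
    index($1, day) == 1 && $4 == serv && $6 == val {
      # Assumption: the status starts at $14 and stops before the "...less" field.
      status = $14
      for (i = 15; i <= NF && $i !~ /less$/; i++) status = status " " $i
      n[status]++
    }
    END { for (s in n) print serv, val, s, "==", n[s] }' | sort)

printf '%s\n' "$status_counts"
```

The output is piped through sort because the order of a for (s in n) loop is not defined in awk. On these three rows it reports one SUCCESS and two SYSTEM FAILURE records.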
# 4  
Old 06-05-2009
Hi panyam and Franklin,

I followed the method suggested by panyam and it gives useful results.

Here are the commands, in the order I requested:

Code:
# 1.
awk '$4=="BSNLSERVICE1"&&$6=="2128" { count++ } END { print "BSNLSERVICE1-->2128-->" count }' 20090604*

# 2.
awk '$4=="BSNLSERVICE1"&&$6=="2128"{ b[$14]++}  END {for(i in b){print i, b[i]}  }' 20090604*

# 3.
awk '$4=="BSNLSERVICE1"&&$6=="2128"{ b[$9]++}  END {for(i in b){print i, b[i]}  }' 20090604*

One thing I need to know is whether it is possible to derive the file name from the date command.

I need to schedule the script every night at 2:00 am, so the file name has to be derived from the date command.

For example, at 2:00 am tonight the output of
Code:
date +'%Y%m%d'

is 20090606, but I need to give 20090605* as the file name argument to awk.


Franklin, I used your script this way:
Code:
#!/usr/xpg4/bin/awk 
awk -v day="20090605" -v serv="BSNLSERVICE1" -v val="2128" '
$1 ~ day && $4==serv && $6==val {
  s1++;a[$14]++;b[$9]++
}
END{
  print serv,val, "=" s1 "\n"
  print serv,val, a["SUCCESS"]
  print serv,val, a["SYSTEM"] "\n"
  for(i in b){print i, b[i]} 
}' 200906051859

and got this response:
Quote:
./reconcil.sh
/usr/xpg4/bin/awk: syntax error Context is:
>>> ./ <<<
Thank you.
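
The syntax error most likely comes from the first line of the script. With #!/usr/xpg4/bin/awk and no -f, the system runs awk with the script path ./reconcil.sh as the awk program text, and awk stops at "./". Since the file is really a shell script that calls awk, its first line should name a shell. A minimal sketch follows; it uses a plain awk command and two made-up inline rows so that it runs anywhere, whereas on Solaris the command would be /usr/xpg4/bin/awk and the input would be the real file.

```shell
#!/bin/sh
# The interpreter line names a shell, because this file is a shell script
# that calls awk. (Hypothetical stand-in for reconcil.sh.)

# Two made-up rows in the same field order as the real data.
rows='200906050016370 0 id1 BSNLSERVICE1 BSNLSERVICE1 2128 LOCATIONMANAGER SLIR 919443200299 MSISDN ASC 919443200299 0 SUCCESS
200906050016440 0 id2 BSNLSERVICE1 BSNLSERVICE1 2139 LOCATIONMANAGER SLIR 919449838266 MSISDN ASC 919449838266 0 SUCCESS'

# On Solaris, replace "awk" with /usr/xpg4/bin/awk and read the real file.
total=$(printf '%s\n' "$rows" |
  awk -v day="20090605" -v serv="BSNLSERVICE1" -v val="2128" '
    index($1, day) == 1 && $4 == serv && $6 == val { s1++ }
    END { print serv, val, "=" (s1 + 0) }')

printf '%s\n' "$total"
```

Only the first row matches both conditions, so the count is 1. Writing s1 + 0 prints 0 instead of an empty string when nothing matches.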
# 5  
Old 06-05-2009
To get yesterday's date you can use Perderabo's datecalc script.
Place the script in the same directory as your own script, name it datecalc and make it executable:

https://www.unix.com/unix-dummies-que...html#post16559

Your script should look like this:

Code:
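#!/bin/ksh

# Yesterday's date from datecalc, split below on spaces into year, month, day
dat=$(./datecalc -a $(date +"%Y %m %d") - 1)

# Zero-pad month and day with %02d to build the file name prefix (yyyymmdd)
day=$(/usr/xpg4/bin/awk -v d="$dat" 'BEGIN {split(d,a," ");day=sprintf("%s%02d%02d",a[1],a[2],a[3]);print day}')

# Pass the prefix as "day", the variable name the awk program actually tests
/usr/xpg4/bin/awk -v day="$day" -v serv="BSNLSERVICE1" -v val="2128" '
$1 ~ "^" day && $4==serv && $6==val {
  s1++;a[$14]++;b[$9]++
}
END{
  print "Filename: " FILENAME "\n"
  print serv,val, "=" s1 "\n"
  print serv,val, a["SUCCESS"]
  print serv,val, a["SYSTEM"] "\n"
  for(i in b){print i, b[i]}
}' "$day"*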
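
The date-formatting step can be tried on its own. A small sketch, assuming datecalc prints the year, month and day separated by spaces without zero padding; the hard-coded dat value is a stand-in for its output.

```shell
# Stand-in for: dat=$(./datecalc -a $(date +"%Y %m %d") - 1)
dat="2009 6 5"

# %02d zero-pads the numeric month and day; zero padding with %s is not portable.
day=$(awk -v d="$dat" 'BEGIN { split(d, a, " "); printf "%04d%02d%02d\n", a[1], a[2], a[3] }')

echo "$day"   # 20090605
```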

# 6  
Old 06-06-2009
Hi,

Thanks a lot, it is nice to see the result.

I need one more thing. I am using SQL*Loader to insert the result into a table, and I also want to print the previous date in dd-mon-yyyy format.

I tried it this way:
Code:
#!/bin/ksh

dat=$(./datecalc -a $(date +"%Y %m %d") - 1)

da_te=$(date +'%d')

da=$(($da_te-1))

mon=$(date +'%b')

year=$(date +'%Y')

host=$(hostname)

day=$(/usr/xpg4/bin/awk -v d="$dat" 'BEGIN {split(d,a," ");day=sprintf("%s%02s%02s",a[1],a[2],a[3]);print day}')

/usr/xpg4/bin/awk -v d="$day" -v serv="BSNLSERVICE1" -v val="2128"  -v daet="$(($da)-($mon)-($year))" -v ho="$host"'
$1~day && $4==serv && $6==val {
  s1++
}
END{
  print daet, host,s1 
}' $day*

Please help me out! The report covers the previous day, so I need to print the previous day's date.
Thanks in advance.

Last edited by aemunathan; 06-07-2009 at 02:36 PM.. Reason: one more requirement!!!!!!
# 7  
Old 06-07-2009
Hi,

Please guide me on getting the previous date in the format dd-mon-yyyy (06-Jun-2009) to load into the database table.

Thanks
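
One possible way to get the previous date as dd-Mon-yyyy is to reuse the datecalc output rather than subtract 1 from today's day number, which breaks on the first of the month. A minimal sketch, assuming datecalc prints the year, month and day separated by spaces; the hard-coded dat value is a stand-in for its output and prev is a made-up variable name. Note also that in the attempt above $(( ... )) performs arithmetic rather than joining strings, and that the awk program reads day and host while -v sets d and ho.

```shell
# Stand-in for: dat=$(./datecalc -a $(date +"%Y %m %d") - 1)
dat="2009 6 6"

# Map the month number to its English abbreviation and zero-pad the day.
prev=$(awk -v d="$dat" 'BEGIN {
  split("Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec", mon, " ")
  split(d, a, " ")
  printf "%02d-%s-%04d\n", a[3], mon[a[2] + 0], a[1]
}')

echo "$prev"   # 06-Jun-2009
```

The value can then be handed to the reporting awk with -v daet="$prev" and printed in the END block.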