awk to print count for chars recurrence


 
# 1  
Old 03-11-2011

input:

Code:
123456 1111
124567 2222
125678 3333
234567 aaaa
456789 abcd

awk logic:
- read each line and use the first 2 chars of the 1st field as the key
- each time a key recurs, count up and print the counter value in front of the line

output:

Code:
1 123456 1111
2 124567 2222
3 125678 3333
1 234567 aaaa
1 456789 abcd

# 2  
Old 03-11-2011
Code:
 nawk '{print ++a[substr($1,1,2)], $0}' myFile
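For anyone reading along, here is a commented sketch of what that one-liner does. It is an illustration, not a replacement. It assumes a POSIX awk is available as plain awk (nawk is not installed everywhere), and the names count_prefix and seen are made up for this example.

```shell
# count_prefix: put a running per-key counter in front of every line.
# The key is the first two characters of field 1.
count_prefix() {
    awk '{
        key = substr($1, 1, 2)   # first two characters of the first field
        seen[key]++              # unset array elements start at 0
        print seen[key], $0      # counter, a space, then the original line
    }' "$@"
}

# Demo on the sample input from post #1
printf '%s\n' '123456 1111' '124567 2222' '125678 3333' \
              '234567 aaaa' '456789 abcd' | count_prefix
```

With no file argument the function reads standard input, so it can also be called as count_prefix myFile.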

This User Gave Thanks to vgersh99 For This Post:
# 3  
Old 03-14-2011
Thanks, it works like a charm.

Code:
1 123456 1111
2 124567 2222
3 125678 3333
1 234567 aaaa
1 456789 abcd

Next, instead of printing every line, how can I use awk to print a single line when the counter (field 1) is greater than 1?

Desired output:
Code:
3 125678 3333

I've tried this:
Code:
$ cat testdata | awk 'BEGIN { val=0;line=1; } { if ( $1 > val ) { val=$1 ; line=$0 ; } } END { print line }'
3 125678 3333

However, when we have additional lines that meet the threshold condition, the above awk logic doesn't work.

For example, when testdata is:
Code:
1 123456 1111
2 124567 2222
3 125678 3333
1 234567 aaaa
2 235678 bbbb
1 456789 abcd

The expected output is:
Code:
3 125678 3333
2 235678 bbbb

But the outcome is:
Code:
$ cat testdata | awk 'BEGIN { val=0;line=1; } { if ( $1 > val ) { val=$1 ; line=$0 ; } } END { print line }'
3 125678 3333


Last edited by ux4me; 03-14-2011 at 04:49 PM.. Reason: additional code tags
# 4  
Old 03-14-2011
Code:
nawk '++a[substr($1,1,2)] > 1' myFile

Actually you probably meant this:
Code:
nawk '{i=substr($1,1,2);++c[i];d[i]=$0}END{for(i in d) if (c[i] > 1) print c[i], d[i]}' myFile
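One thing to keep in mind with the version above: the order in which for (i in d) visits the keys is not specified by awk, so the output lines may come out in any order. Below is a rough sketch of the same idea that remembers the order in which each prefix first appeared. It assumes plain awk, and the names last_per_prefix, order, count and last are made up for illustration.

```shell
# last_per_prefix: for each 2-char prefix seen more than once, print the
# final count and the last line carrying that prefix, in first-seen order.
last_per_prefix() {
    awk '{
        key = substr($1, 1, 2)
        if (!(key in count)) order[++n] = key   # remember first-seen order
        count[key]++                            # how many lines share this prefix
        last[key] = $0                          # keep only the most recent line
    }
    END {
        for (i = 1; i <= n; i++) {
            key = order[i]
            if (count[key] > 1) print count[key], last[key]
        }
    }' "$@"
}

# Demo on the raw (uncounted) form of the data from post #3
printf '%s\n' '123456 1111' '124567 2222' '125678 3333' \
              '234567 aaaa' '235678 bbbb' '456789 abcd' | last_per_prefix
```

On that input it should print 3 125678 3333 followed by 2 235678 bbbb, which is the expected output from post #3.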


Last edited by vgersh99; 03-14-2011 at 05:51 PM..
This User Gave Thanks to vgersh99 For This Post:
# 5  
Old 03-14-2011
Thanks again. Actually, the desired output is, for each group, the line with the maximum counter value.

For example:
Code:
1 123456 1111
2 124567 2222
3 125678 3333
1 234567 aaaa
2 235678 bbbb
3 237890 cccc
1 456789 abcd
2 451234 efgh

Desired output with awk:
Code:
3 125678 3333
3 237890 cccc
2 451234 efgh

Once I have those output lines, I can check whether field 1 is greater than 1 and, if so, do something.

Last edited by ux4me; 03-14-2011 at 06:23 PM.. Reason: fixed the code tags
# 6  
Old 03-14-2011
I don't quite follow....
What's your original input?
It seems that your input format has changed....
We don't need to create intermediate output to be reprocessed; we can produce whatever is desired in one step.

My second suggestion should give you what you want, given your original file format.
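To illustrate the one-step idea, here is a sketch that goes straight from the raw two-column file to the final lines, with the "greater than 1" check folded in as a threshold. The input below is a reconstruction of post #5's data with the counter column stripped, which is an assumption about the original file. The function name max_per_prefix and the min parameter are also made up, and the sort is only there because for-in order is unspecified.

```shell
# max_per_prefix MIN: read raw lines on stdin; for each 2-char prefix that
# occurs at least MIN times, print "count last-line", sorted by field 2.
max_per_prefix() {
    awk -v min="$1" '{
        key = substr($1, 1, 2)
        count[key]++          # occurrences of this prefix so far
        last[key] = $0        # last line seen with this prefix
    }
    END {
        for (key in last)
            if (count[key] >= min) print count[key], last[key]
    }' | sort -k2,2
}

# Demo: assumed raw form of the data in post #5, threshold of 2
printf '%s\n' '123456 1111' '124567 2222' '125678 3333' \
              '234567 aaaa' '235678 bbbb' '237890 cccc' \
              '456789 abcd' '451234 efgh' | max_per_prefix 2
```

With a threshold of 2 this should give the three lines asked for in post #5 (3 125678 3333, 3 237890 cccc, 2 451234 efgh). Raising the threshold to 3 drops the last one.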
This User Gave Thanks to vgersh99 For This Post:
# 7  
Old 03-14-2011
You are so correct! It was an ID10T error on my part. I used the wrong field, $3 instead of $1. Thanks again for your help.

--GOOD STUFF!--