The UNIX and Linux Forums  


#1  05-14-2008  dhanamurthy
awk - Counting number of similar lines

Hi All

I have the input file OMAK_11.

OMAK 000002EXCLUDE 1341
OMAK 000002EXCLUDE 1341
OMAK 000002EXCLUDE 1341
OMAK 000003EXCLUDE 1341
OMAK 000003EXCLUDE 1341
OMAK 000003EXCLUDE 1341
OMAK 000004EXCLUDE 1341
OMAK 000004EXCLUDE 1341
OMAK 000004EXCLUDE 1341
OMAK 000004EXCLUDE 1341
OMAK 000005EXCLUDE 1341
OMAK 000005EXCLUDE 1341
OMAK 000005EXCLUDE 1341

I want the output as

OMAK EXCLUDE 000002 3 1341
OMAK EXCLUDE 000003 3 1341
OMAK EXCLUDE 000004 4 1341
OMAK EXCLUDE 000005 3 1341



I have this program, which works well except for the last group, for which I get no output. It seems to have something to do with the END block of awk.

awk '{
  curr = substr($0, 1, 11)
  if (curr != prev && prev != "") {
    a = sprintf("%s %-50s %6s %-6s %s", substr(prev_0,1,5), substr(prev_0,12,29), substr(prev_0,6,6), count, substr(prev_0,41,4))
    print a
    count = 0
  }
  count++
  prev = curr
  prev_0 = $0
}
END {
  a = sprintf("%s %-50s %6s %-6s %s", substr($0,1,5), substr($0,12,29), substr($0,6,6), count, substr($0,41,4))
  print a
}' OMAK_11


Can anyone tell me how to fix this?

Regards
Dhana

#2  05-14-2008  Annihilannic
Incidentally, instead of a=sprintf(...); print a you can just use printf(...).

Maybe $0 is undefined when you reach the END block. If you change print a to print $0 in that last section, does it print the last line of input?
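
A minimal sketch of that suggestion, assuming the fixed-width layout implied by the substr() calls in the first post (columns 1-5, 6-11, 12-40 and 41-44). The function name count_groups is made up for illustration. It reports the last group from the saved prev_0 instead of $0, on the assumption that the awk in use does not keep $0 in the END block, and it calls printf directly:

```shell
# Sketch only: count_groups is a hypothetical name, not from the thread.
count_groups() {
  awk '
    {
      curr = substr($0, 1, 11)            # grouping key: columns 1-11
      if (curr != prev && prev != "") {
        # Key changed: report the group that just ended.
        printf "%s %-50s %6s %-6s %s\n", substr(prev_0, 1, 5), substr(prev_0, 12, 29), substr(prev_0, 6, 6), count, substr(prev_0, 41, 4)
        count = 0
      }
      count++
      prev = curr
      prev_0 = $0                         # remember the latest line of the group
    }
    END {
      # Report the final group from the saved line rather than $0.
      if (prev != "")
        printf "%s %-50s %6s %-6s %s\n", substr(prev_0, 1, 5), substr(prev_0, 12, 29), substr(prev_0, 6, 6), count, substr(prev_0, 41, 4)
    }
  ' "$@"
}
```

It would be called as count_groups OMAK_11. The guard on prev only avoids printing an empty record when the input has no lines.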

#3  05-14-2008  Annihilannic
By the way, you could also use uniq -c and rearrange the order of the output columns using awk.
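
A rough sketch of the uniq -c approach, assuming the lines are whitespace-separated exactly as shown in the first post, identical lines are adjacent (uniq only merges neighbours), and the code is always the first six characters of the second field. The function name count_with_uniq is invented for this example:

```shell
# Sketch only: count_with_uniq is a hypothetical name.
count_with_uniq() {
  # uniq -c prefixes each distinct line with its count, so the fields become:
  #   $1 = count, $2 = name, $3 = code followed by label, $4 = value
  uniq -c "$@" |
    awk '{ print $2, substr($3, 7), substr($3, 1, 6), $1, $4 }'
}
```

It would be called as count_with_uniq OMAK_11. If the real file is fixed-width with embedded spaces in the label, the field numbers above would not line up and substr() on the whole line would be needed instead.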

#4  05-14-2008  shamrock
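Code:
awk '{                             # use nawk where plain awk is the old version
  c[$0]++                          # count identical lines
  split($2, m, /[A-Z]+/)           # m[1] = digits before the letters, e.g. 000002
  split($2, n, /[0-9]+/)           # n[2] = letters after the digits, e.g. EXCLUDE
  a[$1" "n[2]" "m[1]]=c[$0]" "$3   # key: name, label, code; value: count and $3
} END {for(i in a) print i, a[i]}' file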
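
To see what the two split() calls produce, here is a small sketch that applies them to a single value read from standard input. The function name split_demo is made up for illustration. Splitting on letters leaves the digits in m[1]. Splitting on digits leaves an empty n[1], because the value starts with a separator, and the letters in n[2]:

```shell
# Sketch only: split_demo is a hypothetical name.
split_demo() {
  awk '{
    split($1, m, /[A-Z]+/)   # pieces between runs of capital letters
    split($1, n, /[0-9]+/)   # pieces between runs of digits
    print m[1], n[2]
  }'
}
```

For example, echo 000002EXCLUDE | split_demo prints the code and the label as two separate words. Note that the for (i in a) loop in the post above visits keys in an unspecified order, so the result may need to be piped through sort.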

#5  05-15-2008  dhanamurthy
awk - counting number of similar lines

Hi
Thanks for the information provided.
I read the source code that you provided. For example, I have the data below.

SIZEC000002EXCLUDE 1341
SIZEC000002EXCLUDE 1341
SIZEC000002EXCLUDE 1341
SIZEC000003EXCLUDE 1341
SIZEC000003EXCLUDE 1341
SIZEC000003EXCLUDE 1341
SIZEC000004EXCLUDE 1341
SIZEC000004EXCLUDE 1341
SIZEC000004EXCLUDE 1341
SIZEC000004EXCLUDE 1341
SIZEC000005EXCLUDE 1341
SIZEC000005EXCLUDE 1341
SIZEC000005EXCLUDE 1341

I have two questions.

a] What is the purpose of these statements when the input is the data above?

split($2, m, /[A-Z]+/)
split($2, n, /[0-9]+/)

Here $2 contains no letters. Is it necessary to have both m and n?


b] Suppose I have the data below:

SIZEC000004EXCLUDE 1380
SIZEC000004EXCLUDE 1382
SIZEC000005EXCLUDE 1340
SIZEC000005EXCLUDE 1341
SIZEC000005EXCLUDE 1342

I want to group the data like this:

SIZEC000004EXCLUDE 1380 1382
SIZEC000005EXCLUDE 1340 1341 1342

Does awk have any standard function to do this?

Regards
Dhana

#6  05-15-2008  Annihilannic
Use an array indexed by $1, and append $2 to it as you process each line.
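
A minimal sketch of that idea, assuming two whitespace-separated fields per line as in the sample data. The function name group_values and the array names are invented for this example. A second array records the order in which keys first appear, since for (k in vals) does not guarantee any order:

```shell
# Sketch only: group_values, vals and order are hypothetical names.
group_values() {
  awk '
    {
      if ($1 in vals) {
        vals[$1] = vals[$1] " " $2   # append to the existing group
      } else {
        vals[$1] = $2                # first value for this key
        order[++n] = $1              # remember first-seen order
      }
    }
    END {
      for (i = 1; i <= n; i++)
        print order[i], vals[order[i]]
    }
  ' "$@"
}
```

It would be called as group_values file. Unlike uniq, this does not require lines with the same key to be adjacent.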

#7  05-15-2008  dhanamurthy
Awk - Grouping Lines

Hi All

I have the input file as

INFOR00028114 GRAINS BAKERY 4000
INFOR00028114 GRAINS BAKERY 4000
INFOR00028114 GRAINS BAKERY 4000
INFOR0009183-RIVERS - IC 2672
INFOR0009183-RIVERS - IC 2672
INFOR0009183-RIVERS - IC 2672
INFOR0009183-RIVERS - IC 2671

I want the output like
BRAND 14 GRAINS BAKERY 000281 3 4000
BRAND 3-RIVERS - IC 000918 1 2671
BRAND 3-RIVERS - IC 000918 3 2672
BRAND 5 STAR 001972 2 3618



The layout is:
position 1-5 for NAME1
position 6-11 for NAME2
position 12-41 for NAME3
position 42-46 for NAME4

I wrote the logic below, but I am getting this output:
BRAND 14 GRAINS BAKERY 000281 3 4000
BRAND 3-RIVERS - IC 000918 1 2671
BRAND 5 STAR 001972 2 3618
which is not what I expected.

awk '{
  c[$0]++
  a = substr($0, 1, 5)
  b = substr($0, 12, 30)
  ff = substr($0, 6, 6)
  d = substr($0, 42, 4)
  j[a" "b" "ff] = c[$0]" "d
} END {for(i in j) print i, j[i]}' tes | sort

I am not sure what needs to be changed.
Can anyone help me?

Regards
Dhana
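
One possible reading of the problem, offered as a sketch and not as a confirmed answer from the thread: the array j is keyed on NAME1, NAME3 and NAME2 only, so lines that differ only in the last column (2672 and 2671 here) share a key, and whichever is processed last overwrites the other. Including that column in the key keeps them apart. The function name count_fixed_width and the variable names are invented for this example, and the column positions are taken from the substr() calls in the post:

```shell
# Sketch only: count_fixed_width and its variable names are hypothetical.
count_fixed_width() {
  awk '{
    name  = substr($0, 1, 5)
    code  = substr($0, 6, 6)
    desc  = substr($0, 12, 30)
    value = substr($0, 42, 4)
    head  = name " " desc " " code
    key   = head " " value           # the last column is part of the key
    count[key]++
    heads[key] = head
    values[key] = value
  }
  END {
    for (key in count)
      print heads[key], count[key], values[key]
  }' "$@" | sort
}
```

It would be called as count_fixed_width tes. With the key extended this way, the three 2672 lines and the single 2671 line are counted as two separate groups.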