Count the number of occurrences of strings based on column value


 
# 1  
Old 04-26-2011

Can anyone help me count the number of occurrences of strings based on column position? I have 300 files with a record length of 1000 characters, and I need to count how often each string found in character positions 213 to 219 occurs. Some strings may be unique and some may be repeated.
# 2  
Old 04-26-2011
Can you post an example -- a couple lines of input and desired output?
# 3  
Old 04-26-2011
Test data (Input data):

IN46236 TRE 317-691-741947558020901EN1HGCM82614A0001722004HONDA ACCORD U-S- 0001-01-012011-04-112011-04-11A12345FBurd Ford, Inc.
GA30214 TRE 000-000-000094536775208EN1HGCM72673A0013972003HONDA ACCORD U-S- 0001-01-012011-04-072011-04-07B12345LAllan Vigil Ford of Fayetteville, Inc.
CO80503 TRE 720-438-456709740713120ENJHLRE48569C0025272009HONDA CR-V 0001-01-012011-04-092011-04-08C23456ALongmont Ford
AZ85743 TRE 000-000-000058715232407ENJTEDW21AX600040012006TOYOTA HIGHLANDER 0001-01-012011-04-092011-04-09D78636CHolmes Tuttle Ford, Lincoln-Mercury
NH03055 TRE 603-867-465107757895420EN5J6YH28574L0059762004HONDA ELEMENT 0001-01-012011-04-122011-04-12A12345LBest Ford Lincoln
GA30540 TRE 706-635-589558241949120ENJTMZF33V29D0062112009TOYOTA RAV4 0001-01-012011-04-072011-04-07A12345DRonnie Thompson Ford-Mercury
CA92376 TRE 909-877-153353033524707ENKNAFE1624750071172007KIA SPECTRA5 0001-01-012011-04-122011-04-12C23456DCitrus Motors
AR71646 TRE 870-853-572147762329806EN1NXBU40E79Z0075622009TOYOTA COROLLA 0001-01-012011-04-072011-04-07A12345VRyburn Motor Company, Inc.
MN56071 TRE 952-686-146403434361620EN4A3AB76S66E0108782006MITSUBIGALANT 0001-01-012011-04-062011-04-06A12345XNew Prague Ford Lincoln Mercury
MD21286 TRE 410-337-722287552922909ENJTEBT14R1480109092004TOYOTA 4RUNNER 0001-01-012011-04-122011-04-12B12345LBob Davidson Ford Lincoln
FL33067 TRE 954-592-941240740545220ENJNKCV61EX9M0114092009INFINITG37 0001-01-012011-04-092011-04-09A12345LMaroone Ford of Margate
MI49224 TRE 614-307-460899043860620EN5J6RE4H35BL0119092011HONDA CR-V 0001-01-012011-04-012011-04-01A12345VAlbion Motors Ford, Inc.

Output:
Out of 12 lines, one string occurs 7 times and the rest occur 2 times or once, all at the same column position.
A12345 - 7
B12345 - 2
C23456 - 2
D78636 - 1
# 4  
Old 04-26-2011
It's still not very clear to me. But I'll give it a shot in the dark:
Code:
$ awk -F"-[0-9]+" '{s=gensub(/([0-9])[^0-9]*/,"\\1","g",$NF); cnt[s]++}END{for(i in cnt){print i " " cnt[i]|"sort"}}' input
A12345 7
B12345 2
C23456 2
D78636 1

The awk command works as follows (illustrated on the first line of input; note that gensub is a GNU awk extension).
It takes the last field, using a dash followed by one or more digits as the field separator:
Code:
IN46236 TRE 317-691-741947558020901EN1HGCM82614A0001722004HONDA ACCORD U-S- 0001-01-012011-04-112011-04-11A12345FBurd Ford, Inc.

Then, from that last field, it throws away everything after the last digit (here FBurd Ford, Inc., leaving A12345)
Code:
A12345FBurd Ford, Inc.

and counts the occurrences of each resulting string.
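For awk implementations without gensub, the same idea can be sketched with the standard sub function. This is only an approximation of the command above: it strips the trailing non-digits from the last field, which matches the gensub version when the last field contains a single run of digits, as in the sample data. The function name count_last_code is made up for illustration.

```shell
# Hypothetical helper: count the code found at the start of the last
# "-digits"-separated field, using only POSIX awk features.
count_last_code() {
    awk -F'-[0-9]+' '
        {
            s = $NF                  # last field, e.g. "A12345FBurd Ford, Inc."
            sub(/[^0-9]*$/, "", s)   # drop everything after the last digit
            if (s != "") cnt[s]++    # skip lines that yield an empty key
        }
        END { for (k in cnt) print k, cnt[k] }
    ' "$@" | sort
}
# Usage (file name taken from the command above): count_last_code input
```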
# 5  
Old 04-26-2011
Thanks for your reply. Since each record is 1000 characters long, I did not post the entire line. Numeric values also occur after positions 213-219. My requirement is to fetch the string in positions 213 to 219 from all 300 files and count the number of occurrences of each.
For example: cut -c 213-219 filename, then count the occurrences.
# 6  
Old 04-26-2011
Code:
cut -c213-219 yourfile* | sort | uniq -c

?
# 7  
Old 04-26-2011
uniq is not required. We need to find the counts (occurrences of the strings based on column value across multiple files).
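For what it's worth, in the pipeline above it is uniq -c that produces the counts. If a version without uniq is preferred, a single awk pass over all files might look like the sketch below. The function name count_cols and its arguments are made up for illustration, and lines too short to contain the requested columns are skipped, which is an assumption about the desired behaviour.

```shell
# Hypothetical helper: count_cols START LENGTH FILE...
# Tallies the LENGTH-character string beginning at column START
# across every file given, then prints "string count" sorted by string.
count_cols() {
    start=$1
    len=$2
    shift 2
    awk -v s="$start" -v n="$len" '
        length($0) >= s + n - 1 { cnt[substr($0, s, n)]++ }  # skip short lines
        END { for (k in cnt) print k, cnt[k] }
    ' "$@" | sort
}
# Usage for positions 213-219 (7 characters): count_cols 213 7 yourfile*
```

Because all files are passed to one awk process, the counts are totals across the whole set rather than per file.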