Count the repetition of a Field in File

08-16-2009

Hi,
Thanks for keeping such a helpful platform active.
I am new to this forum and to Unix as well.
I want to know how to count the repetitions of a field in a file. Solutions in awk, sed, perl, or shell script are all welcome.

Input File------------------
abc,12345
pqr,51223
mno,72121
stu,34567
aaa,12345
pqp,11224
plm,72121
zxy,88888
fgh,12345
jkl,88888

Output File-----------------
abc,12345,3
pqr,51223,1
mno,72121,2
stu,34567,1
aaa,12345,3
pqp,11224,1
plm,72121,2
zxy,88888,2
fgh,12345,3
jkl,88888,2

Since 12345 appears 3 times as the second field in the file, "3" is appended as the last field on every line that contains it.
Thanks in advance for the solution.

Ace

indian.ace

08-16-2009

Here is what I have so far. You will surely get other replies that do the same thing more simply.

Code:

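#!/bin/sh
# Needs a POSIX shell (ksh, bash, ...); the legacy Solaris /bin/sh
# lacks $(...) and ${var##pattern}.

# Count each value of field 2; tmp then holds "count value" lines.
# Using cut instead of uniq -s4 avoids assuming a 3-character first field.
cut -d',' -f2 file | sort | uniq -c > tmp

# Re-read the input in its original order and append the matching count.
while IFS= read -r line; do
  key=${line##*,}
  echo "$line,$(grep " $key\$" tmp | awk '{print $1}')"
done < file

exit 0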

Your data file needs to be named file and must be in the same directory as the script.
I use a tmp file to hold the number of occurrences of the second field.

tukuyomi

08-16-2009

Ok, another one

Code:

awk -F, 'NR==FNR{a[$2]++;next}{print $0 "," a[$2]}' file file

Regards

Franklin52
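
For readers unfamiliar with the idiom in Franklin52's one-liner: the file name is given twice, so awk reads the file in two passes. While NR==FNR holds (the first pass) it only tallies the second field; on the second pass it prints each line with its tally appended. Below is a minimal sketch of the same logic wrapped in a shell function. The function name count_field and the temporary file are illustrative assumptions, not part of the original post, and the sketch assumes a POSIX shell and an awk that supports FNR.

```shell
#!/bin/sh
# Sketch: annotate each line with how often its second field occurs.
# count_field is a hypothetical helper name, not something from the thread.
count_field() {
  # The same file is passed twice so that awk reads it in two passes.
  awk -F, '
    NR == FNR { count[$2]++; next }   # pass 1: tally field 2, print nothing
    { print $0 "," count[$2] }        # pass 2: append the tally to each line
  ' "$1" "$1"
}

# Usage with the sample data from the first post, written to a temp file.
sample=$(mktemp)
printf '%s\n' abc,12345 pqr,51223 mno,72121 stu,34567 aaa,12345 \
              pqp,11224 plm,72121 zxy,88888 fgh,12345 jkl,88888 > "$sample"
count_field "$sample"
rm -f "$sample"
```

With an awk that implements FNR, this should reproduce the Output File listing from the first post. Reading the file twice keeps memory use low, at the cost of requiring a real file rather than a pipe.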

08-17-2009

How about the Perl below:

Code:

use strict;
use warnings;

my (@lines, %cnt);
while (my $line = <DATA>) {
	chomp $line;
	my (undef, $key) = split /,/, $line;	# second field is the key
	push @lines, $line;			# keep input order, duplicates included
	$cnt{$key}++;
}
# Print every line followed by the count of its second field.
print "$_,", $cnt{ (split /,/)[1] }, "\n" for @lines;
__DATA__
abc,12345
pqr,51223
mno,72121
stu,34567
aaa,12345
pqp,11224
plm,72121
zxy,88888
fgh,12345
jkl,88888

summer_cherry

08-17-2009

Hi Tukuyomi,
Thanks for the solution, but the result deviates from what I expected and drops some of the input lines. The output was:
1 pqp,11224
3 aaa,12345
1 stu,34567
1 pqr,51223
2 mno,72121
2 jkl,88888

Can you please amend it, if possible?

---------- Post updated at 03:40 AM ---------- Previous update was at 03:31 AM ----------

Hi Frank,
This awk script does not produce the expected output. It just prints the same lines as the input, with only a "," field separator added at the end. Could you please correct it?

indian.ace

08-17-2009

This is what I get:

Code:

$ cat file
abc,12345
pqr,51223
mno,72121
stu,34567
aaa,12345
pqp,11224
plm,72121
zxy,88888
fgh,12345
jkl,88888
$ awk -F, 'NR==FNR{a[$2]++;next}{print $0 "," a[$2]}' file file
abc,12345,3
pqr,51223,1
mno,72121,2
stu,34567,1
aaa,12345,3
pqp,11224,1
plm,72121,2
zxy,88888,2
fgh,12345,3
jkl,88888,2

Am I missing something?

Franklin52

08-17-2009

Franklin, this is what I am getting. Since you know much more about this, you may be able to tell whether I am doing something wrong. My OS is Solaris 10.
root@sunmc01>cat file
abc,12345
pqr,51223
mno,72121
stu,34567
aaa,12345
pqp,11224
plm,72121
zxy,88888
fgh,12345
jkl,88888
root@sunmc01>awk -F, 'NR==FNR{a[$2]++;next}{print $0 "," a[$2]}' file file
abc,12345,
pqr,51223,
mno,72121,
stu,34567,
aaa,12345,
pqp,11224,
plm,72121,
zxy,88888,
fgh,12345,
jkl,88888,
abc,12345,
pqr,51223,
mno,72121,
stu,34567,
aaa,12345,
pqp,11224,
plm,72121,
zxy,88888,
fgh,12345,
jkl,88888,
root@sunmc01>
Thanks for your consistent support.

indian.ace
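
The output above, with every line printed twice and an empty count, is consistent with the legacy /usr/bin/awk on Solaris, which predates the FNR variable: if FNR is never set, NR==FNR is never true, so nothing is tallied and both passes are printed. This is a likely explanation rather than something confirmed in the thread. Solaris also ships nawk and /usr/xpg4/bin/awk, which do implement FNR. The sketch below picks one of those when available and, as a second safeguard, uses a single-pass variant that does not need FNR at all. The helper names pick_awk and count_field_one_pass are hypothetical, and a POSIX shell is assumed (not the legacy Solaris /bin/sh).

```shell
#!/bin/sh
# Sketch: single-pass variant that avoids FNR entirely.
# pick_awk and count_field_one_pass are hypothetical helper names.

# Prefer a POSIX awk where one is installed (usual Solaris locations first).
pick_awk() {
  if [ -x /usr/xpg4/bin/awk ]; then
    echo /usr/xpg4/bin/awk
  elif command -v nawk >/dev/null 2>&1; then
    echo nawk
  else
    echo awk
  fi
}

count_field_one_pass() {
  "$(pick_awk)" -F, '
    # Remember every line and its key, and tally field 2 as we go.
    { n = NR; line[n] = $0; key[n] = $2; count[$2]++ }
    # After the last line, replay the input in order with the counts appended.
    END { for (i = 1; i <= n; i++) print line[i] "," count[key[i]] }
  ' "$1"
}

# Usage: count_field_one_pass file
```

Because the whole input is held in memory until END, this variant suits small to medium files; for very large files the two-pass form run under nawk or /usr/xpg4/bin/awk is the lighter choice.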
