Grouping and counting


 
# 8  
Old 08-15-2016
Quote:
Originally Posted by Nik44
Thanks Ravinder.
The command executes without any issues; however, it doesn't seem to have altered the file (it is identical to how it was before the command). Is this perhaps because my data includes literal double quotes?
I tried escaping them, but it doesn't seem to make any difference.
Hello Nik44,

You can thank a poster by hitting the THANKS button at the left of each post.
The awk command above does not write its output back into Input_file. To update the file itself, redirect the output to a temporary file and then move that over the original:
Code:
awk -F"|" 'FNR==NR{gsub(/\"/,X,$0);if($4==1 && $5=="Y"){A[$NF]++};next} ($NF in A){print $NF, A[$NF];delete A[$NF]}'  Input_file  Input_file > tmp_file
mv tmp_file  Input_file

Thanks,
R. Singh
# 9  
Old 08-16-2016
Quote:
Originally Posted by Nik44
Is there a way of refining this output further so that it will only count a record of that group where field four is "1" and where field five is "Y"?
Since you like the solution using 'cut': You could use grep before doing the cut, to return only the lines with the desired properties.
# 10  
Old 08-16-2016
Hi.
Quote:
Originally Posted by Nik44
Thanks for the responses! I've gone for the more straightforward cut option as it provides the desired output.

However I forgot to add to this question... Is there a way of refining this output further so that it will only count a record of that group where field four is "1" and where field five is "Y"? I realise in the example supplied it would still show the exact same count.
You're welcome.

My rule of thumb is that when a specific content of a specific field needs to be examined, manipulated, etc., then I reach for awk first (perl second) because the field-separating facilities are very good.

If you can guarantee that the contents of fields 4 and 5 cannot also appear in any other field of the line, then you may be able to use the suggestion from rovf to use grep, because grep considers the content of the entire line without regard to fields. Otherwise, an awk solution seems like the best approach.
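
As a minimal sketch of the field-based awk approach: the sample rows below are made up in the thread's format, and the temporary file is only there to make the example self-contained. Because the double quotes are part of the field contents, they are matched explicitly.

```shell
# Hypothetical sample data in the thread's pipe-separated format.
sample=$(mktemp)
cat > "$sample" <<'EOF'
1234|"ZZZ"|"Date"|"1"|"Y"|"ABC"|""|AA
ABCD|"ZZZ"|"Date"|"2"|"Y"|"ABC"|""|AA
EFGH|"ZZZ"|"Date"|"1"|"N"|"ABC"|""|CC
IJKLM|"ZZZ"|"Date"|"1"|"Y"|"ABC"|""|CC
NOPQ|"ZZZ"|"Date"|"1"|"Y"|"ABC"|""|CC
EOF

# Count per last field, but only for rows where field 4 is "1" and
# field 5 is "Y" (quotes included, since they are literal in the data).
counts=$(awk -F'|' '$4 == "\"1\"" && $5 == "\"Y\"" { c[$NF]++ }
                    END { for (k in c) print k, c[k] }' "$sample" | sort)
printf '%s\n' "$counts"
rm -f "$sample"
```

The trailing sort is only there because awk's `for (k in c)` loop returns keys in an unspecified order.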

Best wishes ... cheers, drl
# 11  
Old 08-16-2016
To retain order you may try this

Code:
[akshay@localhost tmp]$ cat file
1234|"ZZZ"|"Date"|"1"|"Y"|"ABC"|""|AA
ABCD|"ZZZ"|"Date"|"1"|"Y"|"ABC"|""|AA
EFGH|"ZZZ"|"Date"|"1"|"Y"|"ABC"|""|CC
IJKLM|"ZZZ"|"Date"|"1"|"Y"|"ABC"|""|CC
EFGH|"ZZZ"|"Date"|"1"|"Y"|"ABC"|""|BB
IJKLM|"ZZZ"|"Date"|"1"|"Y"|"ABC"|""|BB
NOPQ|"ZZZ"|"Date"|"1"|"Y"|"ABC"|""|BB

Code:
[akshay@localhost tmp]$ awk -F\| '!($NF in c){d[++q]=$NF}{c[$NF]++}END{for(i=1; i in d; i++)print d[i],c[d[i]]}' file
AA 2
CC 2
BB 3

# 12  
Old 08-16-2016
Quote:
Originally Posted by drl
If you can guarantee that the contents of fields 4 and 5 cannot also appear in any other field of the line, then you may be able to use the suggestion from rovf to use grep, because grep considers the content of the entire line without regard to fields.
Given that the field *separators* are unique, we can use grep even if the same field contents already occur earlier in the line. It's just that we have to use extended regular expressions, and anchor our search at the start of the line.

Something like

Code:
grep -E '^([^|]+[|]){3}.1...Y'  .....
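
A fuller sketch of the same idea, with some assumptions: the sample rows are made up, and the `cut | sort | uniq -c` part is only a guess at the kind of pipeline used earlier in the thread. The pattern here is a stricter variant of the one above: `[^|]*` also allows empty leading fields, and the quotes and the separator are spelled out instead of matched with dots.

```shell
# Hypothetical sample data; row 2 has "1" and "Y" in fields 2 and 3,
# which must NOT count as a match.
sample=$(mktemp)
cat > "$sample" <<'EOF'
1234|"ZZZ"|"Date"|"1"|"Y"|"ABC"|""|AA
ABCD|"1"|"Y"|"2"|"N"|"ABC"|""|AA
EFGH|"ZZZ"|"Date"|"1"|"Y"|"ABC"|""|CC
IJKLM|"ZZZ"|"Date"|"1"|"N"|"ABC"|""|CC
EOF

# Skip exactly three fields from the start of the line, then require
# "1" in field 4 and "Y" in field 5. Count the survivors by field 8.
matched=$(grep -E '^([^|]*[|]){3}"1"[|]"Y"[|]' "$sample" |
          cut -d'|' -f8 | sort | uniq -c)
printf '%s\n' "$matched"
rm -f "$sample"
```

Because `[^|]*` cannot cross a separator, the anchored group consumes exactly fields 1 to 3, so the test really is tied to fields 4 and 5.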

# 13  
Old 08-16-2016
You should do a forum search before opening a thread. Exactly your problem (and some of the suggested solutions) was discussed at length in this thread. It is called a "control break".
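
For readers who have not met the term, a minimal sketch of a control break in awk follows. It assumes that records with the same key are already adjacent (sort first if they are not), and it uses the sample data posted earlier in the thread. The temporary file is only there to keep the example self-contained.

```shell
# Sample data from earlier in the thread; equal keys are adjacent.
sample=$(mktemp)
cat > "$sample" <<'EOF'
1234|"ZZZ"|"Date"|"1"|"Y"|"ABC"|""|AA
ABCD|"ZZZ"|"Date"|"1"|"Y"|"ABC"|""|AA
EFGH|"ZZZ"|"Date"|"1"|"Y"|"ABC"|""|CC
IJKLM|"ZZZ"|"Date"|"1"|"Y"|"ABC"|""|CC
EFGH|"ZZZ"|"Date"|"1"|"Y"|"ABC"|""|BB
IJKLM|"ZZZ"|"Date"|"1"|"Y"|"ABC"|""|BB
NOPQ|"ZZZ"|"Date"|"1"|"Y"|"ABC"|""|BB
EOF

# Control break: when the key (last field) changes, emit the finished
# group and reset the counter; flush the final group at END.
result=$(awk -F'|' '
    NR > 1 && $NF != prev { print prev, n; n = 0 }
    { prev = $NF; n++ }
    END { if (NR > 0) print prev, n }
' "$sample")
printf '%s\n' "$result"
rm -f "$sample"
```

Unlike the array-based solutions, this keeps only one key and one counter in memory, and it preserves the input order of the groups.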

I hope this helps.

bakunin
# 14  
Old 08-17-2016
Hi.

I have posted a comparison of the grep and awk approaches that rovf and I briefly discussed here. Because it is not strictly on-point here, I have created a new thread: "One instance of comparing grep and awk".

cheers, drl