Grouping and counting


 
# 15  
08-17-2016
A single awk command may be faster than a grep | cut | sort | uniq pipeline.
The field separator (FS, or the -F option) is an extended regular expression (ERE), so it can include an optional " on each side of the |.
Code:
file="inputfile"
# Count each distinct last field among records where field 4 is 1 and field 5 is Y
awk -F '"?[|]"?' '($4==1 && $5=="Y") { A[$NF]++ } END { for (i in A) print i, A[i] }' "$file" > "$file.new" &&
mv "$file.new" "$file"

awk writes its result to a new file. If awk succeeds, the mv command replaces the input file with it.
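
To see what the field separator and the counting do, here is a minimal sketch that runs the same awk program on made-up data. The sample records, the temporary file, and the result variable are assumptions for illustration only and are not taken from the original input file. The sort is added only because the order of for (i in A) is not defined.

```shell
# Hypothetical sample data: six |-separated fields, some wrapped in double quotes.
file=$(mktemp)
cat > "$file" <<'EOF'
"a"|"b"|"c"|"1"|"Y"|red
a|b|c|1|Y|red
"a"|"b"|"c"|"2"|"Y"|blue
a|b|c|1|N|green
a|b|c|1|Y|blue
EOF

# Same awk program as in the post: count the last field of every record
# whose 4th field is 1 and whose 5th field is Y.
# sort makes the output order predictable.
result=$(awk -F '"?[|]"?' '($4==1 && $5=="Y") { A[$NF]++ } END { for (i in A) print i, A[i] }' "$file" | sort)
printf '%s\n' "$result"

# Remove the temporary sample file.
rm -f "$file"
```

With this sample the quoted and unquoted records are treated alike, so the output is "blue 1" followed by "red 2". Note that the separator only strips quotes next to a |, so a " at the very start or very end of a line stays in the first or last field.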

Last edited by MadeInGermany; 08-17-2016 at 03:22 PM.. Reason: colored