Grouping and counting


 
Thread Tools Search this Thread
Top Forums Shell Programming and Scripting Grouping and counting
# 15  
Old 08-17-2016
One awk command might be faster than a grep | cut | sort | uniq command chain.
The FS or -F is an ERE, so one can put two optional " in it.
Code:
file="inputfile"
awk -F '"?[|]"?' '($4==1 && $5=="Y") { A[$NF]++ } END { for (i in A) print i, A[i] }' $file > $file.new &&
mv $file.new $file

awk produces a new file. If sucessful the mv command replaces the input file with it.

Last edited by MadeInGermany; 08-17-2016 at 03:22 PM.. Reason: colored
Login or Register to Ask a Question

Previous Thread | Next Thread

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Help with grouping and zipping

Hi can you please help with the below ? source file: Column1,Column2,Column3,Column4 abc,123,dir1/FXX/F19,1 abc,123,dir1/FXX/F20,1 abc,123,dir1/FXX/F23,2 abc,123,dir1/FXX/C25,2 abc,123,dir1/FXX/X25,2 abc,123,dir1/FXX/A23,3 abc,123,dir1/FXX/Z25,3 abc,123,dir1/FXX/Y25,4 I want to... (3 Replies)
Discussion started by: paul1234
3 Replies

2. Shell Programming and Scripting

Grouping and Calculating

Hi All, I want to read the input file and store the output in the Output file. I pasted the sample Input and Output file below. Help me with this. Input file ================================= ITEM1 AAAAA 1 ITEM1 BBBBB 1 ITEM1 CCCCC 1 ITEM2 AAAAA 5 ITEM2 CCCCC 4... (1 Reply)
Discussion started by: humaemo
1 Replies

3. Shell Programming and Scripting

Name grouping

awk 'FNR==NR {a; next} $NF in a' genes.txt refseq_exons.txt > output.txt I can not figure out how to group the same name in $4 together. Basically, all the SKI together in separate rows and all the TGFB2. Thank you :). chr1 2160133 2161174 SKI chr1 218518675 218520389 TGFB2... (1 Reply)
Discussion started by: cmccabe
1 Replies

4. Shell Programming and Scripting

UNIX grouping

Hi guys, I am a complete newbie to unix and have been tasked with creating a script to group the following data (file) by hourly slots so that I can count the transactions completed within the peak hour. I am not sure how to group data like this in unix. Can anyone please help? Here is an... (1 Reply)
Discussion started by: MrMidas
1 Replies

5. Shell Programming and Scripting

Grouping

Hi all, I am using following command: perl program.pl input.txt output.txt CUTOFF 3 > groups_3.txt containing program.pl, two files (input.txt, output.txt) and getting output in groups_3.txt: But, I wish to have 30 files corresponding to each CUTOFF ranging from 0 to 30 using the same... (1 Reply)
Discussion started by: bioinfo
1 Replies

6. UNIX for Dummies Questions & Answers

Grouping in grep

How do you do grouping in grep? Here's how I tried it at first: egrep 'qualit(y|ies)' /usr/share/dict/words -bash: syntax error near unexpected token `(' I'm using GNUgrep, and I found this on their site. grep regular expression syntax So I tried this: egrep 'qualit\(y\|ies\)'... (2 Replies)
Discussion started by: sudon't
2 Replies

7. Shell Programming and Scripting

Selective grouping

I have a text file in this format. Group: AAA Notes: IP : 11.11.11.11 #User xxxxxxxxx #Password aaaaaaaaaaaaaaaa Group: AAA Notes: IP : 11.11.11.22 #User yyyyyyyyyyyyy #Password bbbbbbbbbbbbb (8 Replies)
Discussion started by: anil510
8 Replies

8. UNIX for Advanced & Expert Users

grouping lines

Hi all, I have input lines like below: A;100;Paris;City;10;0;0 A;100;Paris;City;0;10;0 A;100;Paris;City in Europe;0;0;20 B;101;London;City;20;0;0 B;101;London;City;0;20;0 B;101;London;City in Europe;0;0;40 I need to group the above lines to: A;100;Paris;City in Europe;10;10;20... (4 Replies)
Discussion started by: andy2000
4 Replies

9. UNIX for Dummies Questions & Answers

Help with data grouping

Hi all, I have a set data as shown below, and i would like to eliminate the name that no children - boy and girl. What is the appropriate command can i use(other than grep)? Please assist... My input: name sex marital status children - boy children - girl ... (3 Replies)
Discussion started by: 793589
3 Replies

10. Shell Programming and Scripting

egrep and grouping

i am using the c shell on solaris. directories i'm working with: ls -1d DIV* DIV_dental/ DIV_ibc/ DIV_ifc/ DIV_index/ DIV_pharm/ DIV_sectionI/ DIV_sectionI-title/ DIV_sectionI-toc/ DIV_sectionII-title/ DIV_sectionII-toc/ DIV_standing/ DIV_standing-toc/ DIV_title/ DIV_vision/ (1 Reply)
Discussion started by: effigy
1 Replies
Login or Register to Ask a Question
DUMP(5) 							File Formats Manual							   DUMP(5)

NAME
dump, ddate - incremental dump format SYNOPSIS
#include <sys/types.h> #include <sys/ino.h> # include <dumprestor.h> DESCRIPTION
Tapes used by dump and restor(1) contain: a header record two groups of bit map records a group of records describing directories a group of records describing files The format of the header record and of the first record of each description as given in the include file <dumprestor.h> is: NTREC is the number of 512 byte records in a physical tape block. MLEN is the number of bits in a bit map word. MSIZ is the number of bit map words. The TS_ entries are used in the c_type field to indicate what sort of header this is. The types and their meanings are as follows: TS_TAPE Tape volume label TS_INODE A file or directory follows. The c_dinode field is a copy of the disk inode and contains bits telling what sort of file this is. TS_BITS A bit map follows. This bit map has a one bit for each inode that was dumped. TS_ADDR A subrecord of a file description. See c_addr below. TS_END End of tape record. TS_CLRI A bit map follows. This bit map contains a zero bit for all inodes that were empty on the file system when dumped. MAGIC All header records have this number in c_magic. CHECKSUM Header records checksum to this value. The fields of the header structure are as follows: c_type The type of the header. c_date The date the dump was taken. c_ddate The date the file system was dumped from. c_volume The current volume number of the dump. c_tapea The current number of this (512-byte) record. c_inumber The number of the inode being dumped if this is of type TS_INODE. c_magic This contains the value MAGIC above, truncated as needed. c_checksum This contains whatever value is needed to make the record sum to CHECKSUM. c_dinode This is a copy of the inode as it appears on the file system; see filsys(5). c_count The count of characters in c_addr. c_addr An array of characters describing the blocks of the dumped file. A character is zero if the block associated with that character was not present on the file system, otherwise the character is non-zero. If the block was not present on the file system, no block was dumped; the block will be restored as a hole in the file. If there is not sufficient space in this record to describe all of the blocks in a file, TS_ADDR records will be scattered through the file, each one picking up where the last left off. Each volume except the last ends with a tapemark (read as an end of file). The last volume ends with a TS_END record and then the tape- mark. The structure idates describes an entry of the file /etc/ddate where dump history is kept. The fields of the structure are: id_name The dumped filesystem is `/dev/id_nam'. id_incno The level number of the dump tape; see dump(1). id_ddate The date of the incremental dump in system format see types(5). FILES
/etc/ddate SEE ALSO
dump(1), dumpdir(1), restor(1), filsys(5), types(5) DUMP(5)