awk to count occurrence of strings and loop for multiple columns Post: 302919158


Post 302919158 by Don Cragun on Sunday 28th of September 2014 05:42:51 PM


If I correctly understand what you're trying to do, maybe the following is closer to what you want:

Code:

awk '
NR == 1 {
        # We have the 1st line in the input file...  Print output header...
        for(i = 2; i <= NF; i++)
                printf("%s count %s%s", $i, $1, i == NF ? "\n" : " ")
        # Save # of fields for use in END clause.
        nf = NF
        next
}
{       # For each remaining line in the input file...  Accumulate data...
        d[NR, 1] = $1
        for(i = 2; i <= NF; i++) {
                d[NR, i] = $i
                if(!((i, $i) in vc)) {
                        # Add new value for this field...
                        v[i, ++nv[i]] = $i
                        # If the # of distinct values seen in this field is
                        # greater than the # of output rows to be produced,
                        # increment the # of output rows.
                        if(nv[i] > orows)
                                orows++
                }
                # Increment count of occurrences of this value in this field.
                vc[i, $i]++
        }
}
END {   # Process accumulated data and print results...
        # For each output row to be printed...
        for(row = 1; row <= orows; row++) {
                # For each set of 3 columns to be printed for this row...
                for(field = 2; field <= nf; field++) {
                        if(row > nv[field]) {
                                # No data for this field for this row.
                                # Print dashes...
                                printf("- - -%s", field == nf ? "\n" : " ")
                                # Skip to next field.
                                continue
                        }
                        # Print field value and count columns for this set.
                        printf("%s %d ", v[field, row],
                                vc[field, v[field, row]])
                        # Print Column 1 values list for this set.
                        oc = 0
                        for(line = 2; oc < vc[field, v[field, row]]; line++) {
                                # If the data in this input line for this field
                                # is the string we are looking for, print the
                                # field 1 value for this line...
                                if(d[line, field] == v[field, row])
                                        printf("%s%s", d[line, 1],
                                                ++oc < vc[field, v[field, row]]\
                                                ? "," : \
                                                (field == nf ? "\n" : " "))
                        }
                }
        }
}' file

With the following in file:

Code:

col1 col2 col3 col4 col5 col6 col7
1 A C E A XX AA
3 D E G A XX AG
5 T T A A XX AC
6 D C A G XX AT
100 E C A A XX CA
5002 E G G G XX CC
99999 A C E A AB CG

it produces the output:

Code:

col2 count col1 col3 count col1 col4 count col1 col5 count col1 col6 count col1 col7 count col1
A 2 1,99999 C 4 1,6,100,99999 E 2 1,99999 A 5 1,3,5,100,99999 XX 6 1,3,5,6,100,5002 AA 1 1
D 2 3,6 E 1 3 G 2 3,5002 G 2 6,5002 AB 1 99999 AG 1 3
T 1 5 T 1 5 A 3 5,6,100 - - - - - - AC 1 5
E 2 100,5002 G 1 5002 - - - - - - - - - AT 1 6
- - - - - - - - - - - - - - - CA 1 100
- - - - - - - - - - - - - - - CC 1 5002
- - - - - - - - - - - - - - - CG 1 99999

Is something like this what you want?
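
If the full report is more than you need, the counting step on its own can be sketched as below. This is only a minimal sketch, not a replacement for the script above: the function name count_fields, the file name sample.txt, and the "field N:" output layout are made up for illustration, and the sample data is just the first few rows and columns of the input shown earlier. It uses the same composite subscript (field number, field value) and the same first-seen ordering arrays (v, nv, vc) as the script above.

```shell
# Hypothetical helper (the name is an assumption made for this sketch).
# It reads the named files, or standard input when no file is given.
count_fields() {
        awk '
        NR == 1 {       # Header line: remember the # of fields and skip it.
                nf = NF
                next
        }
        {       # (i, $i) is a composite subscript: field number and value.
                for(i = 2; i <= NF; i++) {
                        if(!((i, $i) in vc))
                                v[i, ++nv[i]] = $i      # Keep first-seen order.
                        vc[i, $i]++
                }
        }
        END {   # One output line per distinct value in each field.
                for(i = 2; i <= nf; i++)
                        for(j = 1; j <= nv[i]; j++)
                                printf("field %d: %s %d\n", i, v[i, j],
                                        vc[i, v[i, j]])
        }' "$@"
}

# Hypothetical sample file holding a subset of the input shown earlier.
cat > sample.txt <<'EOF'
col1 col2 col3
1 A C
3 D E
5 T T
6 D C
EOF

count_fields sample.txt
```

With that sample file, the last command prints:

field 2: A 1
field 2: D 2
field 2: T 1
field 3: C 2
field 3: E 1
field 3: T 1

Keeping the values in v[] in the order they are first seen avoids depending on the unspecified order of a for(key in array) loop.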


