Number of matches and matched pattern(s) in awk
Post 302963331 by durden_tyler on Monday 28th of December 2015 03:17:51 PM
Code:
$ 
$ cat f36
!@#$%2QW5QWERTAB$%^&*
!@#$%2QW7QWERTABXY3PQR$%^Z&*LMn#O
$%$6ABcdEf)-2yg*%/LK@~~()
3BHuI4RtYU2vGP
!@#$%4AvDf2QWER
##AAAABBBCCD##2RTC##=3XYZ##?4WWWW##3PQaR##=3XYZ#
##AAAA#BBBvCCD##
##AA#Bv#CC#4RSTUV#
#?2AA#3XYZvN#4PQrsN#3XYZ=2wq#
!^#=3AAAB$$?2CCR^&?2DD*=4EEEEY()
$ 
$ cat f36_v3.awk
function fetch_num(s,    l, i, num) {
    # This function returns the number at the start of a string.
    # If "2ABCD" is passed, then 2 is returned.
    # If "23XYZW" is passed, then 23 is returned.
    # l, i and num are declared as extra parameters to keep them local.
    l = length(s);
    num = "";
    i = 0;
    while (++i <= l && substr(s,i,1) ~ /[2-9]/) {
        num = num substr(s,i,1);
    }
    return num;
}
function join(a, kvsep, arrsep,    i, iter, result) {
    # This function joins all elements of an associative array with arrsep.
    # Each element is rendered as value, kvsep, key (for example "2;XYZ").
    # a = associative array
    # kvsep = separator between a value and its key
    # arrsep = separator between array elements
    # Note: the order of "for (i in a)" depends on the awk implementation.
    iter = 1;
    result = "";
    for (i in a) {
       if (iter == 1) { result = a[i] kvsep i; }
       else { result = result arrsep a[i] kvsep i; }
       iter++;
    }
    return result;
}
{   # There are 4 associative arrays: s, m, q, e
    # s => to store number of occurrences of single character patterns
    # m => to store number of occurrences of multi-character patterns
    # q => to store number of occurrences of patterns that follow "?" character
    # e => to store number of occurrences of patterns that follow "=" character
    str = $0;
    ind = 1;
    len = length(str);
    printf("Input  : %s\n", str);
    while (ind <= len) {
        ch = substr(str, ind, 1);
        if (ch == "?" && substr(str,ind+1,1) ~ /[2-9]/) { # Pattern following "?"
            n = fetch_num(substr(str, ind+1));
            d = length(n);              # digits in the length prefix
            q[substr(str, ind+1+d, n)]++;
            ind += 1 + d + n;
        } else if (ch == "=" && substr(str,ind+1,1) ~ /[2-9]/) { # Pattern following "="
            n = fetch_num(substr(str, ind+1));
            d = length(n);
            e[substr(str, ind+1+d, n)]++;
            ind += 1 + d + n;
        } else if (ch ~ /[2-9]/) { # Multi-character pattern
            n = fetch_num(substr(str, ind));
            d = length(n);
            m[substr(str, ind+d, n)]++;
            ind += d + n;
        } else if (ch ~ /[A-Za-z]/) { # Single-character pattern
            s[ch]++;
            ind++;
        } else {
            ind++;
        }
    }
    # Assign unconditionally: an empty array must yield an empty string,
    # not a value left over from a previous record.
    s_str = join(s,";"," ");
    m_str = join(m,";","/");
    q_str = join(q,";","|");
    e_str = join(e,";","|");
    printf("Output : %s => %s %s %s %s\n", str, s_str, m_str, q_str, e_str);
    printf("\n");
    # Flush all arrays and start over again
    split("",s);
    split("",m);
    split("",q);
    split("",e);
}

$ 
$ awk -f f36_v3.awk f36
Input  : !@#$%2QW5QWERTAB$%^&*
Output : !@#$%2QW5QWERTAB$%^&* => 1;A 1;B 1;QWERT/1;QW  

Input  : !@#$%2QW7QWERTABXY3PQR$%^Z&*LMn#O
Output : !@#$%2QW7QWERTABXY3PQR$%^Z&*LMn#O => 1;O 1;n 1;X 1;L 1;Y 1;M 1;Z 1;QWERTAB/1;PQR/1;QW  

Input  : $%$6ABcdEf)-2yg*%/LK@~~()
Output : $%$6ABcdEf)-2yg*%/LK@~~() => 1;K 1;L 1;yg/1;ABcdEf  

Input  : 3BHuI4RtYU2vGP
Output : 3BHuI4RtYU2vGP => 1;P 1;I 1;RtYU/1;vG/1;BHu  

Input  : !@#$%4AvDf2QWER
Output : !@#$%4AvDf2QWER => 1;R 1;E 1;AvDf/1;QW  

Input  : ##AAAABBBCCD##2RTC##=3XYZ##?4WWWW##3PQaR##=3XYZ#
Output : ##AAAABBBCCD##2RTC##=3XYZ##?4WWWW##3PQaR##=3XYZ# => 4;A 3;B 3;C 1;D 1;R 1;PQa/1;RT 1;WWWW 2;XYZ

Input  : ##AAAA#BBBvCCD##
Output : ##AAAA#BBBvCCD## => 4;A 1;v 3;B 2;C 1;D   

Input  : ##AA#Bv#CC#4RSTUV#
Output : ##AA#Bv#CC#4RSTUV# => 2;A 1;v 1;B 2;C 1;V 1;RSTU  

Input  : #?2AA#3XYZvN#4PQrsN#3XYZ=2wq#
Output : #?2AA#3XYZvN#4PQrsN#3XYZ=2wq# => 2;N 1;v 1;PQrs/2;XYZ 1;AA 1;wq

Input  : !^#=3AAAB$$?2CCR^&?2DD*=4EEEEY()
Output : !^#=3AAAB$$?2CCR^&?2DD*=4EEEEY() => 1;B 1;R 1;Y  1;CC|1;DD 1;AAA|1;EEEE

$ 
$
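
To check how the scanner in the script above classifies each piece of a line, it can help to print the tokens in input order instead of counting them in arrays. The following is only a rough sketch of that idea, not part of the original post: the shell function name scan is hypothetical, and it assumes every length prefix is a single digit from 2 to 9 (the script's fetch_num accepts longer runs of digits).

```shell
# Hypothetical debugging helper (not from the original post).
# Prints one "class token" pair per line, in the order found in the input:
#   q = pattern after "?", e = pattern after "=",
#   m = multi-character pattern, s = single letter.
scan() {
    printf '%s\n' "$1" | awk '
    {
        str = $0; ind = 1; len = length(str)
        while (ind <= len) {
            ch = substr(str, ind, 1)
            kind = ""
            # A "?" or "=" counts as a prefix only when a digit 2-9 follows it.
            if ((ch == "?" || ch == "=") && substr(str, ind + 1, 1) ~ /[2-9]/) {
                kind = (ch == "?") ? "q" : "e"
                ind++
                ch = substr(str, ind, 1)
            }
            if (ch ~ /[2-9]/) {
                # Assumption of this sketch: the length prefix is one digit.
                n = ch + 0
                if (kind == "") kind = "m"
                print kind, substr(str, ind + 1, n)
                ind += n + 1
            } else {
                if (ch ~ /[A-Za-z]/) print "s", ch
                ind++
            }
        }
    }'
}

scan '#?2AA#3XYZ=2wq#k'
```

For the sample argument this prints q AA, m XYZ, e wq and s k, one pair per line. Because the tokens are printed as they are found, the result does not depend on the array traversal order of the awk implementation in use.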

 
