How to count pattern in column


 
# 8  
Old 12-04-2007
Quote:
Originally Posted by vgersh99
Code:
awk -F'[`$]' ' {print $1, NF-1}' file

Maybe I misunderstood the question, but I think he wants a separate count for each character, e.g.:

file:
Code:
Line1`abc$bn`$$

Output:

Code:
Line1: 2 "`" and 3 "$"

Regards
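
To make the difference concrete, here is a quick check of the quoted one-liner on the sample line. This is only a sketch: the `sample` variable is a name made up for illustration, and the input is piped in instead of being read from `file`.

```shell
# sample line from this post, kept in a shell variable (hypothetical name)
sample='Line1`abc$bn`$$'

# Both "`" and "$" act as field separators, so NF-1 is the TOTAL number
# of separators on the line, not a count per character.
printf '%s\n' "$sample" | awk -F'[`$]' '{ print $1, NF-1 }'
# prints: Line1 5
```

The 5 is the two backticks plus the three dollar signs added together, which is why this approach cannot report them separately.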
# 9  
Old 12-04-2007
I guess we'll need the OP to clarify what needs to be done with the example of the input and the desired output!
# 10  
Old 12-04-2007
Hi there,

My question is actually about counting different characters in one line, let's say "`", "$", "%" etc.

Franklin's understanding is correct:

Inside the file:-

Line1`abc$bn`$$

Output of the file

Line1: 2 "`" and 3 "$"


What other ways can I use to implement this? Please advise. Thanks.


Rgrds,
Jason
# 11  
Old 12-04-2007
something along these lines:

nawk -f ah.awk myFile

ah.awk:
Code:
BEGIN {
  # If the variable 'pat' has not been assigned (e.g. with -v pat=...),
  # give it a default value of '%$`'.
  # 'pat' holds the list of characters whose frequency we want to count.
  if (pat=="")
    pat="%$`"

  # Wrap every single character in 'pat' in a ' [char] ' string:
  # '%' -> ' [%] '; '$' -> ' [$] ' etc...
  gsub(/./, " [&] ", pat)

  # Split the string in 'pat' on spaces and store the pieces in array 'patA'
  # (patN holds the number of entries in the patA array):
  # patA[1]='[%]'; patA[2]='[$]' etc....
  patN=split(pat, patA, " ")
}
{
  # print the line number - FNR holds the current line number in the file
  printf("Line %d:", FNR)

  # Iterate through ALL the entries (1 ---> patN) in the patA array.
  # For every entry, gsub replaces each match of that entry in the current
  # line ($0) with the empty string "" and returns the NUMBER of substitutions
  # made, i.e. how many times that character appears in your record/line.
  for(i=1; i <= patN; i++)
     printf(" %d '%s' %s", gsub(patA[i], "", $0), patA[i], (i==patN) ? "\n" : "")
}

or to define your own set of chars to count:
nawk -v pat='^&*#' -f ah.awk myFile

Last edited by vgersh99; 12-04-2007 at 08:06 PM..
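
The counting in ah.awk relies on gsub returning the number of replacements it made. A minimal sketch of just that step, run on the sample line from earlier in the thread (the names `line`, `copy`, `n_dollar` and `n_tick` are illustrative only and are not part of ah.awk):

```shell
line='Line1`abc$bn`$$'

printf '%s\n' "$line" | awk '{
  copy = $0                            # work on a copy so $0 stays intact
  n_dollar = gsub(/[$]/, "", copy)     # number of "$" removed from copy
  n_tick   = gsub(/[`]/, "", copy)     # number of backticks removed
  print n_tick, n_dollar
}'
# prints: 2 3
```

Putting each character inside a bracket expression is what lets characters such as "$" be matched literally instead of as regex operators.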
# 12  
Old 12-04-2007
Hi,

Could you please explain the context of:

nawk -v pat='^&*#' -f ah.awk myFile

Where does ah.awk come from? I assume "myFile" is the existing shell script.

And honestly, the code you showed me just stunned me. Correct me if I am wrong.

From my understanding, the first step is to compare against blank spaces and insert the patterns into the pat array?

The pat array then contains [%,$,`], and you split it into pat2, which holds the number of pattern elements.

Given that number of patterns, what I could not really understand is:

for(i=1; i <= patN; i++)
printf(" %d '%s' %s", gsub(patA[i], "", $0), patA[i], (i==patN) ? "\n" : "")

In this for loop, from 1 to the number of lines in the file, you are somehow checking the pattern? But I do not really understand what
gsub(patA[i], "", $0), patA[i], (i==patN) ? "\n" : "") is doing. Perhaps you could help shed some light on this.

Sorry, I am quite new to shell programming.

P.S.: Is there any other approach that uses just awk and a for loop, without "ah.awk"?


Rgrds,
Jason
# 13  
Old 12-04-2007
Quote:
Originally Posted by ahjiefreak
Hi,

Could you please explain the context of:

nawk -v pat='^&*#' -f ah.awk myFile
-v pat='^&*#' defines an awk variable called 'pat' and assigns the string '^&*#' to it. This variable holds ALL the characters you want frequency stats for.
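
A small sketch of how the -v assignment interacts with the default in the BEGIN block. The one-line programs below are stand-ins written for this illustration, not ah.awk itself:

```shell
# -v assigns pat before the BEGIN block runs, so the default is skipped
awk -v pat='^&*#' 'BEGIN { if (pat == "") pat = "%$`"; print pat }'
# prints: ^&*#

# without -v, pat is still empty in BEGIN and the default is used
awk 'BEGIN { if (pat == "") pat = "%$`"; print pat }'
# prints: %$`
```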
Quote:
Originally Posted by ahjiefreak
Where does ah.awk come from?
Sorry - I forgot to show this in my post; I have modified the post now.
'ah.awk' is the awk script from my original post - that's the meat of the implementation.
Quote:
Originally Posted by ahjiefreak
I assume "myFile" is the existing shell script.
No, 'myFile' is the file you need to parse to find the character frequencies per line.
Quote:
Originally Posted by ahjiefreak
And honestly, the code you showed me just stunned me. Correct me if I am wrong.
Instead of commenting on your comments, I'll document the code in the original post.
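
On the question of doing this with just awk and a for loop, without a separate script file: one possible sketch is below. It is not the method posted above. It is an alternative that tallies literal characters with substr instead of building bracket expressions, so a caret in the list is not interpreted as regex syntax (wrapping "^" as "[^]" is unlikely to form a valid bracket expression). The function name `count_chars` and the variables `chars`, `seen` and `out` are made up for this example.

```shell
count_chars() {
  # $1: the characters to count, e.g. a backtick and a dollar sign
  # input lines are read from stdin
  awk -v chars="$1" '
  {
    split("", seen)                      # clear the tallies for this line
    for (i = 1; i <= length($0); i++)    # tally every character of the line
      seen[substr($0, i, 1)]++

    out = "Line" FNR ":"
    for (j = 1; j <= length(chars); j++) {   # report only the requested ones
      c = substr(chars, j, 1)
      out = out sprintf("%s %d \"%s\"", (j > 1 ? " and" : ""), seen[c] + 0, c)
    }
    print out
  }'
}

printf '%s\n' 'Line1`abc$bn`$$' | count_chars '`$'
# prints: Line1: 2 "`" and 3 "$"
```

Note that awk processes backslash escapes in -v values, so a backslash in the character list would need extra care.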