UNIX for Dummies Questions & Answers: post 302322261 by tektips on Wednesday 3rd of June 2009 08:25:14 AM
Search and Count Occurrences of Pattern in a File

I need to search for a pattern in a file and count its occurrences. The catch is that it is a pattern and not a word (it is not necessarily delimited by spaces). For example, if ABCD is the pattern I need to search for and count, it can appear in many forms, such as (ABCD, ABCD), XYZ.ABCD=100, XYZ.ABCD>=500, XYZ.ABCD = 200, etc.

I tried something like the script below for word search and count (I got it from another post about counting occurrences of a word), but I am not sure how to adapt it for a string that is not necessarily delimited.

Code:
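srchWord="$1"
srchFilename="$2"
ctr=0
# Read the file line by line and compare each whitespace-delimited word
while read -r strLine
do
    for eachWord in $strLine
    do
        if [ "$eachWord" = "$srchWord" ]
        then
            ctr=$((ctr + 1))
        fi
    done
done < "$srchFilename"    # was "srchFilename", which read a file literally named that
printf "\n\ntheFile contains %s : %d times\n\n" "$srchWord" "$ctr"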

Example of file contents and the two specific search scenarios I am trying to address:
Code:
(
PABC_CUST_ACCT_DETL_CURR.ADB_STFC_BAL<=5000
OR
PABC_CUST_ACCT_DETL_CURR.ADB_STFC_BAL is NULL
)
lab_may09_params_tbl.AMF <= 59)
((PABC.CUST_ACCT_DETL_CURR.ADB_STFC_BAL <= 5000) OR PABC.CUST_ACCT_DETL_CURR.ADB_STFC_BAL IS NULL) 
(ADB_STFC_BAL=100
ADB_STFC_BAL)
PABC_CUST_ACCT_DETL_CURR.ADB_STFC_BAL^M
PABC_CUST_ACCT_DETL_CURR.ADB_STFC_BAL;
(ADB_STFC_BAL)^M

Scenarios I need to address:

1. Search by ADB_STFC_BAL
Expected Result : Count 9

2. Search by PABC_CUST_ACCT_DETL_CURR.ADB_STFC_BAL
Expected Result : Count 6
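
One possible way to count non-delimited occurrences, as a minimal sketch rather than a definitive answer: `grep -o` prints every match on its own line (so two hits on one line count twice) and `-F` treats the pattern as a literal string. The function name `count_occurrences` and the temporary sample file are assumptions made for illustration; `-o` is available in GNU and BSD grep but is not required by POSIX. The sample reproduces the file contents above without the trailing ^M characters.

```shell
#!/bin/sh
# Hypothetical helper: count every occurrence of a literal string in a file,
# including repeated occurrences on the same line.
count_occurrences() {
    # -o: one output line per match, -F: pattern is a fixed string, not a regex
    grep -o -F -- "$1" "$2" | wc -l | tr -d ' '
}

# Sample data copied from the post (carriage returns omitted)
sample=$(mktemp)
cat > "$sample" <<'EOF'
(
PABC_CUST_ACCT_DETL_CURR.ADB_STFC_BAL<=5000
OR
PABC_CUST_ACCT_DETL_CURR.ADB_STFC_BAL is NULL
)
lab_may09_params_tbl.AMF <= 59)
((PABC.CUST_ACCT_DETL_CURR.ADB_STFC_BAL <= 5000) OR PABC.CUST_ACCT_DETL_CURR.ADB_STFC_BAL IS NULL)
(ADB_STFC_BAL=100
ADB_STFC_BAL)
PABC_CUST_ACCT_DETL_CURR.ADB_STFC_BAL
PABC_CUST_ACCT_DETL_CURR.ADB_STFC_BAL;
(ADB_STFC_BAL)
EOF

count_short=$(count_occurrences 'ADB_STFC_BAL' "$sample")
count_long=$(count_occurrences 'PABC_CUST_ACCT_DETL_CURR.ADB_STFC_BAL' "$sample")
echo "ADB_STFC_BAL: $count_short"
echo "PABC_CUST_ACCT_DETL_CURR.ADB_STFC_BAL: $count_long"
rm -f "$sample"
```

On this sample the first count is 9, matching scenario 1. The second count is 4 rather than 6, because two of the entries are written PABC.CUST_ACCT_DETL_CURR (with a dot) and a literal match does not cover them. If those should count as well, a regular expression such as `PABC[._]CUST_ACCT_DETL_CURR\.ADB_STFC_BAL` (without `-F`) might be needed.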

Any help with altering the above script to use the pattern, or any other approach to solve the problem (using awk, for example), is greatly appreciated. The files are pretty large and I need to do this for around 200 words.
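
For many patterns against a large file, one approach that might help is a single awk pass that loads all patterns first and then scans each data line once per pattern. This is only a sketch under stated assumptions: the patterns live one per line in a separate file, they are literal strings, and the two temporary files below stand in for the real pattern list and data file.

```shell
#!/bin/sh
# Hypothetical inputs: a pattern list (one literal pattern per line) and a data file
patterns=$(mktemp)
data=$(mktemp)
printf '%s\n' 'ADB_STFC_BAL' 'AMF' > "$patterns"
printf '%s\n' '(ADB_STFC_BAL=100' 'ADB_STFC_BAL)' 'lab_may09_params_tbl.AMF <= 59)' > "$data"

result=$(awk '
    # First file: remember each non-empty pattern in input order
    NR == FNR { if ($0 != "") pats[++n] = $0; next }
    # Second file: count every occurrence of every pattern on the line
    {
        for (i = 1; i <= n; i++) {
            rest = $0
            # index() is a literal substring search, so "." and "=" need no escaping
            while ((pos = index(rest, pats[i])) > 0) {
                count[i]++
                rest = substr(rest, pos + length(pats[i]))
            }
        }
    }
    END { for (i = 1; i <= n; i++) printf "%s %d\n", pats[i], count[i] }
' "$patterns" "$data")

echo "$result"
rm -f "$patterns" "$data"
```

The data file is read only once regardless of how many patterns there are, which may matter more than per-pattern speed when the files are large. Overlapping matches are not counted, since the scan resumes after the end of each hit.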

Thanks in advance !
 
