Capture values using multiple regex patterns


 
# 1  
Old 03-10-2010

I have to read a file, and from each line I need to extract two values using more than one search pattern.
Example: <0112 02:12:20 def > /some string/some string|[match string]|some string||124

The line does not use a single consistent delimiter. I need the timestamp '0112 02:12:20' and the last field '124'; the last field is the value I sum to get the average for the match string during a time frame. I have to use a regex that matches lines on both 'match string' and the timestamp, because I only want the lines inside a given window (for example, from 6 pm yesterday to 4 pm today).
Once I have the last-field values for all lines containing the match string, I will add them up and compute the average for that match string.
# 2  
Old 03-10-2010
Use egrep, with the list of alternative patterns separated by | and wrapped in double quotes:

Code:
ps -ef |egrep "PID|foobar"
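Note that | in an extended regex means "either pattern", while the question needs lines that match the timestamp and the match string together. A minimal sketch of both cases follows, using grep -E (the POSIX spelling of egrep). The sample lines and the function names match_either and match_both are made up for illustration:

```shell
# Sample lines in the thread's log format (made up for illustration).
sample='<0309 18:50:42 def > /a/b|[right_string]|c||856
<0309 19:53:14 def > /a/b|[wrong_string]|c||478
no timestamp here|[right_string]|c||111'

# OR: keep lines that contain either alternative.
match_either() { grep -E 'right_string|wrong_string'; }

# AND: keep lines that start with a timestamp and also contain
# the literal text [right_string] (grep -F disables regex meaning of [ ]).
match_both() {
  grep -E '^<[0-9]{4} [0-9]{2}:[0-9]{2}:[0-9]{2} ' | grep -F '[right_string]'
}

printf '%s\n' "$sample" | match_either | wc -l   # all 3 lines match
printf '%s\n' "$sample" | match_both             # only the 18:50:42 line
```

Chaining two greps is the simplest way to require both patterns; it does not yet restrict the timestamp to a range.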

# 3  
Old 03-10-2010
Assuming that -
(1) The string you want to match is "right_string".
(2) The range of timestamps is yesterday 6 pm to today 4 pm.

and given your data file that looks like this -

Code:
$
$ cat -n f4.txt
     1  <0309 14:47:39 def > /some_string_1/some_string_2|[right_string]|some_string_3||656
     2  <0309 17:29:45 def > /some_string_1/some_string_2|[right_string]|some_string_3||403
     3  <0309 18:50:42 def > /some_string_1/some_string_2|[right_string]|some_string_3||856
     4  <0309 19:53:14 def > /some_string_1/some_string_2|[wrong_string]|some_string_3||478
     5  <0309 21:52:59 def > /some_string_1/some_string_2|[right_string]|some_string_3||976
     6  <0309 23:27:11 def > /some_string_1/some_string_2|[right_string]|some_string_3||959
     7  <0310 09:28:27 def > /some_string_1/some_string_2|[wrong_string]|some_string_3||354
     8  <0310 11:31:36 def > /some_string_1/some_string_2|[right_string]|some_string_3||319
     9  <0310 13:40:38 def > /some_string_1/some_string_2|[right_string]|some_string_3||931
    10  <0310 16:11:42 def > /some_string_1/some_string_2|[right_string]|some_string_3||207
$
$

you can use a Perl script like the following that picks up the numbers at the far right and pushes them into an array. The average is calculated in the END section -

Code:
$ perl -M"Date::Calc qw(Today Add_Delta_YMD Date_to_Time)" -ne '
>   # Window: yesterday 18:00:00 through today 16:00:00, as epoch seconds
>   BEGIN { ($y,$m,$d)=Today; @yday = Add_Delta_YMD($y,$m,$d, 0,0,-1);
>           $lbound = Date_to_Time(@yday, 18,0,0); $ubound = Date_to_Time($y,$m,$d, 16,0,0)}
>   # $1,$2 = month,day; $3,$4,$5 = hour,min,sec; $6 = last |-delimited field
>   if (/<(\d\d)(\d\d) (\d\d):(\d\d):(\d\d) .*?\[right_string\].*\|(.*?)$/) {
>     $dt = Date_to_Time($y,$1,$2, $3,$4,$5);   # the log has no year, so assume the current one
>     if ($lbound <= $dt and $dt <= $ubound) {push @x, $6; $sum += $6}
>   }
>   END {print "Array               = @x\n";
>        print "Sum of elements     = $sum\n";
>        print "Number of elements  = ",scalar(@x),"\n";
>        print "Average of elements = ",(@x ? $sum/@x : "n/a"),"\n"   # avoid dividing by zero
>       }' f4.txt
Array               = 856 976 959 319 931
Sum of elements     = 4041
Number of elements  = 5
Average of elements = 808.2
$
$

Of course, if your data is going to span multiple years, the date on the left must include the year as well, i.e. "03092010" instead of "0309", and the script will have to be modified to handle that.
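As a rough illustration of that modification (a sketch in shell and awk, not a change to the Perl script above): assuming the stamp becomes "<03092010 18:50:42", i.e. MMDDYYYY, the pieces can be rearranged into a YYYYMMDDhhmmss key that compares correctly across years. The function name stamp_key is made up for this example:

```shell
# stamp_key: read lines whose stamp looks like "<03092010 18:50:42 ..."
# (MMDDYYYY, an assumed format) and print a YYYYMMDDhhmmss key,
# a space, and then the original line. Other lines are skipped.
stamp_key() {
  awk '
    /^<[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9] [0-9][0-9]:[0-9][0-9]:[0-9][0-9] / {
      mmdd = substr($0, 2, 4)      # month and day
      yyyy = substr($0, 6, 4)      # year
      hms  = substr($0, 11, 2) substr($0, 14, 2) substr($0, 17, 2)
      print yyyy mmdd hms, $0
    }'
}
```

Because the key starts with the year, two keys can be compared as plain strings or numbers, which is what a range check needs.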

tyler_durden
# 4  
Old 03-14-2010
Thanks tyler, it's working. Is it possible to do this in a shell script, since I do not have access to Perl?
# 5  
Old 03-14-2010
Quote:
Originally Posted by adars1
... is it possible to do this in a shell script, ...
Possible? - Yes.
Possible without aggravation? - No, or maybe, depending on how hardcore the shell scripter is.

Date arithmetic isn't really that simple in the shell. Search this forum for Perderabo's excellent suite of date calculation functions for the shell.
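That said, if both ends of the window fall in the same year, the date arithmetic can be avoided by comparing the stamps as MMDDhhmmss strings. Here is a minimal sketch with awk, assuming the line layout of f4.txt. The function name avg_in_window and its argument order are invented for this example, and the bounds are passed in by hand:

```shell
# avg_in_window LOWER UPPER MATCH [FILE]
# LOWER and UPPER are MMDDhhmmss strings (same year assumed).
# MATCH is the text inside the square brackets. FILE defaults to stdin.
avg_in_window() {
  awk -v lo="$1" -v hi="$2" -v want="[$3]" '
    # Only consider lines that start like: <0309 18:50:42
    /^<[0-9][0-9][0-9][0-9] [0-9][0-9]:[0-9][0-9]:[0-9][0-9] / {
      if (index($0, want) == 0) next     # literal match, no regex escaping needed
      # Build MMDDhhmmss from fixed character positions.
      ts = substr($0, 2, 4) substr($0, 7, 2) substr($0, 10, 2) substr($0, 13, 2)
      if (ts < lo || ts > hi) next       # outside the window
      n = split($0, f, "|")              # the value is the last |-separated field
      sum += f[n]; cnt++
    }
    END {
      if (cnt) printf "count=%d sum=%d avg=%s\n", cnt, sum, sum / cnt
      else     print "no matching lines"
    }
  ' "${4:--}"
}
```

For the sample data, avg_in_window 0309180000 0310160000 right_string f4.txt prints count=5 sum=4041 avg=808.2. With GNU date the bounds could be built as "$(date -d yesterday +%m%d)180000" and "$(date +%m%d)160000", but -d is a GNU extension and may not exist on other systems.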

Quote:
...since I do not have access to Perl.
Talk to your SysAdmin. Perl has been around for almost 23 years now, and is shipped with most of the *nix systems.

tyler_durden