Passing Shell Input to AWK

I am trying to search a log for a particular pattern and list the total number of occurrences at the end.

I thought of using a shell script to collect the input and then calling awk to search for the parameters specified. I want the script to be usable across environments.

#! /usr/bin/bash
# get the variables
echo -n "1.  LogFILE ?"
read LOG
echo -n "2. SEARCH FOR ? "
read word
echo -n "3. Timestamp start = "
read time_st
echo -n "4. Timestamp end = "
read time_en
# search the file
awk  -v"1=${word}" "2=${time_st}" "3=${time_en}" "4=${log}" | wc

Source: the items read in above are what I would like to search on.
- - [22/Sep/2009:10:02:24 -0400] "GET /portal/framework/skins/cafe/css/body.css HTTP/1.1" 200 1304
- - [22/Sep/2009:14:06:32 -0400] "GET /portal/cafekeepalive.jsp?NONE HTTP/1.1" 200 21

"" 12 lines, 287 characters
bash-3.00$  ./
1.  LogFILE ?log.log
2. SEARCH FOR ? cafe
3. Timestamp start = 10:0*
4. Timestamp end = 14:0*
awk: syntax error near line 1
awk: bailing out near line 1

Thanks in advance for any assistance
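
For what it's worth, the syntax error comes from the awk call itself. Each variable needs its own -v name=value option, the names have to be valid identifiers (1, 2 and 3 are not), and awk still needs a program and a file to read. The script also reads the file name into LOG but then passes ${log}, and shell names are case-sensitive. The error text looks like the old Solaris awk, which as far as I know does not accept -v at all; nawk or /usr/xpg4/bin/awk should. A minimal sketch, assuming the variables from the script above; count_word is a made-up helper name, and index() is used so the word is treated as plain text rather than a regex:

```shell
#!/bin/bash
# Hypothetical helper: count the lines of a log file that contain a word.
# Usage: count_word WORD LOGFILE
count_word() {
    # -v copies the shell value into an awk variable named "word"
    awk -v word="$1" '
        index($0, word) { n++ }      # plain substring test, no regex
        END { print n + 0 }          # n + 0 prints 0 when nothing matched
    ' "$2"
}
```

In the script it would be called as: count_word "$word" "$LOG"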

not sure awk is the right tool here.

you can 'grep -n' to get the line number of each $time_st and $time_en
you can use a combination of 'head' and 'tail' to output the subset of lines between $time_st and $time_en you want to search
you can use 'grep -c' to get the number of occurrences of $word in that subset

DV -
Thanks for the response...

Am I heading in the right direction with this?

cat access.log.1253664000 | egrep "" | egrep "10:0*" | egrep "11:*" | head -1


You don't need to use cat here.

Your greps need some escaping. Right now the . means 'match any character', not 'match .'

And your greps won't return anything because first you reject everything except your start time, and after that you reject everything except your end time. You need to match start and end and everything in between. Since grep doesn't understand what date and time means -- or what digits mean, for that matter -- I don't think grep can do what you want all by itself. There are several things with : in them that could match anyway, so that probably won't narrow it down to what you want. I'll work on this a bit...
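
To make the two points above concrete, here is a small sketch. The sample lines and the address 10.1.2.3 are made up for illustration; they are not from the real log.

```shell
#!/bin/bash
# Made-up sample data: the second line only resembles the address.
sample='10.1.2.3 - - [22/Sep/2009:10:02:24 -0400]
10x1y2z3 - - [22/Sep/2009:11:15:00 -0400]'

# Unescaped dots match any character, so both lines are counted.
loose=$(printf '%s\n' "$sample" | grep -c '10.1.2.3')

# Escaped dots match only a literal ".", so one line is counted.
strict=$(printf '%s\n' "$sample" | grep -c '10\.1\.2\.3')

# Chained greps act as an AND: a line has to match every pattern, and no
# single line carries both the hour-10 and the hour-11 timestamp.
# grep -c exits non-zero on a count of 0, hence the "|| true".
both=$(printf '%s\n' "$sample" | grep ':10:[0-5][0-9]:' | grep -c ':11:[0-5][0-9]:' || true)

echo "loose=$loose strict=$strict both=$both"
```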

---------- Post updated at 12:38 PM ---------- Previous update was at 12:25 PM ----------

Here's how you'd match exactly the hours you want.

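# Build a list of the hours we want, to fill into egrep
# START and END are assumed values here, taken from the example times
START=10; END=14; STR=""
for ((N=START+1; N<=END; N++))
do
        # Append each hour, separated by | once STR is non-empty
        STR="${STR:+$STR|}$N"
done

# This will match 14:06 but not 10:02 since it starts at 11
echo -e "[22/Sep/2009:10:02:24 -0400]\n[22/Sep/2009:14:06:32 -0400]" |
        egrep "/[0-9]+:($STR):"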

You could tack on another grep to match a specific hostname or what have you. It's not too sophisticated. You can't take it much farther because grep can't understand what the dates actually mean.

This is about as complex as I'd bother making it in a shell script, since shell in general has a hard time processing date information. If you want something smart enough to just specify a beginning time and date and end time and date, I'd just use perl and process the dates wholesale to compare them.

ok, so in my orig response I was thinking this:

# the number of the first line in the file with the start time and word
STARTLINE=$(grep -n ":$start_time:[0-5][0-9]:.*$word" file | head -1 | cut -f1 -d':')

# the number of the last line in the file with the end time and word
ENDLINE=$(grep -n ":$end_time:[0-5][0-9]:.*$word" file | tail -1 | cut -f1 -d':')

# the subset of lines between STARTLINE and ENDLINE
head -n "$ENDLINE" file | tail -n +"$STARTLINE" > subsetfile

# the number of lines in subsetfile containing $word
grep -c "$word" subsetfile

regexes here will work, but could be improved.
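
Coming back to the original awk question: within a single day, HH:MM:SS strings of equal width sort the same way the times do, so awk can compare them as text without any date parsing. A rough sketch under that assumption (one day per log, timestamps shaped like [22/Sep/2009:10:02:24 -0400]); count_between is a hypothetical name, and the fixed offset of 13 only holds for that exact date layout:

```shell
#!/bin/bash
# Hypothetical helper: count the lines that contain WORD and whose time of
# day falls between START and END (inclusive, both written as HH:MM:SS).
# Usage: count_between WORD START END LOGFILE
count_between() {
    awk -v word="$1" -v t_start="$2" -v t_end="$3" '
    {
        p = index($0, "[")            # timestamp opens with "["
        if (p == 0) next              # no timestamp on this line
        # "[22/Sep/2009:" is 13 characters, so HH:MM:SS starts at p + 13
        t = substr($0, p + 13, 8)
        # substr() returns a string, so these are text comparisons
        if (t >= t_start && t <= t_end && index($0, word)) n++
    }
    END { print n + 0 }
    ' "$4"
}
```

With the variables from the first post it could be called as: count_between "$word" 10:00:00 14:59:59 "$LOG". Full times are needed here rather than patterns like 10:0*, and a log that spans midnight would need the date compared as well.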
