This pattern is shown in the example of 6 lines from the log in the code box below. These 6 lines are consecutive, and all start with the same IP address. They are actually the first 6 of about 20-25 lines in which a document was served with multiple gif images.
My purpose is to delete lines under the following conditions:
First, determine whether a line in the log becomes a "reference line".
This happens if the line being tested ("subsequent line"):
has a different IP address field from that of the reference line, or there is currently no reference line to evaluate against (uninitialized reference line)
has the same IP address as the reference line, but the referring document is not from my web site
has the same IP address as the reference line and the requested (GET) document is from my web site, but it is a different HTML document
has the same IP address as the current reference line, but the requested document is logged more than 60 seconds after the previously requested document in the current reference line
Note that a "subsequent line" is one that does not qualify to become a "reference line."
I have only one sed command line now: basically the regular expression that matches the line, with parenthesized expressions for the fields in the line. To be safe, I am using old-style regular expression syntax and not any "extended" kind, such as using `\d` metacharacters to indicate digits.
I have wrapped three fields: the IP address, Date, requested document (between GET and HTTP/1.1), in parentheses.
This becomes the line to be tested, and the tests are as above: (1) check for an IP address difference, (2) check for a time difference, (3) examine the requested document field for file types (gif | ico | css | js | png | jpg | jpeg | etc.); basically, if they are not HTML, they get deleted, (4) make sure the hostname/server name in the referring document is 'http://my.website.org/'.
I am thinking that I need to use the Hold and eXchange pattern system, but am not sure how to go about that. More importantly, I must do comparisons on the text: converting date/time expressions into integers to be compared, and also doing string comparisons. The sed utility, as far as I know, has no built-in features for this, so I may have to pass these as parameters to a shell (?) to do the comparisons and return a result that sed can work with.
I have even more of a challenge too: see the NB below.
What I need is a good pointer or reference to what I should be telling sed to do, aside from just being given the answer. Thanks.
(to respect privacy of those accessing the server, I changed IP address to a recognized private LAN address [I hope])
NB: I am running this sed script as sed.exe (GNU sed version 4.2.1 (c) 2009) under Microsoft Windows 7, so any solution requiring a shell should use a command processor that is installed (MS cmd version 6..7601) or installable within Windows 7. I am aware that I can process the text of the logs within a VM running a Linux distro (I have, for instance, Ubuntu and TinyCore Linux installed as VMs), but (1) I have not kicked the MS Windows environment as an every-day use system and (2) I last did any bash scripting more than a decade ago.
Last edited by Proteomist; 10-01-2012 at 03:03 AM..
Reason: break up code-boxed line lengths
Sed scripts that handle multiple lines usually have a different flavor -- I like to call them loopers.
You add more lines using N. Often, the only line not read with N is the first! The behavior of N at $ (eof) was buggy in some early versions, so I test for that before the N.
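A minimal sketch of the simplest looper, which joins each pair of lines into one. The function name join_pairs and the sample words are made up for illustration. The $! guard is the test mentioned above: it skips N on the last line.

```shell
# Join every pair of input lines into a single line.
join_pairs() {
  # $!N : unless this is the last line, append the next line to the pattern space
  # s/\n/ / : replace the embedded newline with a space
  sed '$!N;s/\n/ /'
}

printf '%s\n' one two three four five | join_pairs
```

With an odd number of lines, the last line is printed on its own because N is never attempted on it.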
Then you can write regexes that span the lines or hook onto the '\n' in between them; that '\n' is also still matched by '.'.
Using :labels and t or b branching, you can pile up lines in the buffer to your heart's content (or up to your old sed version's fixed buffer size).
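As a sketch of the label-and-branch style, the following piles up lines for as long as the current one ends in a backslash, then prints the joined result. The function name join_continued, the label name, and the sample input are illustrative assumptions, not anything from the log in question.

```shell
# Join lines that end in a backslash with the line that follows them.
join_continued() {
  sed '
  :again
  # Pattern space ends in a backslash: try to pull in another line.
  /\\$/{
    # Only read more input if this is not the last line.
    $!{
      N
      # Remove the backslash and the newline that follows it.
      s/\\\n//
      b again
    }
  }
  '
}

printf '%s\n' 'a \' 'b \' 'c' 'd' | join_continued
```

The same loop shape works with t instead of b when the decision to keep reading should depend on whether a substitution succeeded.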
You can use P to spit out just the first line.
With s and \(\) and \1 \2 ... you can swap lines around.
Not much use for D, since you start over.
The '\n' does not seem to be something you can put in [ ... ].
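For the same-IP test in the question, one possible approach is to keep the reference IP in the hold space, append it to each line with G, and let a back-reference across the '\n' do the string comparison. This is only a sketch under assumptions: the IP is taken to be the first space-delimited field, the sample lines are invented, the name first_per_ip is hypothetical, and the time, file type, and referrer tests are left out entirely.

```shell
# Keep only the first line of each run of consecutive lines that share
# the same first field (assumed here to be the client IP address).
first_per_ip() {
  sed -n '
  # Append the held reference IP (empty at the start) after a newline.
  G
  # First field equals the held IP: this is not a new reference line, drop it.
  /^\([^ ]*\) .*\n\1$/d
  # Otherwise remove the appended text and print the line.
  s/\n.*//
  p
  # Reduce the line to its first field and store that as the new reference IP.
  s/ .*//
  h
  '
}

printf '%s\n' \
  '192.168.1.5 GET /a.html' \
  '192.168.1.5 GET /x.gif' \
  '192.168.1.6 GET /b.html' | first_per_ip
```

Under cmd, where single quotes do not work as they do here, the script body can go in a file and be run with sed -n -f. The 60-second test needs arithmetic on the timestamps, which sed does not provide.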
Hi all,
Let's say I have a script calling for the two variables PA_VALUE and PB_VALUE.
for pa in PA_VALUE
blah blah
do
for pb in PB_VALUE
blah blah
do
I have a text file with two columns of values for PA and PB.
14.5 16.7
7.8 9.5
5.6 3.6
etc etc
I would like to read this... (7 Replies)
I need to read the contents of a file. Then I need to grep for a keyword and replace part of the grepped line based on the condition of previous and present line.
Example input file:
V {
port1 = P;
port2 = 0;
shift_port = P0; /* if next shift_port is P0 I need... (9 Replies)
Hi,
I am currently trying to work out how to compare one line with the last line I have read in via ksh. I have a file which has sorted output from a previous sort command so all the lines are in order already and the file would look something like show below. Each line has a name and a time... (5 Replies)
Hi guys,
I have found your forum super useful. However, right now I am stuck on a seemingly "simple" thing in AWK. I have two columns of data, the first column in Age (in million years) and the second column is Convergence Rate (in mm/yr).
I am trying to process my data so I can use it to... (2 Replies)
Hello,
Can anybody help me correct my sed syntax to find a string and print the previous two lines, the current line, and the next line?
I am using "testing" as the string.
netstat -v | sed -n -e '/test/{x;2!p;g;$!N;p;D;}' -e h
i am able to get the previous line current line next line but... (1 Reply)
I know this has been asked before but I just can't parse the syntax as explained. I have a set of files that has user information spread out over two lines that I wish to merge into one:
User1NameLast User1NameFirst User1Address
E-Mail:User1email
User2NameLast User2NameFirst User2Address... (11 Replies)
Hi,
How can I remove the line break in the following case, where the line begins with the special char “;”?
TEXT
Text;text
;text
Text;text;text
I want to convert the text to:
Text;text;text
Text;text;text
I have already tried to use... (31 Replies)
Hi all,
I have to append every alternate line after its previous line. For example if my file has following contents
line 1: unix is an OS
line 2: it is open source
line 3: it supports shell programming
line 4: we can write shell scripts
Required output should be
line1: unix is an OS it is... (4 Replies)
Hi,
how can i print the previous, current and next line using sed?
current line is the matching line.
The following prints all lines containing 'Failure' and also the immediate next line
cat $file | sed -n -e '/Failure/{N;p;}'
Now, i also want to print the previous line too.
Thanks,... (8 Replies)