I wasn't quite sure how to title this one! Here goes:
I have some partially parsed log files that I now need to extract information from. Because of their original format and the processing they have already been through, I can't make any assumptions about the number of fields or the exact layout. All I know is that I can look for certain patterns. An extract of the source is:
Code:
Job <1>, Job Name <BLAH>, Queue-- MEMLIMIT 10 G Fri Oct 11 09:55:48: Started on <cn035>, -- The CPU time is 12 seconds. MEM: 1 Gbytes;
Job <2>, Job Name <BLAH>, Queue-- MEMLIMIT 10 G Fri Oct 11 09:55:48: Started on <cn069>, -- The CPU time is 10 seconds. MEM: 1 Gbytes;
Job <3>, Job Name <BLAH>, MEMLIMIT 10 G Fri Oct 11 09:55:48: Started on <cn049>, ;-- The CPU time is 13 seconds. MEM: 2 Gbytes;
Job <4>, Job Name <BLAH>, Status <RUN>, Command <-- The CPU time is 76 seconds. MEM: 3 Gbytes;
Job <7>, Job Name <BLAH>, Stat us <RUN>, Command <-- The CPU time is 49 seconds. MEM: 1014 Mbytes;
Job <8>, Job Name <BLAH> , Status <RUN>, -- MEMLIMIT 10 G Fri Oct 11 22:13:19: Started on <cn014>;-- The CPU time is 12 seconds. MEM: 391 Mbytes;
Job <9>, Job Name <BLAH>, Status <RUN >, Command <: Started on <cn026>,-- The CPU time is 71 seconds. MEM: 13 Mbytes;
Job <10>, Job Name <BLAH>, Sta tus <RUN>, Command <#!/bi-- MEMLIMIT 22 G Started on <cn064>, -- The CPU time is 25 seconds. MEM: 12 Gbytes;
I want to extract based on:
Code:
Started on <____>,
MEMLIMIT __ G
MEM: ___ bytes;
The first line example being:
Code:
MEMLIMIT 10 G Fri Oct 11 09:55:48: Started on <cn035>, -- The CPU time is 12 seconds. MEM: 1 Gbytes;
Each line may contain all, some or none of the above. My ideal output based on the above would be something like:
Code:
Started: cn035 MEMLIMIT: 10 G MEM: 1 G
Started: cn069 MEMLIMIT: 10 G MEM: 1 G
etc
etc
Ideally, if no MEMLIMIT is found on a line, the output would be, for example:
Started: cn026 MEMLIMIT: 0 G MEM: 13 M
I've messed around with gsub in awk to extract a single instance but couldn't work out how to select on multiple patterns...
Any help as always would be appreciated!
Last edited by Scrutinizer; 10-13-2013 at 06:38 AM..
Reason: additional code tags
Thanks for that, Scrutinizer - so very close to what I need! If I've understood it correctly, it only prints a line when all three patterns are found. Ideally it would print every line with one or more matches:
Code:
Started: cn026 MEMLIMIT: 0 G MEM: 13 M
or just a blank rather than 0 G for the MEMLIMIT. Basically, every entry _should_ have a 'Started on' and a MEM:, but not necessarily a MEMLIMIT.
Last edited by chrissycc; 10-13-2013 at 07:46 AM..
Reason: correction
If you are OK with a Perl solution, put this into "script.pl":
Code:
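#!/usr/bin/perl
use strict;
use warnings;

# Usage: script.pl logfile
open(my $in, '<', $ARGV[0]) or die "Cannot open $ARGV[0]: $!";
while (my $line = <$in>) {
    chomp $line;
    # Only report lines that contain a "Started on <host>" field.
    if ($line =~ /Started on <([^>]+)/) {
        my $started = $1;
        # MEMLIMIT is optional; default to 0 when it is missing.
        my $memlimit = $line =~ /MEMLIMIT (\d+) G/ ? $1 : 0;
        # MEM is expected but may be missing; fall back to "-".
        my $mem = $line =~ /MEM: ([^;]+)/ ? $1 : '-';
        print "Started: $started MEMLIMIT: $memlimit G MEM: $mem\n";
    }
}
close($in);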
As RudiC pointed out, the following only works on Solaris:
Code:
/usr/xpg4/bin/awk '{
started=$0; if (!sub(".*Started on <([^>]*).*","\1",started)) started="-"
memlimit=$0; if (!sub(".*MEMLIMIT ([^ ]* [^ ;]*).*","\1",memlimit)) memlimit="-"
mem=$0; if (!sub(".*MEM: ([^ ]* [^ ;]*).*","\1",mem)) mem="-"
printf "Started on: %-8s MEMLIMIT: %-8s MEM: %-8s\n",started,memlimit,mem
}' file
Started on: cn035    MEMLIMIT: 10 G     MEM: 1 Gbytes
Started on: cn069    MEMLIMIT: 10 G     MEM: 1 Gbytes
Started on: cn049    MEMLIMIT: 10 G     MEM: 2 Gbytes
Started on: -        MEMLIMIT: -        MEM: 3 Gbytes
Started on: -        MEMLIMIT: -        MEM: 1014 Mbytes
Started on: cn014    MEMLIMIT: 10 G     MEM: 391 Mbytes
Started on: cn026    MEMLIMIT: -        MEM: 13 Mbytes
Started on: cn064    MEMLIMIT: 22 G     MEM: 12 Gbytes
Last edited by MadeInGermany; 10-13-2013 at 01:47 PM..
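For an awk that does not accept "\1" in the replacement string of sub(), the same idea can probably be expressed with the POSIX match() function and the RSTART/RLENGTH variables it sets. The following is only a sketch, not tested beyond the sample lines in the first post. The function name extract_fields is made up for illustration, and the "-" placeholder for a missing field is an assumption carried over from the Solaris version above. It prints every line on which at least one of the three patterns is found and shortens the units to a single letter, as in the requested output.

```shell
#!/bin/sh
# Hypothetical helper (the name is an assumption): extract_fields FILE
extract_fields() {
  awk '
  {
    # "-" marks a field that was not found on this line (assumed placeholder).
    started = memlimit = mem = "-"
    # "Started on <host>": skip the 12-character prefix "Started on <".
    if (match($0, /Started on <[^>]*/))
      started = substr($0, RSTART + 12, RLENGTH - 12)
    # "MEMLIMIT <n> <unit>": skip the 9-character prefix "MEMLIMIT ".
    if (match($0, /MEMLIMIT [0-9]+ [A-Za-z]/))
      memlimit = substr($0, RSTART + 9, RLENGTH - 9)
    # "MEM: <n> <unit>bytes": keep the number and the first unit letter.
    if (match($0, /MEM: [0-9]+ [A-Za-z]/))
      mem = substr($0, RSTART + 5, RLENGTH - 5)
    # Print only lines where at least one field was found.
    if (started != "-" || memlimit != "-" || mem != "-")
      printf "Started: %s MEMLIMIT: %s MEM: %s\n", started, memlimit, mem
  }' "$1"
}
```

After sourcing this, it would be called as extract_fields file. To get "0 G" instead of "-" for a missing MEMLIMIT, the initial value of memlimit could be changed, in which case the final test would need adjusting too.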