Separate complicated fields with awk

01-27-2009

Registered User

27, 0

Join Date: Dec 2008

Last Activity: 8 October 2014, 7:04 AM EDT

Location: Germany - Hamburg

Posts: 27

Thanks Given: 3

Thanked 0 Times in 0 Posts


Hello, I want to separate the fields of a log output like this:

11-JUL-2008 23:14:25 * (CONNECT_DATA=(SERVICE_NAME=WUMMER.IM.HERE.EXELLENT.COM)(CID=(PROGRAM=D:\oracle\product\10.2.0\client_1\jdk\jre\bin\java.exe)(HOST=X900005199)(USER=FTET1))) * (ADDRESS=(PROTOCOL=tcp)(HOST=45.137.251.223)(PORT=2196)) * establish * WUMMER.IM.HERE.EXELLENT.COM * 0
11-JUL-2008 23:20:20 * (CONNECT_DATA=(SID=P1VPMHAM)(CID=(PROGRAM=)(HOST=__jdbc__)(USER=))) * (ADDRESS=(PROTOCOL=tcp)(HOST=133.52.24.148)(PORT=1462)) * establish * WUMMER * 0

into:

$1 = 11-JUL-2008 23:14:25
$2 = (CONNECT_DATA=(SERVICE_NAME=WUMMER.IM.HERE.EXELLENT.COM)
$3 = (CID=(PROGRAM=D:\oracle\product\10.2.0\client_1\jdk\jre\bin\java.exe)
$4 = (HOST=X900005199)
$5 = (USER=FTET1)
$6 = (ADDRESS=(PROTOCOL=tcp)
$7 = (HOST=45.137.251.223)
$8 = (PORT=2196)

I've tried playing with the FS separator, with mixed results:
awk -F'(*[^(]*)' '{ print $1 " " $2 " " $3 }' listener.log

Does anyone have an idea for me? I think I need the correct regular expression.

sdohn


01-27-2009

Registered User

2,524, 241

Join Date: Dec 2007

Last Activity: 17 March 2020, 2:04 PM EDT

Posts: 2,524

Thanks Given: 173

Thanked 241 Times in 206 Posts

This may start you off...

What I did was replace any ( with ~( so I could use the ~ as a delimiter.

> cat file149
11-JUL-2008 23:14:25 * (CONNECT_DATA=(SERVICE_NAME=WUMMER.IM.HERE.EXELLENT.COM)(CID=(PROGRAM=D:\oracle\product\10.2.0\client_1\jdk\jre\bin\java.exe)(HOST=X900005199)(USER=FTET1))) * (ADDRESS=(PROTOCOL=tcp)(HOST=45.137.251.223)(PORT=2196)) * establish * WUMMER.IM.HERE.EXELLENT.COM * 0
11-JUL-2008 23:20:20 * (CONNECT_DATA=(SID=P1VPMHAM)(CID=(PROGRAM=)(HOST=__jdbc__)(USER=))) * (ADDRESS=(PROTOCOL=tcp)(HOST=133.52.24.148)(PORT=1462)) * establish * WUMMER * 0

Code:

> sed "s/(/~(/g" <file149 >file149.a
> awk -F"~" '{print "1="$1,"\n2="$2$3,"\n3="$4$5,"\n4="$6,"\n5="$7"\n"}' file149.a
1=11-JUL-2008 23:14:25 *  
2=(CONNECT_DATA=(SERVICE_NAME=WUMMER.IM.HERE.EXELLENT.COM) 
3=(CID=(PROGRAM=D:\oracle\product\10.2.0\client_1\jdk\jre\bin\java.exe) 
4=(HOST=X900005199) 
5=(USER=FTET1))) * 

1=11-JUL-2008 23:20:20 *  
2=(CONNECT_DATA=(SID=P1VPMHAM) 
3=(CID=(PROGRAM=) 
4=(HOST=__jdbc__) 
5=(USER=))) *
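
A possible variation on the tilde idea, offered only as a sketch: the substitution and the split can both happen inside one awk call, so no intermediate file is needed. It assumes that "~" never occurs in the log. The function name split_listener is made up for illustration, and the trimming of the trailing " * " and extra ")" is an addition that is not in the post above.

```shell
# Sketch: mark every "(" with "~" inside awk, then split the record on "~".
# Assumption: the log never contains a literal "~".
split_listener() {
    awk '{
        gsub(/\(/, "~(")              # same substitution as the sed step
        n = split($0, f, "~")         # f[1] = timestamp part, f[2..n] = groups
        sub(/ \* $/, "", f[1])        # drop the " * " that follows the timestamp
        sub(/\)+ \* $/, ")", f[7])    # drop the extra "))" and " * " after USER
        print "1=" f[1]
        print "2=" f[2] f[3]
        print "3=" f[4] f[5]
        print "4=" f[6]
        print "5=" f[7]
        print ""
    }' "$@"
}
```

It reads the files given as arguments, or standard input if there are none, for example: split_listener listener.log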

joeyg


01-27-2009

Registered User

486, 10

Join Date: Jan 2009

Last Activity: 9 September 2015, 11:46 AM EDT

Location: canton, michigan

Posts: 486

Thanks Given: 0

Thanked 10 Times in 10 Posts

an inelegant solution

Forget regular expressions. That isn't going to happen.
What you should probably do... is explain what you eventually want to do with
the variables. My initial questions are:
why awk?
why do they have to be in positions $1 through $8?
Once there, what do you want to do with them?
My point is -- the end result is what you're after -- hopefully -- not
whether we can put them in positions 1 through 8 for awk to do something with.
However, taking this nasty log file and converting it to your whims, like so:

cat << EOF |
11-JUL-2008 23:14:25 * (CONNECT_DATA=(SERVICE_NAME=WUMMER.IM.HERE.EXELLENT.COM)(CID=(PROGRAM=D:\oracle\product\10.2.0\client_1\jdk\jre\bin\java.exe)(HOST=X900005199)(USER=FTET1))) * (ADDRESS=(PROTOCOL=tcp)(HOST=45.137.251.223)(PORT=2196)) * establish * WUMMER.IM.HERE.EXELLENT.COM * 0
11-JUL-2008 23:20:20 * (CONNECT_DATA=(SID=P1VPMHAM)(CID=(PROGRAM=)(HOST=__jdbc__)(USER=))) * (ADDRESS=(PROTOCOL=tcp)(HOST=133.52.24.148)(PORT=1462)) * establish * WUMMER * 0
EOF
###---------------------------------------
### retain space for date, removed later on
###---------------------------------------
sed -e 's/ /@/' \
-e 's/)/) /g' \
|
###---------------------------------------
### convert all spaces to newlines
###---------------------------------------
tr ' ' '\012' |
###---------------------------------------
### delete blank lines, asterisk-only lines and parenthesis-only lines
###---------------------------------------
sed -e '/^$/d' \
-e '/^\*/d' \
-e '/^)$/d' \
|
###---------------------------------------
### some line numbering... restart at 1 on every date line,
### so each log record gets its own 1-8 (nl counts straight through)
###---------------------------------------
awk '/^[0-9]+-[A-Z]+-[0-9]+@/ { n = 0 } { print ++n, $0 }' |
###---------------------------------------
### grab only the 1-8 "fields"
###---------------------------------------
grep '^[1-8] ' |
###---------------------------------------
### convert to one line
### (read -r and printf keep the backslashes in the Windows path)
###---------------------------------------
while read -r num line; do
    printf '%s ' "$line"
    if [ "$num" -eq 8 ]; then
        echo
    fi
done |
###---------------------------------------
### and there they are... in positions 1-8
###---------------------------------------
awk 'BEGIN{ OFS="|"; }
{ print( $1, $2, $3, $4, $5, $6, $7, $8 ); }' |
###---------------------------------------
### oh. and remove the at sign for the date.
###---------------------------------------
sed -e 's/@/ /'

It's a complex mess, indeed.
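
For comparison, the same eight pipe-delimited fields can probably be produced by a single awk program. This is only a sketch, not the method from the post above. It assumes every record separates its top-level parts with " * " and that the parenthesised groups sit directly next to each other as ")(". The function name listener_to_pipes is hypothetical.

```shell
# Sketch: split each record on " * ", then cut the two parenthesised parts
# at every ")(" boundary. Assumes neither " * " nor ")(" occurs inside a value.
listener_to_pipes() {
    awk -F' [*] ' '{
        cd = $2                       # the (CONNECT_DATA=...) part
        addr = $3                     # the (ADDRESS=...) part
        sub(/\)+$/, ")", cd)          # collapse the trailing ")))" to one ")"
        sub(/\)+$/, ")", addr)
        gsub(/\)\(/, ")|(", cd)       # put a pipe between adjacent groups
        gsub(/\)\(/, ")|(", addr)
        print $1 "|" cd "|" addr      # $1 is the timestamp
    }' "$@"
}
```

Called as listener_to_pipes listener.log, it prints one line per log record, with the date and time kept together in the first field.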

Last edited by quirkasaurus; 01-27-2009 at 11:52 AM..

quirkasaurus


01-27-2009


Quote:

Originally Posted by joeyg

What I did was replace any ( with ~( so I could use the ~ as a delimiter.

Thanks a lot, joeyg, for your solution. Now I can go on to strip out the parts of each line that I don't want.

Best regards from sdohn

sdohn


01-27-2009


i like the tilde solution, too. even better!

but figured i'd post mine anyways -- hopefully some of the ideas are valuable...

quirkasaurus


01-27-2009


Quote:

Originally Posted by quirkasaurus

Thank you for your solution to this complex problem.
My reason was to separate the values so that I can put them into a database. Now I can run SQL reports on the data.

Best regards from sdohn

sdohn


01-27-2009


Cool. Then the script is useful. It converts everything to a pipe-delimited output.
Just load from there.
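
If the database has no bulk loader for delimited files, one way to "load from there" might be to turn each pipe-delimited line into an INSERT statement. This is a rough sketch under several assumptions: the table name listener_log and the function name pipes_to_inserts are invented for illustration, the table's columns are assumed to be in the same order as the fields, and only single quotes are escaped (by doubling, as in standard SQL).

```shell
# Sketch: read pipe-delimited lines on stdin and print one INSERT per line.
# $1 is an optional table name; "listener_log" is a hypothetical default.
pipes_to_inserts() {
    awk -F'|' -v table="${1:-listener_log}" -v q="'" '{
        line = ""
        for (i = 1; i <= NF; i++) {
            v = $i
            gsub(q, q q, v)               # double single quotes for SQL
            sep = (i > 1) ? ", " : ""
            line = line sep q v q         # wrap the value in single quotes
        }
        print "INSERT INTO " table " VALUES (" line ");"
    }'
}
```

The pipe-delimited output of the script can then be piped straight into pipes_to_inserts and the result fed to the database's command-line client.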

quirkasaurus
