Using AWK BEGIN to extract file header info into variables


 
# 1  
Old 05-27-2010

Hi Folks,

I've searched for this for quite a while, but can't find any solution - hope someone can help.

I have various files with standard headers, e.g.:

Code:
<HEADER>
IP: 1.2.3.4
Username: Joe
Time: 12:00:00
Date: 23/05/2010
</HEADER>

This
is
a
test
and this part can be any size
<END>

Now, I want to process and transpose this into:

Code:
IP=1.2.3.4 User=Joe Time=12:00:00 Line=This
IP=1.2.3.4 User=Joe Time=12:00:00 Line=is
IP=1.2.3.4 User=Joe Time=12:00:00 Line=a
IP=1.2.3.4 User=Joe Time=12:00:00 Line=test
IP=1.2.3.4 User=Joe Time=12:00:00 Line=and this part can be any size

I thought AWK would be the way to go because of the BEGIN block. However, I can't find any info on whether it can:
- Read variables in from the file in the BEGIN section
- Read variables using, say, a regex or a fixed position (e.g. $my_ip=someregexfunction("IP\: (\d+.\d+.\d+.\d+)"))

Can anyone advise if this is possible? Or if not, is there another tool I can try?

Thanks!


Damian
# 2  
Old 05-27-2010
Quote:
Originally Posted by damoske
...is there another tool I can try?
...
Here's a Perl solution -

Code:
$ 
$ 
$ cat f2
<HEADER>
IP: 1.2.3.4
Username: Joe
Time: 12:00:00
Date: 23/05/2010
</HEADER>

This
is
a
test
and this part can be any size
<END>
$ 
$ 
$ perl -ne 'chomp;
  if (/<HEADER>/) {$in=1}
  elsif ($in and /^IP: (.*)$/) {$ip=$1}
  elsif ($in and /^Username: (.*)$/) {$user=$1}
  elsif ($in and /^Time: (.*)$/) {$time=$1}
  elsif (/<\/HEADER>/) {$in=0}
  elsif (!$in and !/^\s*$/ and !/<END>/) {print "IP=$ip User=$user Time=$time Line=$_\n"}' f2
IP=1.2.3.4 User=Joe Time=12:00:00 Line=This
IP=1.2.3.4 User=Joe Time=12:00:00 Line=is
IP=1.2.3.4 User=Joe Time=12:00:00 Line=a
IP=1.2.3.4 User=Joe Time=12:00:00 Line=test
IP=1.2.3.4 User=Joe Time=12:00:00 Line=and this part can be any size
$ 
$

tyler_durden
# 3  
Old 05-27-2010
Thanks! This is on a locked-down system, so I don't know if perl is available, or what flavour. I considered awk since I know they have it.

I actually got it started using something like:

Code:
BEGIN {
 getline line1
 sub("IP: ","",line1)
 getline line2
 sub(......
 }

{ printf ("ip=%s Username=%s %s",line1,line2, $0) }

BUT - the output seems corrupted, with the leading variables sometimes missing and sometimes overlaid on top of the other characters. Really weird. I wonder if such use of variables and $0 isn't supported...

This is on CentOS right now, but I have to do it on various commercial *nix systems.
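
For reference, a minimal sketch of the BEGIN/getline idea described above, not a confirmed fix for the corruption. It assumes the header layout from post #1, and it assumes the overlaid output may come from carriage returns (CRLF line endings), which nothing in this thread confirms. The function name transpose_with_header and the file name sample_header.txt are hypothetical. Note that awk requires the opening brace to sit on the same line as BEGIN.

```shell
# Sample input matching the layout from post #1 (file name is hypothetical).
cat > sample_header.txt <<'EOF'
<HEADER>
IP: 1.2.3.4
Username: Joe
Time: 12:00:00
Date: 23/05/2010
</HEADER>

This
is
a
test
and this part can be any size
<END>
EOF

# transpose_with_header FILE (hypothetical helper name)
transpose_with_header() {
  awk '
  BEGIN {
    # Plain getline (no redirection) consumes the main input, so the
    # rules below start at the first line after the header.
    while ((getline line) > 0) {
      sub(/\r$/, "", line)            # assumption: input may have CRLF endings
      if (line ~ /^<\/HEADER>/) break
      if (line ~ /^IP: /)       { ip = line;   sub(/^IP: /, "", ip) }
      if (line ~ /^Username: /) { user = line; sub(/^Username: /, "", user) }
      if (line ~ /^Time: /)     { time = line; sub(/^Time: /, "", time) }
    }
  }
  { sub(/\r$/, "") }                  # same CRLF assumption for body lines
  /^$/ || /^<END>/ { next }           # skip blank lines and the end marker
  { printf "IP=%s User=%s Time=%s Line=%s\n", ip, user, time, $0 }
  ' "$1"
}

transpose_with_header sample_header.txt
```

This relies on POSIX getline semantics in BEGIN; very old vendor awk implementations may behave differently, so it is worth checking on each target system.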


Damian
# 4  
Old 05-27-2010
I'm pretty sure it can be done with awk, though I don't know how :)
This is my admittedly "amateur" approach with standard tools; maybe it can help you translate it to awk?
Code:
$ string=$(grep -E '^IP|^User|^Time' infile | sed 's/Username/User/;s/: /=/' | tr "\n" " ")
$ sed -n '/<\/HEADER>/,/<END>/p' infile | sed 's/<\/HEADER>//;s/<END>//;/^$/d;s/^/Line=/' | \
> while read line; do echo "$string$line"; done
IP=1.2.3.4 User=Joe Time=12:00:00 Line=This
IP=1.2.3.4 User=Joe Time=12:00:00 Line=is
IP=1.2.3.4 User=Joe Time=12:00:00 Line=a
IP=1.2.3.4 User=Joe Time=12:00:00 Line=test
IP=1.2.3.4 User=Joe Time=12:00:00 Line=and this part can be any size
$

# 5  
Old 05-28-2010
Quote:
Originally Posted by damoske
...This is on a locked-down system, so I don't know if perl is available, or what flavour. I considered awk since I know they have it. ...
Well, you could apply the logic to awk in that case -

Code:
$ 
$ 
$ cat f2
<HEADER>
IP: 1.2.3.4
Username: Joe
Time: 12:00:00
Date: 23/05/2010
</HEADER>

This
is
a
test
and this part can be any size
<END>
$ 
$ 
$ awk '{
  if (/<HEADER>/) {x=1}
  else if (x==1 && /^IP/) {sub("IP: ","",$0); ip=$0}
  else if (x==1 && /^Username/) {sub("Username: ","",$0); user=$0}
  else if (x==1 && /^Time/) {sub("Time: ","",$0); time=$0}
  else if (/<\/HEADER>/) {x=0}
  else if (x==0 && !/<END>/ && !/^ *$/) {print "IP="ip" User="user" Time="time" Line="$0}
}' f2
IP=1.2.3.4 User=Joe Time=12:00:00 Line=This
IP=1.2.3.4 User=Joe Time=12:00:00 Line=is
IP=1.2.3.4 User=Joe Time=12:00:00 Line=a
IP=1.2.3.4 User=Joe Time=12:00:00 Line=test
IP=1.2.3.4 User=Joe Time=12:00:00 Line=and this part can be any size
$ 
$

I know that looks kludgy, and I do hope the excellent awk scripters on this forum will come up with a more polished and elegant script.

tyler_durden
# 6  
Old 05-28-2010
Awk:

Code:
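awk -F ": " '
# Inside the header block, capture the fields we need, then skip the line.
/<HEADER>/,/<\/HEADER>/ {
  if ($0 ~ /^IP/)       { ip = $2 }
  if ($0 ~ /^Username/) { use = $2 }
  if ($0 ~ /^Time/)     { time = $2 }
  next
}
# Print every non-empty body line except the <END> marker.
NF && !/<END>/ { print "IP=" ip " User=" use " Time=" time " Line=" $0 }' filename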


cheers,
Devaraj Takhellambam
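
On the regex / fixed-position part of the question in post #1: POSIX awk has no \d shorthand, but match() together with RSTART and RLENGTH, or substr() on its own, can pull a value out of a line. A minimal sketch, where the helper name extract_ip and the file name sample_ip.txt are hypothetical:

```shell
# extract_ip FILE (hypothetical helper): print the first dotted quad
# found on a line starting with "IP:".
extract_ip() {
  awk '
  /^IP:/ {
    # match() sets RSTART and RLENGTH to the position and length of the hit.
    if (match($0, /[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+/)) {
      print substr($0, RSTART, RLENGTH)
      exit
    }
    # Fixed-position alternative: substr($0, 5) is everything after "IP: ".
  }' "$1"
}

printf 'IP: 1.2.3.4\nUsername: Joe\n' > sample_ip.txt
extract_ip sample_ip.txt
```

The bracket expression [0-9] is used instead of \d because the latter is a Perl-style extension that awk's extended regular expressions do not define.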
# 7  
Old 06-14-2010
A belated, sincere thanks for these, guys; I was on 'radio silence' for a couple of weeks, but I'll give these a go.


Damian