The UNIX and Linux Forums  

  #1 (permalink)  
Old 05-16-2008
Registered User
 

Join Date: May 2008
Posts: 4
parse of lines with different delimiters

Hi,

I have a huge file with lines like the following.

2007:10:01:00:00:49:GMT: subject=BMRA.BM.T_ABTH7.FPN, message={SD=2007:10:01:00:00:00:GMT,SP=5,NP=2,TS=2007:10:01:01:00:00:GMT,VP=0.0,TS=2007:10:01:01:30: 00:GMT,VP=0.0}
2007:10:01:00:00:49:GMT: subject=BMRA.BM.T_ABTH7G.FPN, message={SD=2007:10:01:00:00:00:GMT,SP=5,NP=2,TS=2007:10:01:01:00:00:GMT,VP=0.0,TS=2007:10:01:01:30: 00:GMT,VP=0.0}


I need to parse them into the following format.

2007-10-01,T_ABTH7,2007-10-0100:00:00,5,0.0
2007-10-01,T_ABTH7G,2007-10-0100:00:00,5,0.0

Is there a way to parse the entire file and format the output without reading the file line by line?

Thanks in advance.
  #2 (permalink)  
Old 05-16-2008
Registered User
 

Join Date: Apr 2008
Location: Bangalore
Posts: 120
There are many occurrences of 2 and 0.0 in the input lines.

Please highlight the portions of the input line that you want to report.


Thanks
Penchal
  #3 (permalink)  
Old 05-16-2008
Registered User
 

Join Date: May 2008
Posts: 4
Hi Penchal,

I want to parse the highlighted values.

2007:10:01:00:00:49:GMT: subject=BMRA.BM.T_ABTH7.FPN, message={SD=2007:10:01:00:00:00:GMT,SP=5,NP=2,TS=2007:10:01:01:00:00:GMT,VP=0.0,TS=2007:10:01:01:30: 00:GMT,VP=0.0}

First output column (2007-10-01): SD=2007:10:01:00:00:00

Second column (T_ABTH7): subject=BMRA.BM.T_ABTH7.FPN

Third column (2007-10-0100:00:00): TS=2007:10:01:01:00:00

Fourth column (5): SP=5

Fifth column (0.0): VP=0.0

Output for a single line:
2007-10-01,T_ABTH7,2007-10-0100:00:00,5,0.0

Please let me know if this is clear.
  #4 (permalink)  
Old 05-16-2008
Moderator
 

Join Date: Dec 2003
Location: /ksh93
Posts: 777
One way is to use the pattern matching and substitution capabilities of ksh93 or bash:
Code:
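#!/usr/bin/ksh93

# Split each line on commas; -r keeps backslashes literal.
while IFS=',' read -r c1 c2 c3 c4 c5 c6 rest
do
   tmp1=${c2##*SD=}              # SD value: 2007:10:01:00:00:00:GMT
   tmp1=${tmp1%%:00:00:00:GMT}   # keep the date: 2007:10:01
   tmp2=${c1##*=BMRA.BM.}        # T_ABTH7.FPN
   tmp3=${c5##TS=}               # first TS value
   tmp3=${tmp3%%:GMT}            # 2007:10:01:01:00:00
   tmp3A=${tmp3:0:10}            # date part; substring expansion needs ksh93 or bash
   printf "%s,%s,%s%s,%s,%s\n" "${tmp1//:/-}" "${tmp2%%.FPN}" "${tmp3A//:/-}" "${tmp3:11}" "${c3##SP=}" "${c6##VP=}"
done < file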
Output:
Code:
2007-10-01,T_ABTH7,2007-10-0101:00:00,5,0.0
2007-10-01,T_ABTH7G,2007-10-0101:00:00,5,0.0
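The script above depends on where each value lands once a line is split on commas. A quick way to check that layout on your own data is sketched below; show_fields is a made-up helper name, and the only assumption is that commas are the separators, as in the sample lines.

```shell
# Sketch: print each comma-separated field with its index.
# show_fields is a hypothetical helper name, not part of any answer in this thread.
show_fields() {
    awk -F',' '{ for (i = 1; i <= NF; i++) printf "%d: %s\n", i, $i }' "$@"
}
```

Run it as `head -1 file | show_fields`. On the sample lines, field 2 holds SD, field 3 holds SP, field 5 holds the first TS and field 6 holds the first VP, which is why the script reads c2, c3, c5 and c6.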
  #5 (permalink)  
Old 05-16-2008
Registered User
 

Join Date: May 2008
Posts: 4
Hi Murphy,

Thanks for the reply.

I am getting the following error message while running the script.

tmp3A=${tmp3:0:10}: bad substitution
  #6 (permalink)  
Old 05-16-2008
Registered User
 

Join Date: Oct 2006
Location: Belgium
Posts: 170
Solution with gawk:

Code:
#!/usr/bin/awk -f
# gensub() is a gawk extension, so awk must be gawk here.
BEGIN {FS=","; OFS=","}
{
          print \
                  gensub(/^.+([0-9][0-9][0-9][0-9]:[0-9][0-9]:[0-9][0-9]).+$/, "\\1", 1, $2),
                  gensub(/^.+subject=BMRA\.BM\.(.+)\.FPN/, "\\1", 1, $1),
                  gensub(/^TS=(.+):GMT/, "\\1", 1, $5),
                  gensub(/^SP=(.+)/, "\\1", 1, $3),
                  gensub(/^VP=(.+)/, "\\1", 1, $6)
}
If your version of awk doesn't support gensub(), there is a solution with substr() and match(). Let me know.
  #7 (permalink)  
Old 05-16-2008
Registered User
 

Join Date: May 2008
Posts: 4
Hi Ripat,

Thanks for the reply. Unfortunately I am not able to run the solution. I think gensub() is not supported. Could you send me the other solution that you have with substr() and match()?

Thanks
  #8 (permalink)  
Old 05-16-2008
Registered User
 

Join Date: Jan 2007
Posts: 24
It's not the nicest solution, but you can try:

Code:
#! /usr/bin/ksh
#set -x
# Start with empty work files and an empty result file.
: > file_1
: > file_2
: > large
: > final
# Collect the lines containing "subject" and "message" separately,
# then join them side by side.
while read -r line
do
        echo "$line" | grep subject >> file_1
        echo "$line" | grep message >> file_2
done < "$1"
paste file_2 file_1 > large
# Values are cut at fixed byte offsets, so the layout must match exactly.
while read -r line
do
SUB=$(echo "$line" | awk -F"." '{print $5}')
SD=$(echo "$line" | cut -b13-22 | sed 's/:/-/g')
SP=$(echo "$line" | cut -b40)
TS=$(echo "$line" | cut -b84-99 | sed 's/:/-/' | sed 's/:/-/')
VP=$(echo "$line" | cut -b112-114)
echo "$SD,$SUB,$TS,$SP,$VP" >> final
done < large
rm large file_1 file_2

- Execution: ./script file_name
- The result is written to the file final
  #9 (permalink)  
Old 05-16-2008
Registered User
 

Join Date: Oct 2006
Location: Belgium
Posts: 170
Without gensub:

Code:
#!/usr/bin/awk -f
BEGIN {FS=","; OFS=","}
{
                # RSTART + RLENGTH is the first character after "subject=BMRA.BM."
                match($1, /subject=BMRA\.BM\./)
                one = substr($1, RSTART + RLENGTH)
                sub(/\..+/, "", one)    # drop ".FPN"
                print substr($2, match($2, /[0-9][0-9][0-9][0-9]:[0-9][0-9]:[0-9][0-9]/), 10),
                one,
                substr($5, match($5, /TS=/) + 3, 19),
                substr($3, match($3, /SP=/) + 3),
                substr($6, match($6, /VP=/) + 3)
}
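The awk answers above print the dates with colons, as they appear in the input. Below is a minimal sketch that also produces the hyphenated layout requested in the first post, using only sub(), gsub() and substr(), so gensub() is not needed. The function name parse_awk is made up for illustration, and the field positions are assumed to match the sample lines.

```shell
# Sketch: POSIX awk only (no gensub). Field positions are assumed to match the sample.
# parse_awk is a hypothetical name chosen for this example.
parse_awk() {
    awk -F',' '
    {
        subj = $1
        sub(/^.*subject=BMRA\.BM\./, "", subj)   # drop everything up to the unit name
        sub(/\.FPN.*$/, "", subj)                # drop the .FPN suffix

        sd = $2
        sub(/^.*SD=/, "", sd)                    # 2007:10:01:00:00:00:GMT
        sd = substr(sd, 1, 10)                   # 2007:10:01
        gsub(/:/, "-", sd)                       # 2007-10-01

        ts = $5
        sub(/^TS=/, "", ts)                      # 2007:10:01:01:00:00:GMT
        tsdate = substr(ts, 1, 10)               # 2007:10:01
        gsub(/:/, "-", tsdate)                   # 2007-10-01
        tstime = substr(ts, 12, 8)               # 01:00:00

        sp = $3; sub(/^SP=/, "", sp)
        vp = $6; sub(/^VP=/, "", vp)

        print sd "," subj "," tsdate tstime "," sp "," vp
    }' "$@"
}
```

Usage: `parse_awk file > final`. With no file argument the function reads standard input. Because awk streams the file itself, no shell loop over the lines is needed.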
  #10 (permalink)  
Old 05-16-2008
Moderator
 

Join Date: Dec 2003
Location: /ksh93
Posts: 777
Originally Posted by nathasha:
tmp3A=${tmp3:0:10}: bad substitution

Then you either are not using a recent version of ksh93 (later than 1999) or your data structure has changed from what you provided as a sample.
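
If a newer ksh93 or bash is not available, the same extraction can be sketched with only the POSIX parameter expansions (#, ##, %, %%), which avoids the ${var:offset:length} form that triggers "bad substitution" in older shells. The names parse_posix and split_stamp are made up for this example, and the field positions are assumed to match the sample lines.

```shell
#!/bin/sh
# Sketch: POSIX parameter expansion only, no ${var:offset:length}.
# parse_posix and split_stamp are hypothetical names chosen for this example.

# Split "YYYY:MM:DD:HH:MM:SS:GMT" into two global variables:
#   stamp_date (YYYY-MM-DD) and stamp_time (HH:MM:SS)
split_stamp() {
    _y=${1%%:*};  _r=${1#*:}      # year, then the remainder
    _m=${_r%%:*}; _r=${_r#*:}     # month
    _d=${_r%%:*}; _r=${_r#*:}     # day; _r is now HH:MM:SS:GMT
    stamp_date="$_y-$_m-$_d"
    stamp_time=${_r%:GMT}
}

parse_posix() {
    while IFS=',' read -r c1 c2 c3 c4 c5 c6 rest
    do
        split_stamp "${c2##*SD=}"     # settlement date from field 2
        sd_date=$stamp_date
        split_stamp "${c5#TS=}"       # first TS value from field 5
        subj=${c1##*=BMRA.BM.}        # T_ABTH7.FPN
        printf '%s,%s,%s%s,%s,%s\n' \
            "$sd_date" "${subj%%.FPN*}" "$stamp_date" "$stamp_time" \
            "${c3#SP=}" "${c6#VP=}"
    done
}
```

Usage: `parse_posix < file > final`. Everything is done with shell built-ins, so no external process is started per line, which matters for a huge file.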


