parse of lines with different delimiters
Hi,
I have a huge file with lines like the following:

2007:10:01:00:00:49:GMT: subject=BMRA.BM.T_ABTH7.FPN, message={SD=2007:10:01:00:00:00:GMT,SP=5,NP=2,TS=2007:10:01:01:00:00:GMT,VP=0.0,TS=2007:10:01:01:30: 00:GMT,VP=0.0}
2007:10:01:00:00:49:GMT: subject=BMRA.BM.T_ABTH7G.FPN, message={SD=2007:10:01:00:00:00:GMT,SP=5,NP=2,TS=2007:10:01:01:00:00:GMT,VP=0.0,TS=2007:10:01:01:30: 00:GMT,VP=0.0}

I need to parse them into the following format:

2007-10-01,T_ABTH7,2007-10-0100:00:00,5,0.0
2007-10-01,T_ABTH7G,2007-10-0100:00:00,5,0.0

Is there a way to parse the entire file and format the output without reading the file one line at a time?

Thanks in advance.
Hi Penchal,
I want to parse the highlighted values.

2007:10:01:00:00:49:GMT: subject=BMRA.BM.T_ABTH7.FPN, message={SD=2007:10:01:00:00:00:GMT,SP=5,NP=2,TS=2007:10:01:01:00:00:GMT,VP=0.0,TS=2007:10:01:01:30: 00:GMT,VP=0.0}

First output column (2007-10-01): SD=2007:10:01:00:00:00
Second column (T_ABTH7): subject=BMRA.BM.T_ABTH7.FPN
Third column (2007-10-0100:00:00): TS=2007:10:01:01:00:00
Fourth column (5): SP=5
Fifth column (0.0): VP=0.0

Output for a single line:

2007-10-01,T_ABTH7,2007-10-0100:00:00,5,0.0

Please let me know if this is clear.
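One way is to use the pattern matching and substitution capabilities of ksh93 or bash:
Code:
#!/usr/bin/ksh93
# Split each record on commas; fields beyond the sixth land in "rest".
while IFS=',' read -r c1 c2 c3 c4 c5 c6 rest
do
    # c2 holds " message={SD=<date>:<time>:GMT"; keep only the date.
    # Note: this suffix pattern assumes the SD time is always 00:00:00.
    tmp1=${c2##*SD=}
    tmp1=${tmp1%%:00:00:00:GMT}
    # c1 ends with "subject=BMRA.BM.<name>.FPN"; drop the prefix here.
    tmp2=${c1##*=BMRA.BM.}
    # c5 holds the first "TS=<date>:<time>:GMT" pair.
    tmp3=${c5##TS=}
    tmp3=${tmp3%%:GMT}
    tmp3A=${tmp3:0:10}    # date part, e.g. 2007:10:01
    # ${var//:/-} replaces every colon with a dash; ${tmp3:11} is the time.
    printf "%s,%s,%s%s,%s,%s\n" "${tmp1//:/-}" "${tmp2%%.FPN}" "${tmp3A//:/-}" "${tmp3:11}" "${c3##SP=}" "${c6##VP=}"
done < file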
Output:
2007-10-01,T_ABTH7,2007-10-0101:00:00,5,0.0
2007-10-01,T_ABTH7G,2007-10-0101:00:00,5,0.0
Solution with gawk:
Code:
#!/usr/bin/awk -f
# gensub() is a gawk extension, so awk must be gawk here.
# The date fields keep the ":" separators of the input.
BEGIN { FS = ","; OFS = "," }
{
    print \
        gensub(/^.+([0-9][0-9][0-9][0-9]:[0-9][0-9]:[0-9][0-9]).+$/, "\\1", 1, $2),
        gensub(/^.+subject=BMRA\.BM\.(.+)\.FPN/, "\\1", 1, $1),
        gensub(/^TS=(.+):GMT/, "\\1", 1, $5),
        gensub(/^SP=(.+)/, "\\1", 1, $3),
        gensub(/^VP=(.+)/, "\\1", 1, $6)
}
It is not the nicest solution, but you can try:
Code:
#!/usr/bin/ksh
#set -x
# Start from empty work files and an empty result file.
: > file_1
: > file_2
: > large
: > final
# Collect the "subject" lines in file_1 and the "message" lines in file_2.
while read -r line
do
    echo "$line" | grep subject >> file_1
    echo "$line" | grep message >> file_2
done < "$1"
# Put each message next to its subject: message first, then a tab, then subject.
paste file_2 file_1 > large
# The byte offsets below are counted from the start of the message text in
# the sample as posted, so they only hold for records with that exact layout.
while read -r line
do
    SUB=$(echo "$line" | awk -F"." '{print $5}')
    SD=$(echo "$line" | cut -b13-22 | sed 's/:/-/g')
    SP=$(echo "$line" | cut -b40)
    TS=$(echo "$line" | cut -b84-99 | sed 's/:/-/' | sed 's/:/-/')
    VP=$(echo "$line" | cut -b112-114)
    echo "$SD,$SUB,$TS,$SP,$VP" >> final
done < large
rm large file_1 file_2
Execution: ./script file_name
The result is written to the file final.
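Since the cut -b offsets in the script above break as soon as a value changes width, here is a minimal sketch of looking fields up by key name instead. The helper name first_value is made up for illustration, and it assumes the values never contain a comma or a brace. It returns the first occurrence of a key, so for TS it picks the first of the two timestamps.

```shell
#!/bin/sh
# first_value KEY TEXT: print the value of the first KEY=... pair in TEXT.
# Hypothetical helper for illustration; not part of the script above.
first_value() {
    # Put every key=value pair on its own line, then keep the first match.
    printf '%s\n' "$2" |
        tr '{},' '\n\n\n' |
        sed -n "s/^$1=//p" |
        head -n 1
}
```

With a message line stored in a variable line, a call such as first_value SP "$line" would replace the cut -b40 step, and the result can still be piped through sed to turn the colons into dashes.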
Without gensub:
Code:
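#!/usr/bin/awk -f
# Uses only match(), substr() and sub(), so it does not need gawk.
# The date fields keep the ":" separators of the input.
BEGIN { FS = ","; OFS = "," }
{
    # "subject=BMRA.BM." is 16 characters long, so the name starts 16
    # characters after the position returned by match().
    one = substr($1, match($1, /subject=BMRA\.BM\./) + 16)
    sub(/\..+/, "", one)    # drop the trailing ".FPN"
    print substr($2, match($2, /[0-9][0-9][0-9][0-9]:[0-9][0-9]:[0-9][0-9]/), 10),
        one,
        substr($5, match($5, /TS=/) + 3, 19),
        substr($3, match($3, /SP=/) + 3),
        substr($6, match($6, /VP=/) + 3)
}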
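Both awk answers print the dates with colons, while the requested format uses dashes. Below is a rough sketch of one way to get the dashed format in a single pass with plain POSIX awk. The names parse_records and value are invented for this example, and joining date and time without a separator simply copies the output of the ksh93 answer, which is an assumption about what is wanted.

```shell
#!/bin/sh
# Sketch: one awk pass using POSIX awk features only (no gensub).
# parse_records is a hypothetical helper; it reads records on stdin.
parse_records() {
    awk -F',' '
    # Strip everything up to and including "key=" from a field.
    function value(field, key) {
        sub("^.*" key "=", "", field)
        return field
    }
    {
        sd = substr(value($2, "SD"), 1, 10)    # 2007:10:01
        gsub(/:/, "-", sd)                     # 2007-10-01

        unit = $1
        sub(/^.*subject=BMRA\.BM\./, "", unit) # drop the prefix
        sub(/\.FPN.*$/, "", unit)              # drop the suffix

        ts = value($5, "TS")                   # first TS pair
        ts_date = substr(ts, 1, 10)
        gsub(/:/, "-", ts_date)
        ts_time = substr(ts, 12, 8)            # hh:mm:ss

        # Date and time are joined without a separator (assumption).
        print sd "," unit "," ts_date ts_time "," value($3, "SP") "," value($6, "VP")
    }'
}
```

It would be run as parse_records < file > final. For the first sample line it should give 2007-10-01,T_ABTH7,2007-10-0101:00:00,5,0.0.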
Quote (originally posted by nathasha):
tmp3A=${tmp3:0:10}: bad substitution

Then you are either not using a recent version of ksh93 (later than 1999), or your data structure has changed from what you provided as a sample.
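If upgrading the shell is not an option, the substring step can probably be done without ${var:offset:length}, which is an extension of ksh93 and bash rather than part of POSIX sh. The sketch below uses only the POSIX # and % expansions plus tr. The function name split_stamp and the variables stamp_date and stamp_time are made up for this example, and it assumes the stamp always has the form YYYY:MM:DD:hh:mm:ss.

```shell
#!/bin/sh
# Split a "YYYY:MM:DD:hh:mm:ss" stamp without ${var:offset:length}.
# split_stamp is a hypothetical helper; it sets stamp_date and stamp_time.
split_stamp() {
    stamp_time=${1#*:*:*:}            # drop "YYYY:MM:DD:" -> hh:mm:ss
    stamp_date=${1%":$stamp_time"}    # drop ":hh:mm:ss"   -> YYYY:MM:DD
    # tr stands in for ${var//:/-}, which is not POSIX either.
    stamp_date=$(printf '%s' "$stamp_date" | tr ':' '-')
}

split_stamp "2007:10:01:01:00:00"
printf '%s%s\n' "$stamp_date" "$stamp_time"    # prints 2007-10-0101:00:00
```

In the ksh93 script, the value passed in would be tmp3 after the :GMT suffix has been removed.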