looking for expert sed/script help for translation/substitution


 
# 1  
Old 09-10-2010

Hi All,

I'm looking for some expert help with sed or shell scripting to work out the best way to transform one XML format into another. The translation has a few complications.

The complications are:
1) Take the start and stop times (YYYYMMDDHHMMSS), convert the start time to Unix time, and output the difference between the two times in seconds.
2) Find oid, tsid and sid by looking up the channel in an external file. For example, one of the lines in the file will be 2:806:27e2=channel1

Is there any way to write piped sed commands that can do this? If not, any ideas on what the script should look like?

Thanks in advance.

Input File
Code:
<programme start="20100910060000 +0100" stop="20100910061000 +0100" channel="channel1">
<title lang="en">This is the title</title>
<desc>This is the description</desc>
</programme>

Output File
Code:
<service oid="0002" tsid="0806" sid="27e2">
<event id="0">
<name lang="OFF" string="This is the title"/>
<text lang="OFF" string="This is the description"/>
<time start_time="1284098400" duration="600"/>
</event>
</service>

Lookup file for oid, tsid and sid
Code:
2:806:27e2=channel1
2:756:37a3=channel2
5:4a06:42e5=channel3
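
A minimal sketch of the lookup in plain sh, assuming every line of the lookup file has exactly the form oid:tsid:sid=channel with hexadecimal ids. The function name lookup_ids is hypothetical and not part of the thread; it prints the three ids zero-padded to four hex digits, as in the sample output.

```shell
# Hypothetical helper: print the zero-padded oid, tsid and sid for one channel.
# $1 = channel name, $2 = lookup file with lines like 2:806:27e2=channel1
lookup_ids() {
    # Splitting on both ':' and '=' yields the three ids and the channel name.
    while IFS=':=' read -r oid tsid sid chan; do
        if [ "$chan" = "$1" ]; then
            # The 0x prefix makes printf read each id as hexadecimal.
            printf '%04x %04x %04x\n' "0x$oid" "0x$tsid" "0x$sid"
            return 0
        fi
    done < "$2"
    return 1    # channel not found
}
```

The exact comparison on the channel name means channel1 does not also match a channel such as channel10.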


Last edited by hotbaws11; 09-10-2010 at 09:47 AM..
# 2  
Old 09-10-2010
How many records are in each file, roughly?
# 3  
Old 09-10-2010
One of my files has 22k programmes, and I only expect to process one file per day.
# 4  
Old 09-11-2010
This should be enough to get you started down the right track.

Code:
XMLINPUTFILE=`cat /tmp/xmlinputfile.txt`
OIDTSIDSIDREF=`cat /tmp/oidtsidsid.txt`
STARTTIME=`echo $XMLINPUTFILE | sed -e '/programme start/!d' -e 's/<programme start="//' -e 's/ .*//g'`
STOPTIME=`echo $XMLINPUTFILE | sed -e '/programme start/!d' -e 's/.*stop="//' -e 's/ .*//g'`
CHANNELNO=`echo $XMLINPUTFILE | sed -e '/programme start/!d' -e 's/..*channel="//' -e 's/".*//'`
# -x matches whole lines only, so channel1 does not also pick up channel10
OIDCHANNEL=`echo "$OIDTSIDSIDREF" | grep -x ".*=$CHANNELNO"`

# Split the oid:tsid:sid key; y/// uppercases each hex letter so bc accepts it
OIDVAL=`echo "$OIDCHANNEL" | sed -e 's/:.*//' -e 'y/abcdef/ABCDEF/'`
TSIDVAL=`echo "$OIDCHANNEL" | sed -e 's/^[^:]*://' -e 's/:.*//' -e 'y/abcdef/ABCDEF/'`
SIDVAL=`echo "$OIDCHANNEL" | sed -e 's/=.*//' -e 's/.*://' -e 'y/abcdef/ABCDEF/'`

OIDVAL=`echo "ibase=16; $OIDVAL" | bc`
TSIDVAL=`echo "ibase=16; $TSIDVAL" | bc`
SIDVAL=`echo "ibase=16; $SIDVAL" | bc`

OIDVAL=`printf "%04X" $(echo "$OIDVAL + 0" | bc -l ) | sed -e 'y/ABCDEF/abcdef/'`
TSIDVAL=`printf "%04X" $(echo "$TSIDVAL + 0" | bc -l ) | sed -e 'y/ABCDEF/abcdef/'`
SIDVAL=`printf "%04X" $(echo "$SIDVAL + 0" | bc -l ) | sed -e 'y/ABCDEF/abcdef/'`

STARTSTRING=`echo $STARTTIME | sed -e 's/...[0-9]/&-/' -e 's/.*-.[0-9]/&-/' -e 's/.*-.[0-9]/& /' -e 's/..* [0-9][0-9]/&:/' -e 's/..*:../&:/'`
STOPSTRING=`echo $STOPTIME | sed -e 's/...[0-9]/&-/' -e 's/.*-.[0-9]/&-/' -e 's/.*-.[0-9]/& /' -e 's/..* [0-9][0-9]/&:/' -e 's/..*:../&:/'`
STARTUNIX=`date -d "$STARTSTRING" +%s`
STOPUNIX=`date -d "$STOPSTRING" +%s`
DIFFERENCE=`expr $STOPUNIX - $STARTUNIX`

echo "<service oid="'"'"$OIDVAL"'"'" tsid="'"'"$TSIDVAL"'"'" sid="'"'"$SIDVAL"'"'">"
echo "<time start_time="'"'"$STARTUNIX"'"'" duration="'"'"$DIFFERENCE"'"'"/>"

Have fun

---------- Post updated at 04:46 AM ---------- Previous update was at 04:41 AM ----------

A note on the first OIDVAL and TSIDVAL lines: the final sed expression has to be 'y/abcdef/ABCDEF/', as in the first SIDVAL line, not 's/abcdef/ABCDEF/g'. The s form only replaces the literal string abcdef, while y maps each letter individually.
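
The five chained sed expressions that build the date string can also be written as one substitution with capture groups. This is only a sketch: to_date_string is a hypothetical name, and the date calls assume GNU date (the -d option is not in POSIX). The -u flag reads the timestamp as UTC, which reproduces the start_time of 1284098400 from the sample output; the +0100 offset is dropped, as in the script above.

```shell
# Hypothetical helper: reformat YYYYMMDDHHMMSS as "YYYY-MM-DD HH:MM:SS".
to_date_string() {
    echo "$1" | sed 's/^\(....\)\(..\)\(..\)\(..\)\(..\)\(..\)$/\1-\2-\3 \4:\5:\6/'
}

START_STRING=$(to_date_string 20100910060000)
STOP_STRING=$(to_date_string 20100910061000)

# Assumes GNU date; -u interprets the strings as UTC.
START_UNIX=$(date -u -d "$START_STRING" +%s)
STOP_UNIX=$(date -u -d "$STOP_STRING" +%s)
DURATION=$((STOP_UNIX - START_UNIX))

echo "<time start_time=\"$START_UNIX\" duration=\"$DURATION\"/>"
```

Without -u, date uses the local timezone of the box, so the same input can give a different start_time on different machines.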
# 5  
Old 09-12-2010
I think it's best to use Perl and XML::Simple for that. Awk has limitations with data like this, which is not line-oriented and really needs a proper parser.

---------- Post updated at 04:43 PM ---------- Previous update was at 04:42 PM ----------

I'm not sure, but maybe this can also help: xmlsh
# 6  
Old 09-12-2010
Thanks for the responses, guys. Basically I need a shell script (sh) because a) that's all I know and b) I think it is the only thing available on the box I'll be running this on. I also have no idea how to install any new languages on the box.

I've managed to get quite good performance (less than an hour) by:
1) Extracting all the information from the source file into a 'raw' format using multiple sed commands. The raw format is something like event_num~channel~startdate~enddate~title~desc
2) Then using the usual read-file method to read each line and output it in the XML format. I need the read-file method to a) convert startdate/enddate into Unix time, which I can't get to work in sed itself (i.e. taking a sed variable and passing it to the date function), and b) look up oid, tsid and sid in the external file.
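
Step 1 might look something like the following single sed pass. It is a sketch under assumptions, not the poster's script: it expects each programme, title, desc and closing tag on its own line, as in the sample input, and xml_to_raw is a hypothetical name. The event number is added by a small awk stage, counting from 0.

```shell
# Hypothetical helper: read the source XML on stdin and print one raw line
# per programme: event_num~channel~startdate~enddate~title~desc
xml_to_raw() {
    sed -n '
    /<programme /{
        s/.*start="\([0-9]*\)[^"]*" stop="\([0-9]*\)[^"]*" channel="\([^"]*\)".*/\3~\1~\2/
        h
    }
    /<title/{
        s/.*<title[^>]*>\(.*\)<\/title>.*/\1/
        H
    }
    /<desc/{
        s/.*<desc[^>]*>\(.*\)<\/desc>.*/\1/
        H
    }
    /<\/programme>/{
        g
        s/\n/~/g
        p
    }' | awk '{ print NR - 1 "~" $0 }'
}
```

The hold space collects the fields of one programme (h starts a new record, H appends to it), and the closing tag joins them with ~ and prints the line. A title or description that itself contains ~ would break the raw format.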

I don't have the script handy, otherwise I would post it here. If you have any suggestions on how I could improve my sed statements to do a) and b) above, that would make it even faster.
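
One possible way to do a) and b) without a shell read loop is a single awk pass, which avoids starting date and grep once per programme. The sketch below makes several assumptions: awk is available on the box, timestamps are read as UTC (this matches the start_time in the sample output), every event gets its own service element, and titles contain no characters that need XML escaping. The name raw_to_xml is hypothetical.

```shell
# Hypothetical helper, not from the thread.
# $1 = lookup file (oid:tsid:sid=channel)
# $2 = raw file (event_num~channel~startdate~enddate~title~desc)
raw_to_xml() {
    awk -F'~' '
    # Left-pad a hex string with zeros to four characters.
    function pad4(s) { while (length(s) < 4) s = "0" s; return s }

    # Convert YYYYMMDDHHMMSS to epoch seconds, reading the timestamp as UTC.
    function epoch(ts,    y, m, d, days) {
        y = substr(ts, 1, 4) + 0; m = substr(ts, 5, 2) + 0; d = substr(ts, 7, 2) + 0
        if (m <= 2) { y--; m += 12 }   # count January and February as months 13 and 14
        days = 365 * y + int(y / 4) - int(y / 100) + int(y / 400) + int((153 * (m - 3) + 2) / 5) + d - 719469
        return days * 86400 + substr(ts, 9, 2) * 3600 + substr(ts, 11, 2) * 60 + substr(ts, 13, 2)
    }

    # First file: remember the service line for every channel.
    NR == FNR {
        split($0, kv, "=")
        split(kv[1], id, ":")
        svc[kv[2]] = sprintf("<service oid=\"%s\" tsid=\"%s\" sid=\"%s\">", pad4(id[1]), pad4(id[2]), pad4(id[3]))
        next
    }

    # Second file: one service/event block per raw line with a known channel.
    ($2 in svc) {
        start = epoch($3)
        print svc[$2]
        printf "<event id=\"%s\">\n", $1
        printf "<name lang=\"OFF\" string=\"%s\"/>\n", $5
        printf "<text lang=\"OFF\" string=\"%s\"/>\n", $6
        printf "<time start_time=\"%d\" duration=\"%d\"/>\n", start, epoch($4) - start
        print "</event>"
        print "</service>"
    }
    ' "$1" "$2"
}
```

Usage would be along the lines of raw_to_xml /tmp/oidtsidsid.txt raw.txt > output.xml, where raw.txt stands for whatever file holds the raw lines. Raw lines whose channel is missing from the lookup file are skipped silently.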
# 7  
Old 09-12-2010
If you look at the script I posted you'll see it already does the unix time conversion.

Look at the very top and you'll see the two required filenames in the /tmp directory.

Copy your XML file and the oid/tsid/sid values file to those names.

Run the script.

You'll get the service oid line, the start time and the duration as output.

Runtime is about 3 seconds.