Shell Programming and Scripting: Post 302210167 by quixoticking11 on Monday 30th of June 2008 08:49:08 AM
Need some help with parsing

I have a large XML file with very little formatting. It contains over 600 messages, and I need to break each message out into its own separate file.
Somewhere in the middle, the XML file looks something like this:

</Title></Msg><Msg><Opener> Hello how
are you?<Title> Some says hello</Title><Body>
This is a test to see how everything is
going. I need your help.</Body></Msg><Msg>
<Open1> An opening.</Open1><Title> Trying
something new.</Title><Report>124555ABC
</Report><Body> Another test for me.</Body>
<PS> I need to figure this out.</PS></Msg>
<Msg> etc........ etc... etc..
.......etc. .......

Some caveats:

1. The messages always start with <Msg>
2. The messages always end with </Msg>
3. The <Msg> could be at the beginning, middle or end of a line.
4. The </Msg> could be at the beginning, middle or end of a line.
5. There can be a different number of tags on a line, e.g. <Title><Body>, etc.
6. A message could be one to 100+ lines long.

Any suggestions on breaking each message from this XML file out into its own file? Any sed/awk/nawk shell functions or statements would be appreciated.
In the end, there are 600+ messages, so there should be 600+ files.

Thank you.
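
One possible approach, given the caveats above, is to let awk scan each line for the literal strings <Msg> and </Msg> and collect everything between them, however many lines that spans. The sketch below is a minimal version under some assumptions that are not stated in the post: the tags appear exactly as <Msg> and </Msg> (no attributes, no different case), messages are never nested, and text outside a message can be thrown away. The function name split_msgs and the output names msg_0001.xml, msg_0002.xml, ... are made up for illustration. A final message with no closing </Msg> is silently dropped.

```shell
# split_msgs (hypothetical name): write each <Msg>...</Msg> block to its own file.
# Output files msg_0001.xml, msg_0002.xml, ... are created in the current directory.
split_msgs() {
    awk '
    {
        line = $0
        for (;;) {
            if (!inmsg) {
                i = index(line, "<Msg>")
                if (i == 0) next             # no message starts on the rest of this line
                line = substr(line, i)       # discard anything before <Msg>
                inmsg = 1
            }
            j = index(line, "</Msg>")
            if (j == 0) {                    # message continues on the next line
                buf = buf line "\n"
                next
            }
            buf = buf substr(line, 1, j + 5) # keep text up to and including </Msg>
            n++
            out = sprintf("msg_%04d.xml", n)
            printf "%s\n", buf > out
            close(out)                       # avoid running out of open files
            buf = ""
            inmsg = 0
            line = substr(line, j + 6)       # keep scanning the rest of the line
        }
    }' "$@"
}
```

With the function defined, something like split_msgs big.xml (big.xml standing in for the real file name) should leave one file per message. Using index() rather than a regular expression keeps the matching literal, and the close() call matters here because some awk implementations cap the number of simultaneously open files well below 600.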
 
