Parse a string in XML file using shell script

# 15  
Old 11-14-2007
Hi Matrixmadhan,

It's working! You're really great! I've been looking for a long time for a script that does this, and the solution you provided works. Thanks so much! If it's not too much trouble, could you please show me how to modify your script so that the output looks like the one below? Thanks in advance!

expected output:

date time chdate chtime status calling cparty
20071009 12:45:36 20071009 12:45:43 201 644 xxxxxxx
20071010 03:09:13 20071010 03:10:07 29 644 xxxxxxx
# 16  
Old 11-14-2007
based on your input,

input
Code:
>cat a
<?xml version="1.0"?><message><cdr version="1.0"><appid>testbed</appid><threadid>6</threadid><origin>node1</origin><date>20071009</date><time>12:45:36</time><chdate>20071009</chdate><chtime>12:45:43</chtime><status>201</status><type>103</type><calling>644</calling><cparty>xxxxxxx</cparty><accnum>xxxxxx</accnum><debirate1>0.0</debirate1><cos>-1</cos><strtbal>0.0</strtbal><freesms>0</freesms><tuc>0</tuc><fandftype></fandftype></cdr></message>
<?xml version="1.0"?><message><cdr version="1.0"><appid>testbed</appid><threadid>6</threadid><origin>node1</origin><date>20071009</date><time>12:45:36</time><chdate>20071009</chdate><chtime>12:45:43</chtime><status>201</status><type>103</type><calling>644</calling><cparty>xxxxxxx</cparty><accnum>xxxxxx</accnum><debirate1>0.0</debirate1><cos>-1</cos><strtbal>0.0</strtbal><freesms>0</freesms><tuc>0</tuc><fandftype></fandftype></cdr></message>

script
Code:
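#! /opt/third-party/bin/perl
use strict;
use warnings;

# Input file "a" holds one XML record per line.
open(FILE, "<", "a") or die "Cannot open a: $!";

my $data = "";   # values of the first record, printed after the header

while(<FILE>) {
  chomp;
  # Split on "><" so each element becomes "name>value</name".
  my @arr = split(/></);
  foreach (@arr) {
    # Keep only pieces that carry both a tag name and a value.
    if( />/ && /</ ) {
      if( $. == 1 ) {
        # First line: print the tag name now, save the value for later.
        s/(.*)>(.*)<.*$/$1|$2/;
        my($tmp1, $tmp2) = split(/\|/);
        $data .= (" " . $tmp2);
        printf "%s ", $tmp1;
      }
      else {
        # Later lines: print the value only.
        s/(.*)>(.*)<.*$/$2/;
        printf "%s ", $_;
      }
    }
  }
  print "\n";
  print "$data\n" if( $. == 1 );
}

close(FILE);

exit 0;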

output
Code:
appid threadid origin date time chdate chtime status type calling cparty accnum debirate1 cos strtbal freesms tuc
 testbed 6 node1 20071009 12:45:36 20071009 12:45:43 201 103 644 xxxxxxx xxxxxx 0.0 -1 0.0 0 0
testbed 6 node1 20071009 12:45:36 20071009 12:45:43 201 103 644 xxxxxxx xxxxxx 0.0 -1 0.0 0 0

# 17  
Old 11-15-2007
Hi Matrixmadhan,

Thanks for taking the time to help me with my problem. I tried the solution you provided, but the result is different. Can we have just one heading, as in the expected output below? Also, could you explain what the script does? Thanks a lot! I really appreciate all your help!

expected output:
date time chdate chtime status calling cparty
20071009 12:45:36 20071009 12:45:43 201 644 xxxxxxx
20071010 03:09:13 20071010 03:10:07 29 644 xxxxxxx


output of the script you've provided:
date time chdate chtime status calling cparty date time chdate chtime status calling cparty 20071009 12:45:36 20071009 12:45:43 201 644 xxxxxxx 20071010 03:09:13 20071010 03:10:07 29 644 xxxxxxx
# 18  
Old 11-15-2007
I checked it again and it's working as expected.

Maybe the input format we each used is slightly different, or there is some bug in the script?

Could you please post the input file that you used (the one with 2 records)?

I could take a look again.
# 19  
Old 11-15-2007
Hi Matrixmadhan,

It works for the sample that you provided. But my actual input is a file of more than 5 MB, and when I run the script on it, the output is different, as I mentioned in my previous post. I have attached only a portion of the file, since the whole file is too large to send. Thanks again!
# 20  
Old 11-15-2007
Quote:
Also if you can explain what the script does. Thanks a lot!
Before explaining the script: it was written on the fly, so it's definitely not optimized.

Code:
open(FILE, "<", "a") or die "Cannot open a: $!";

Open the file "a" for reading, and stop with an error message if that fails.

Code:
while(<FILE>) {
  chomp;
  my @arr = split(/></);

Split each input record on the delimiter '><' and store the pieces in the array '@arr'.
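
To see what this split produces, here is a minimal sketch on a shortened record. The sample string is an assumption, trimmed from the thread's input to keep the output readable.

```perl
use strict;
use warnings;

# Assumed sample: a record trimmed down to two fields.
my $line = '<cdr version="1.0"><date>20071009</date><time>12:45:36</time></cdr>';

# Each "><" boundary between two adjacent tags is consumed by the split.
my @arr = split(/></, $line);

print "$_\n" for @arr;
# Prints:
#   <cdr version="1.0"
#   date>20071009</date
#   time>12:45:36</time
#   /cdr>
```

Each field ends up in the form name>value</name, which is the shape the later steps rely on.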

Code:
foreach (@arr) {
    if( />/ && /</ ) {

Iterate through the array and process a piece only when it contains both '>' and '<'. Those are the only pieces that hold a tag name together with its value, which is the data we are interested in.
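
A small sketch of that filter, using hand-written pieces of the kind a split on '><' yields (the list is illustrative). Note that an empty element such as <fandftype></fandftype> splits into 'fandftype' and '/fandftype', neither of which passes the test, which is why that column does not appear in the output shown earlier.

```perl
use strict;
use warnings;

# Illustrative pieces, as produced by splitting a record on "><".
my @pieces = (
    '<?xml version="1.0"?',   # has "<" but no ">"
    'message',                # has neither
    'date>20071009</date',    # has both: a tag name and a value
    'fandftype',              # empty element, opening half
    '/fandftype',             # empty element, closing half
    '/message>',              # has ">" but no "<"
);

# Keep only the pieces that contain both characters.
my @kept = grep { />/ && /</ } @pieces;

print "$_\n" for @kept;
# Prints:
#   date>20071009</date
```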

Code:
if( $. == 1 ) {
        s/(.*)>(.*)<.*$/$1|$2/;
        my($tmp1, $tmp2) = split(/\|/);
        $data .= (" " . $tmp2);
        printf "%s ", $tmp1;
      }

If it's the first line, the header has to be printed; this is not done for subsequent XML records. The substitution groups the input into two blocks, the tag name ('$1') and the value ('$2'), and joins them with '|'.
The tag name is then printed right away as part of the header, and the value is appended to the variable '$data'.
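
A minimal sketch of the substitution on a single piece (the variable names are illustrative). In the replacement part, $1 and $2 are the idiomatic spelling of the two captured groups; \1 and \2 also work there but draw a warning under 'use warnings'. The sketch also shows a direct capture, which avoids the intermediate '|' and so keeps working if a value itself contains a '|'.

```perl
use strict;
use warnings;

my $piece = 'date>20071009</date';

# Same pattern as the script: group 1 is the tag name, group 2 the value.
(my $pair = $piece) =~ s/(.*)>(.*)<.*$/$1|$2/;
my ($name, $value) = split(/\|/, $pair);
print "$name $value\n";      # date 20071009

# Alternative: capture both groups directly in list context.
my ($name2, $value2) = $piece =~ /^(.*)>(.*)<.*$/;
print "$name2 $value2\n";    # date 20071009
```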
Code:
 else {
        s/(.*)>(.*)<.*$/$2/;
        printf "%s ", $_;
      }
    }
  }

If it's not the first line, print only the data and not the header.
Code:
 print "\n";

This newline is needed to make sure the header and the data do not run together on one line.
Code:
 print "$data\n" if( $. == 1 );

Now, if it's the first line, print the values that were saved in '$data' (the header names were already printed inside the loop).
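
This print and the header logic both hinge on '$.', which holds the number of the line most recently read, not a record count. A small sketch, reading from an in-memory string instead of a file (the sample text is an assumption for illustration):

```perl
use strict;
use warnings;

# Assumed sample: line 1 holds two records, line 2 holds one.
my $text = "<cdr>A</cdr><cdr>B</cdr>\n<cdr>C</cdr>\n";

# Open the string as a read-only in-memory file.
open(my $fh, '<', \$text) or die "Cannot open in-memory file: $!";

my @report;
while (my $line = <$fh>) {
    # Count the <cdr> elements on this physical line.
    my $records = () = $line =~ /<cdr>/g;
    push @report, "line $. has $records record(s)";
}
close($fh);

print "$_\n" for @report;
# Prints:
#   line 1 has 2 record(s)
#   line 2 has 1 record(s)
```

So if an input file puts several records on one physical line, a test such as '$. == 1' treats all of them as part of the first line.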
Code:
}

close(FILE);

close the file.


Hope this explains the logic!
# 21  
Old 11-15-2007
Hi Matrixmadhan,

Thanks for explaining the script. I don't know why the output is not organized when I use the actual input file. What I want is one heading followed by the values. When I tried the script, the headings were repeated several times, with the values under them. Please see the details below. I hope you can help me organize the output.

expected output:
date time chdate chtime status calling cparty
20071009 12:45:36 20071009 12:45:43 201 644 xxxxxxx
20071010 03:09:13 20071010 03:10:07 29 644 xxxxxxx

output from the script when the actual file (more than 5 MB) is used:
date time chdate chtime status calling cparty date time chdate chtime status calling cparty date time chdate chtime status calling cparty date time chdate chtime status calling cparty 20071009 12:45:36 20071009 12:45:43 201 644 xxxxxxx 20071010 03:09:13 20071010 03:10:07 29 644 xxxxxxx 20071010 03:09:13 20071010 03:10:07 29 644 xxxxxxx 20071009 12:45:36 20071009 12:45:43 201 644 xxxxxxx
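
The output above, with the heading names repeated once per record and all the values following, is what line-based logic keyed on the first line would give if several records share one physical line. The thread does not confirm that the real file is laid out that way, so treat this as a guess. Below is a minimal sketch that does not depend on line breaks at all: it walks over each <cdr>...</cdr> element in a string and prints only the seven requested columns under a single heading. The helper name extract_rows, the trimmed sample string, and the assumption that field values contain no nested tags are all illustrative, not taken from the thread's script.

```perl
use strict;
use warnings;

# Columns requested in the thread, in output order.
my @fields = qw(date time chdate chtime status calling cparty);

# Hypothetical helper: returns one space-joined row per <cdr> element in $xml.
sub extract_rows {
    my ($xml) = @_;
    my @rows;
    # Match every <cdr ...>...</cdr> element, regardless of line breaks.
    while ($xml =~ m{<cdr\b[^>]*>(.*?)</cdr>}gs) {
        my $record = $1;
        my @values;
        for my $name (@fields) {
            # Text between <name> and </name>; empty string if the tag is absent.
            my ($value) = $record =~ m{<\Q$name\E>([^<]*)</\Q$name\E>};
            push @values, defined $value ? $value : '';
        }
        push @rows, join(' ', @values);
    }
    return @rows;
}

# Assumed sample: two trimmed records with no line break between them.
my $sample =
    '<message><cdr version="1.0"><appid>testbed</appid><date>20071009</date>'
  . '<time>12:45:36</time><chdate>20071009</chdate><chtime>12:45:43</chtime>'
  . '<status>201</status><type>103</type><calling>644</calling>'
  . '<cparty>xxxxxxx</cparty></cdr></message>'
  . '<message><cdr version="1.0"><appid>testbed</appid><date>20071010</date>'
  . '<time>03:09:13</time><chdate>20071010</chdate><chtime>03:10:07</chtime>'
  . '<status>29</status><type>103</type><calling>644</calling>'
  . '<cparty>xxxxxxx</cparty></cdr></message>';

print join(' ', @fields), "\n";
print "$_\n" for extract_rows($sample);
# Prints:
#   date time chdate chtime status calling cparty
#   20071009 12:45:36 20071009 12:45:43 201 644 xxxxxxx
#   20071010 03:09:13 20071010 03:10:07 29 644 xxxxxxx
```

To run it on a file, read the contents into one string first, for example with `my $xml = do { local $/; <$fh> };` on an opened filehandle, and pass that string to extract_rows.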
