Help with parsing xml file


 
# 1  
Old 12-04-2015

Hi,
I need help parsing XML data on Unix and placing it in a CSV file. My XML file looks like this:

Code:
<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<iwgroups>
    <nextid value="128">
    </nextid>
    <iwgroup name="RXapproval" id="124" display-name="RXapproval" description="group for RX approval ">
        <user name="iwov">
        </user>
        <user name="m161595">
        </user>
        <user name="m594670">
        </user>
        <work name="iwov">
        </work>
    </iwgroup>
    <iwgroup name="TEXT_EMAIL" id="113" display-name="" description="TEXT email group with permission to start text pool workflow">
        <user name="iwov">
        </user>
        <user name="m123053">
        </user>
        <user name="m270857">
        </user>
        <user name="m363836">
       </user>

I need to write the output to a CSV file in the format below:

Code:
                 Groupname,          Description,                                                                                Users
RXapproval,        group for RX approval,                                                         iwov m161595 m594670
TEXT_EMAIL,  TEXT email group with permission to start text pool workflow,    iwov m123053 m270857 m363836

Please let me know the best way to do this.

Thanks
Jay
Moderator's Comments:
Please use CODE tags for sample input, sample output (including output formats), and code.

Last edited by Don Cragun; 12-04-2015 at 08:35 PM. Reason: Add CODE tags.
# 2  
Old 12-05-2015
You have asked for our help fourteen times. What have you tried to solve this problem?

An XML file usually has matching open and close tags. The sample input you have provided has one opening iwgroups tag and no matching closing tag. The sample input you have provided has two opening iwgroup tags, but only one closing tag. It would seem that the closing iwgroup tag would be the trigger that should terminate processing of an XML input segment and produce an output line in your desired CSV file. How can we be expected to suggest code if you can't give us consistent sample input and output?
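
A quick way to spot this kind of mismatch before attempting any conversion is to count the opening and closing iwgroup tags. The following is only a sketch: the function name check_iwgroup_balance is made up for illustration, and it assumes at most one tag per line, as in the samples in this thread.

```shell
# check_iwgroup_balance FILE
# Counts lines with an opening <iwgroup ...> tag and lines with a closing
# </iwgroup> tag, prints both counts, and returns non-zero when they differ.
# Assumption: at most one tag per line (true for the samples in this thread).
check_iwgroup_balance() {
    awk '
    /<iwgroup /   { opened++ }   # the trailing space keeps <iwgroups> from matching
    /<\/iwgroup>/ { closed++ }
    END {
        printf "open=%d close=%d\n", opened, closed
        exit (opened == closed ? 0 : 1)
    }' "$1"
}
```

Run against the sample in post #1, this should report open=2 close=1 and return a non-zero status, so it can be used as `check_iwgroup_balance file || echo "unbalanced iwgroup tags"`.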

A CSV file (with comma as the character separating values) doesn't usually have LOTS of leading and/or trailing spaces in the data. Why are there dozens of seemingly extraneous spaces in your desired sample output?

A quoted field in an XML file usually specifies that the data between the quotes should be the contents of the output field you want in your CSV file. The first description field in your XML file is:
Code:
description="group for RX approval "

(note the trailing <space> after approval), but the output you say you want for that field is:
Code:
        group for RX approval

(note the eight leading <space> characters that are not present in the XML file and the missing trailing space after approval).

If you want our help on this you need to very clearly describe the output format you are trying to produce; not just show us a desired CSV file that does not seem to match the sample XML file you have provided.

Please help us help you!
# 3  
Old 12-05-2015
Hi Don,
Sorry for the confusion. Please find my input below:

Code:
<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<iwgroups>
    <nextid value="128">
    </nextid>
    <iwgroup name="AEapproval" id="124" display-name="AEapproval" description="group for AE approval">
        <user name="iwov">
        </user>
        <user name="m161595">
        </user>
        <user name="m594670">
        </user>
        <user name="m803051">
        </user>
    </iwgroup>
    <iwgroup name="OES_EMAIL" id="113" display-name="" description="OES email group with permission to start oes pool workflow">
        <user name="iwov">
        </user>
        <user name="m123053">
        </user>
        <user name="m270857">
        </user>
        <user name="m363836">
        </user>
     </iwgroup>
</iwgroups>

I just need a report with the following fields:
Code:
Groupname1:AEapproval
Description: group for AE approval
Users: iwov, m161595, m594670,m803051

Groupname2:OES_EMAIL
Description: OES email group with permission to start oes pool workflow
Users: iwov,m123053,m270857,m363836

The output does not have to be a CSV file; a text file in the above format will also do. I hope I am clear this time.

Thanks
Ajay
# 4  
Old 12-05-2015
A CSV file output is easy to do; it just doesn't look anything like you said you wanted in your original post.
Code:
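awk '
BEGIN {
	# Print the header row and use a comma between output fields.
	print "Groupname,Description,Users"
	OFS = ","
}
# Opening tag: extract the name and description attribute values.
/<iwgroup / {
	groupname = description = $0
	sub(/^.* name="/, "", groupname)
	sub(/".*/, "", groupname)
	sub(/^.* description="/, "", description)
	sub(/".*/, "", description)
}
# User line: strip everything except the quoted name and append it.
/<user / {
	gsub(/^.[^"]*"|"[^"]*$/, "")
	users = users == "" ? $0 : users " " $0
}
# Closing tag: emit one CSV row and reset the user list.
/<\/iwgroup>/ {
	print groupname, description, users
	users = ""
}' file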

producing:
Code:
Groupname,Description,Users
AEapproval,group for AE approval,iwov m161595 m594670 m803051
OES_EMAIL,OES email group with permission to start oes pool workflow,iwov m123053 m270857 m363836

with your latest sample input.

If you want to try this on a Solaris/SunOS system, change awk to /usr/xpg4/bin/awk or nawk.
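
One thing to watch for with CSV output: if a description ever contains a comma, the unquoted row will appear to have too many fields. The variant below is a sketch of one way to handle that, by wrapping such fields in double quotes and doubling any embedded quotes (the usual CSV convention). The names iwgroups_to_quoted_csv and csv are made up for illustration; the parsing rules are the same as in the script above and make the same assumption of one tag per line.

```shell
# iwgroups_to_quoted_csv FILE
# Same parsing as the awk script above, but fields containing a comma or a
# double quote are wrapped in double quotes so the CSV stays well formed.
iwgroups_to_quoted_csv() {
    awk '
    # Quote a field only when it needs it; embedded quotes are doubled.
    function csv(s) {
        if (s ~ /[",]/) {
            gsub(/"/, "\"\"", s)
            return "\"" s "\""
        }
        return s
    }
    BEGIN {
        print "Groupname,Description,Users"
        OFS = ","
    }
    /<iwgroup / {
        groupname = description = $0
        sub(/^.* name="/, "", groupname)
        sub(/".*/, "", groupname)
        sub(/^.* description="/, "", description)
        sub(/".*/, "", description)
    }
    /<user / {
        gsub(/^.[^"]*"|"[^"]*$/, "")
        users = users == "" ? $0 : users " " $0
    }
    /<\/iwgroup>/ {
        print csv(groupname), csv(description), csv(users)
        users = ""
    }' "$1"
}
```

Call it as `iwgroups_to_quoted_csv file > groups.csv`. With the sample input from post #3 nothing needs quoting, so the output should be the same as above. On Solaris/SunOS the same note about /usr/xpg4/bin/awk or nawk applies.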
# 5  
Old 12-06-2015
Thanks Don. It works exactly as per my requirement.

Thanks
Ajay
# 6  
Old 12-06-2015
If you would like to have it formatted as in post #3, the following is a possible Perl solution:

Code:
#!/usr/bin/env perl
use strict;
use warnings;

my @group = ();
my $groupnum = 0;

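while (<>) {
    # Opening tag: capture the name and description attributes.
    if (/<iwgroup name/) {
        my ($name, $description) = /\bname="([^"]*).+\bdescription="([^"]*)/;
        $groupnum++;
        my $meta = "Groupname$groupnum: $name\nDescription: $description\n";
        # $group[0] holds the header text, $group[1] the list of users
        # (starting empty so a group without users still prints cleanly).
        @group = ([$meta], []);
        next;
    }
    # Collect users only while inside an iwgroup element.
    if (@group and /<user name="([^"]*)/) {
        push @{$group[1]}, $1;
    }
    # Closing tag: print the report for this group and reset.
    if (@group and /<\/iwgroup>/) {
        print "\n" unless $groupnum == 1;
        print "$group[0]->[0]Users: ", join(", ", @{$group[1]}), "\n";
        @group = ();
    }
}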

Save as report.pl
Run as perl report.pl ajayakunuri.file > ajayakunuri.report

Code:
perl report.pl ajayakunuri.file
Groupname1: AEapproval
Description: group for AE approval
Users: iwov, m161595, m594670, m803051

Groupname2: OES_EMAIL
Description: OES email group with permission to start oes pool workflow
Users: iwov, m123053, m270857, m363836
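
For anyone who would rather stay with awk, the same report layout can be sketched by changing only the output part of the awk approach from post #4. This is an illustration rather than a tested replacement for the Perl script; the function name iwgroups_report is made up, and it assumes one tag per line as in the sample input.

```shell
# iwgroups_report FILE
# Prints each iwgroup as a three-line block (Groupname<N>, Description, Users),
# with a blank line between blocks, in the layout requested in post #3.
iwgroups_report() {
    awk '
    /<iwgroup / {
        name = description = $0
        sub(/^.* name="/, "", name)
        sub(/".*/, "", name)
        sub(/^.* description="/, "", description)
        sub(/".*/, "", description)
        users = ""
    }
    /<user / {
        gsub(/^.[^"]*"|"[^"]*$/, "")
        users = users == "" ? $0 : users ", " $0
    }
    /<\/iwgroup>/ {
        if (n++) print ""          # blank line before every group but the first
        print "Groupname" n ": " name
        print "Description: " description
        print "Users: " users
    }' "$1"
}
```

Run it as `iwgroups_report ajayakunuri.file > ajayakunuri.report`.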

# 7  
Old 12-08-2015
Thanks Aia for your help.

Thanks
Ajay