xml to csv using sed and awk command


 
# 1  
Old 02-27-2012
xml to csv using sed and awk command

Hi Guys,

Can you help me create a shell script that uses sed, awk, and similar commands to generate a CSV file from an XML file?
# 2  
Old 02-27-2012
Probably, but you'll need to be more specific. XML can hold nested structures that make no sense to convert to CSV, and even for a simple structure, arbitrary XML isn't trivial to convert to a flat file.

So show the exact XML you have and the exact output you want please.
# 3  
Old 03-01-2012
Thanks for the reply, Corona688.

Sorry, I was not clear in my first post.

The test XML is below:

Code:
<JobStatus>
   <DataDate> 2011-06-01 </DataDate> 
   <StartTime>2011-11-25T00:03:28</StartTime> 
   <JobTypeCode>12002</JobTypeCode> 
   <StatusCode>0</StatusCode> 
   <JobParameterCode>103</JobParameterCode> 
   <JobParameterValue>201</JobParameterValue> 
</JobStatus>

For now, there is no multi-level (nested) XML.

I could use the Perl XML::Simple module, but I want to use only Unix commands such as sed and awk.

Thanks

Moderator's Comments:
Please use code tags. Moved from the HP-UX forum.
# 4  
Old 03-01-2012
Do the opening and closing "JobStatus" tags matter?

If not (untested):
Code:
# Strip the closing tag, then the leading "<", then split name and value on ">"
echo "$LINE" | sed 's/<\/.*$//' | sed 's/^[[:space:]]*<//' | awk -F '>' '{print $1 "," $2}'

You may want to enforce quotes around the second item, because I'm not sure that all of your data is quote-sanitized.
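
A minimal sketch of the quoting suggested above, assuming one element per line as in the sample. `quote_field` is a hypothetical helper name, not something from this thread. It wraps the value in double quotes and doubles any embedded double quote, which is the usual CSV convention; it has not been checked against data containing `>` or XML entities.

```shell
# quote_field: hypothetical helper; reads "<Tag>value</Tag>" lines on stdin
# and prints Tag,"value" with embedded double quotes doubled.
quote_field() {
    sed -e 's/<\/.*$//' -e 's/^[[:space:]]*<//' |
    awk -F '>' '{
        name = $1
        value = substr($0, length($1) + 2)   # everything after the first ">"
        gsub(/^[ \t]+|[ \t]+$/, "", value)   # trim surrounding blanks
        gsub(/"/, "\"\"", value)             # CSV-style quote escaping
        printf "%s,\"%s\"\n", name, value
    }'
}

echo '   <DataDate> 2011-06-01 </DataDate> ' | quote_field
```

Trimming the blanks here is a choice made for this sketch; drop that `gsub` line if leading and trailing spaces in the data are meaningful.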

EDIT: Forgot to end the line on first sed.

Last edited by KickstartUF; 03-01-2012 at 01:12 PM.. Reason: Code tags
# 5  
Old 03-01-2012
If the whitespace in your XML differs significantly from what you've shown (everything jammed on one line, for example), then nothing we write here will work. awk and sed are not general-purpose XML tools; parsing arbitrary XML needs a recursive parser.

Code:
$ cat jobxml.awk

BEGIN { FS="[<>]";      OFS=","         }
# Single close-tag
(NF==3) && /^[ \t]*<[/]/        {

        $0=""

        if(!TITLE)      # Print a title line
        {
                for(N=1; N<=L; N++)     $N=T[N]
                print
                TITLE=1
        }

        for(N=1; N<=L; N++)     {       $N=A[T[N]];     delete A[T[N]]  }
        print
}

$2 && $4 && ($2 == substr($4, 2)) {
        if(!T[$2]) { T[$2]=++L; T[L]=$2 }       # Save titles for later
        gsub(/^[ \t]*/, "", $3);                # Get rid of spaces in data
        gsub(/[ \t]*$/, "", $3);
        A[$2]=$3                                # Save for later
}

$ awk -f jobxml.awk data

DataDate,StartTime,JobTypeCode,StatusCode,JobParameterCode,JobParameterValue
2011-06-01,2011-11-25T00:03:28,12002,0,103,201

$

This should be able to handle multiple <JobStatus> entries in a row.
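
If the XML does arrive jammed on one line, one possible workaround is to put line breaks back between adjacent tags before running the awk script. This is only a rough sketch, assuming flat, non-nested records like the sample and no empty elements such as `<A></A>` (those would be split across two lines and then missed). `split_tags` is a hypothetical name, not part of the script above.

```shell
# split_tags: hypothetical helper; inserts a newline wherever one tag is
# directly followed by another, so each element ends up on its own line.
split_tags() {
    awk '{ gsub(/>[ \t]*</, ">\n<"); print }'
}

printf '%s\n' '<JobStatus><DataDate> 2011-06-01 </DataDate><StatusCode>0</StatusCode></JobStatus>' | split_tags
```

The result can then be piped onward, for example `split_tags < data | awk -f jobxml.awk`. It does not make awk a real XML parser; it only restores the one-element-per-line layout the script expects.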
# 6  
Old 03-01-2012
Hi sbk,

One way using perl:
Code:
$ cat infile
<JobStatus>
   <DataDate> 2011-06-01 </DataDate> 
   <StartTime>2011-11-25T00:03:28</StartTime> 
   <JobTypeCode>12002</JobTypeCode> 
   <StatusCode>0</StatusCode> 
   <JobParameterCode>103</JobParameterCode> 
   <JobParameterValue>201</JobParameterValue> 
</JobStatus>
$ cat script.pl
use warnings;
use strict;
use XML::Twig;

die qq[Usage: perl $0 <input-file>\n] unless @ARGV == 1;

die qq[ERROR: File must be plain and readable\n] unless -f $ARGV[0] && -r $ARGV[0];

my ($t, $root, @header, @data);

$t = XML::Twig->new();

$t->parsefile( shift @ARGV );
$root = $t->root;

for my $child ( $root->children ) {
        push @header, $child->gi;
        push @data, $child->text;
}

printf qq[%s\n], join qq[,], @header;
printf qq[%s\n], join qq[,], @data;

exit 0;
$ perl script.pl infile
DataDate,StartTime,JobTypeCode,StatusCode,JobParameterCode,JobParameterValue
 2011-06-01 ,2011-11-25T00:03:28,12002,0,103,201

Regards,
Birei
 