Converting XML to CSV


Login or Register for Dates, Times and to Reply

 
Thread Tools Search this Thread
# 1  
Converting XML to CSV

Hello,


For i while i have been using XMLStarlet to convert several XML files to CSV files. So far this always went fine.


Today however i got a new XML format however but i cannot find out how to get the data i need.


Below is part of the code where it shows the different format. What i expect is the following:


Code:
idpurchaseorder;idsupplier;purchaseorderid;idproduct;productcode;price;amount;amountreceived

Code:
<searchresults>
  <success>true</success>
  <rate-limit-remaining>498</rate-limit-remaining>
  <data>
    <idpurchaseorder>373251</idpurchaseorder>
    <idsupplier>42919</idsupplier>
    <idwarehouse>4863</idwarehouse>
    <idtemplate>3994</idtemplate>
    <purchaseorderid>PO2019-1008</purchaseorderid>
    <supplier_orderid>09-09-2019</supplier_orderid>
    <supplier_name></supplier_name>
    <status>purchased</status>
    <remarks></remarks>
    <delivery_date>2019-09-14</delivery_date>
    <language>nl</language>
    <purchased_by_iduser>6407</purchased_by_iduser>
    <purchased_at>2019-09-10 09:31:37</purchased_at>
    <completed_by_iduser></completed_by_iduser>
    <completed_at></completed_at>
    <created_by_iduser>6414</created_by_iduser>
    <created>2019-09-09 12:41:44</created>
    <updated>2019-09-10 09:31:37</updated>
    <products>
      <idpurchaseorder_product>5275880</idpurchaseorder_product>
      <idproduct>10307107</idproduct>
      <idvatgroup>9520</idvatgroup>
      <productcode>P00019664</productcode>
      <productcode_supplier>0389450893456</productcode_supplier>
      <name>Unold 78856</name>
      <price>99</price>
      <amount>1</amount>
      <amountreceived>0</amountreceived>
      <delivery_date></delivery_date>
      <weight>0</weight>
    </products>
  </data>
  <data>
    <idpurchaseorder>373062</idpurchaseorder>
    <idsupplier>42561</idsupplier>
    <idwarehouse>4863</idwarehouse>
    <idtemplate>3994</idtemplate>
    <purchaseorderid>PO2019-1007</purchaseorderid>
    <supplier_orderid>09-09-2019</supplier_orderid>
    <supplier_name></supplier_name>
    <status>purchased</status>
    <remarks></remarks>
    <delivery_date>2019-09-13</delivery_date>
    <language>en</language>
    <purchased_by_iduser>6414</purchased_by_iduser>
    <purchased_at>2019-09-09 10:46:22</purchased_at>
    <completed_by_iduser></completed_by_iduser>
    <completed_at></completed_at>
    <created_by_iduser>6414</created_by_iduser>
    <created>2019-09-09 10:46:18</created>
    <updated>2019-09-09 10:46:22</updated>
    <products>
      <idpurchaseorder_product>5272744</idpurchaseorder_product>
      <idproduct>10304708</idproduct>
      <idvatgroup>9520</idvatgroup>
      <productcode>P00172393</productcode>
      <productcode_supplier>34556WDR09</productcode_supplier>
      <name>Product 2</name>
      <price>999</price>
      <amount>1</amount>
      <amountreceived>0</amountreceived>
      <delivery_date></delivery_date>
      <weight>0</weight>
    </products>
    <products>
      <idpurchaseorder_product>5272745</idpurchaseorder_product>
      <idproduct>10304135</idproduct>
      <idvatgroup>9520</idvatgroup>
      <productcode>P00020201</productcode>
      <productcode_supplier>345345074</productcode_supplier>
      <name>Productname</name>
      <price>99</price>
      <amount>1</amount>
      <amountreceived>0</amountreceived>
      <delivery_date></delivery_date>
      <weight>0</weight>
    </products>
  </data>
 </searchresults>

What i tried are the following with the results of (part of all the needed columns to keep it easier to read) the above piece.


Code:
xmlstarlet sel -t -m //searchresults/data -v "concat(idpurchaseorder,';',products/idproduct)" -n purchaseorders.xml

this above gave me:
Code:
373251;10307107
373062;10304708 <-- here should be a second line with the second column different id

then i tried the following line:


Code:
xmlstarlet sel -t -m //searchresults/data -v idpurchaseorder -o " ; " -v products/idproduct -n purchaseorders.xml

and this gave me the output below

Code:
373251 ; 10307107
373062 ; 10304708
10304135 <-- here is the idpurchaseorder missing from the first column but it does have both products listed

What i expect would be the following result from the testdata (i can add the other needed data myself if needed) above:

Code:
373251;10307107
373062;10304708
373062;10304135

Does anyone know the correct way of getting the data requested? If it is a other way then XMLStarlet it is fine although that is the easiest so far for me.

Last edited by SDohmen; 5 Days Ago at 03:33 PM..
These 2 Users Gave Thanks to SDohmen For This Post:
# 2  
Hi SDohmen,

some additional bits of Knowledge about XPath will lead you forward towards your solution. Especially try to get information about how to figure out "parent nodes" of your target nodes.

One possible algorithm is to not iterate over the different orders(data), but iterate over the different products(products) and get a parent node(idpurchaseorder) from there.

Regards,
stomp.

P. S.: @Mod or @SDohmen: please fix the typo in the xml above: <name>Productname/name>

P. P. S.: I did not know the solution before the question was asked, and - according to Neos request - I will not take the fun and chance to grow from the poster to figuring it out for him-/herself. (The solution is about the same in size as the current shown attempts).

P. P. P. S: +1 Point for SDohmen using a suitable tool for the task. ;-)

Last edited by stomp; 5 Days Ago at 03:48 PM..
These 2 Users Gave Thanks to stomp For This Post:
# 3  
Quote:
Originally Posted by stomp
Hi SDohmen,

some additional bits of Knowledge about XPath will lead you forward towards your solution. Especially try to get information about how to figure out "parent nodes" of your target nodes.

One possible algorithm is to not iterate over the different orders(idpurchaseorder), but iterate over the different products(idproduct) and get the parent node(idpurchaseorder) from there.

Regards,
stomp.

P. S.: @Mod or @SDohmen: please fix the typo in the xml above: <name>Productname/name>

P. P. S.: I did not know the solution before the question was asked, and - according to Neos request - I will not take the fun and potential for growth from the poster to figuring it out for his-/herself. (The solution is about the same in size as the current shown attempts).

P. P. P. S: +1 Point for SDohmen using the right tool for the right task. ;-)

Thank you for the comment. I edited the text to have the xml correctly displayed.


I think i know what you mean by selecting the subnodes first but i have no clue how to get the rest filled in.



When i change the line to the one below:
Code:
xmlstarlet sel -t -m "/searchresults/data" -v "products/idproduct" -o " ; " -v idpurchaseorder -n purchaseorders.xml

i am getting the text as above except switched around so that was not what you mean i guess.


Do you have any more clues what to look for?




i did search around some more and found the link below:
Code:
https://stackoverflow.com/questions/53550425/extracting-xml-child-node-values-based-on-a-parent

This looks a bit like it but i am not sure if that is the correct direction.
# 4  
What did I do to find the solution?
  • Read some Tutorials about XPath
  • Reached out for xmlstarlet documentation
  • Used google a lot on the subject, learned from many examples(stackoverflow delivers many of those)
  • experimented with what I found, began sometimes with most simple things to understand small bits and rose higher to the full complexity of whole task

Last edited by stomp; 4 Days Ago at 03:16 AM..
This User Gave Thanks to stomp For This Post:
# 5  
A little help for what you've nearly got yourself:

Code:
xmlstarlet sel -t -m //searchresults/data/products -v "concat(';',idproduct)" -n purchaseorders.xml

This will give you the ordered products, like this:

Code:
;10307107
;10304708
;10304135

Now you just have to fill in the XPath-Expression for the Parent node - the ordernumber - into the concat statement...

I'm reading a bit about XPath now. It's really interesting, what is possible with that:

XPath Syntax

XPath Axes

Last edited by stomp; 3 Days Ago at 06:19 PM..
# 6  
SDohmen, I don't generally thank people for posting questions.

However you've posted good example data and a clear and concise question with a reasonable attempt at solving it yourself. Also the tool your using is the right one for the job, so many XML parsing questions posted here want to use sed / grep / awk which will usually fail given a different XML layout.

Interesting to note that if you start from the parent level, which would have been my first inclination, you only get the first product under that parent:

Code:
xmlstarlet sel -t -m //searchresults/data -v "concat(';',products/idproduct)" -n purchaseorders.xml
;10307107
;10304708

Login or Register for Dates, Times and to Reply

Previous Thread | Next Thread
Thread Tools Search this Thread
Search this Thread:
Advanced Search

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Using awk for converting xml to txt

Hi, I have a xml script, I converted it to .txt with values comma seperated using awk function. But I want the output values should be inside double quotes My xml script (Workorders.xml) is shown like below: <?xml version="1.0" encoding="utf-8" ?> <scbm-extract version="3.3">... (8 Replies)
Discussion started by: Viswanatheee55
8 Replies

2. UNIX for Dummies Questions & Answers

Need help converting txt to XML

I have a table as following Archive id Line Author Time Text 1fjj34 3 75jk5l 03:20 this is an evidence regarding ... 1fjj34 4 gjhhtrd 03:21 we have seen those documents before 1fjj34 10 645jmdvvb 04:00 Will you consider such an offer?... (0 Replies)
Discussion started by: A-V
0 Replies

3. Shell Programming and Scripting

Help with converting XML to Flat file

Hi Friends, I want to convert a XML file to flat file. Sample I/p: <?xml version='1.0' encoding='UTF-8' ?> <DataFile xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance' contactCount='4999' date='2012-04-14' time='22:00:14' xsi:noNamespaceSchemaLocation='gen .xsd'> <Contact... (3 Replies)
Discussion started by: karumudi7
3 Replies

4. Shell Programming and Scripting

converting specific XML file to CSV

Hi, i would convert the following XML file : <?xml version="1.0" encoding="UTF-8" ?> <files xmlns="http://www.lotus.com/dxl/console"> <filedata notesversion="6" odsversion="43" logged="yes" backup="no" id="C12577E6:004B0DA3" iid="C12577E6:004B0DA8" link="1" dboptions="0,524288,0,0"> ... (24 Replies)
Discussion started by: Nicol
24 Replies

5. Shell Programming and Scripting

Converting a flat file in XML

Hello Friends, I am new to UNIX shell scripting. Using bash....Could you please help me in converting a flat file into an XML style output file. Flat file: (Input File entries looks like this) John Miller: 617-569-7996:15 Bunting lane, staten Island, NY: 10/21/79: 60600 The... (4 Replies)
Discussion started by: humkhn
4 Replies

6. UNIX for Dummies Questions & Answers

Converting spaces to csv

I'm new to shell scripting and I have a file with positional columns. The rows look like this: 222 3333 44444 55 55 55 55 5555 59999 222 3333 44444 55 55 55 55 5555 59999 Not every row has the same number of spaces between the seven columns. They can range anywhere... (11 Replies)
Discussion started by: jkandel
11 Replies

7. UNIX for Dummies Questions & Answers

Converting csv to tsv and back

Anyone have a simple UNIX script to look at for doing this? I'm interested in how to convert from csv to tsv, and then back again, if I want to. Thanks! (13 Replies)
Discussion started by: doubleminus
13 Replies

8. UNIX for Dummies Questions & Answers

Converting HTML to CSV

Hi, I need to convert a relatively large html file (1.5megs) into CSV under Unix. How would I be able to do this? Much thanks. (3 Replies)
Discussion started by: Jexel
3 Replies

9. Shell Programming and Scripting

converting text to csv format

I am trying to check each line and based on first two digits, the comma needs to be place. I checked in the earlier post where the text is converted to csv with a tab delimited. Here is the test file that needs to be changed to csv 11 051701 22 051701 330123405170105170112345... (13 Replies)
Discussion started by: gthokala
13 Replies

10. Shell Programming and Scripting

Converting csv to xls

Hi, Can anyone tell the option to change the file type in unix. i.e. if a file is in csv(Comma Separating Values) format, it should be changed to xls(ordinary MS-Excel) format. But renaming command is not changing to correct file format. Thanks in advance, Milton. (1 Reply)
Discussion started by: miltony
1 Replies

Featured Tech Videos