Xml to csv

# 8  
Hi.

Use this version, s2:
Code:
#!/usr/bin/env bash

# @(#) s2       Demonstrate string extraction from XML file, xml2.

# Utility functions: print-as-echo, print-line-with-visual-space, debug.
# export PATH="/usr/local/bin:/usr/bin:/bin"
LC_ALL=C ; LANG=C ; export LC_ALL LANG
pe() { for _i;do printf "%s" "$_i";done; printf "\n"; }
pl() { pe;pe "-----" ;pe "$*"; }
em() { pe "$*" >&2 ; }
db() { ( printf " db, ";for _i;do printf "%s" "$_i";done;printf "\n" ) >&2 ; }
db() { : ; }    # redefined as a no-op: debug output is switched off
# C=$HOME/bin/context && [ -f $C ] && $C specimen xml2 grep awk tr dixf

FILE=${1-data1}
# E=expected-output.txt

pl " Sampled lines from data file $FILE:"
# specimen -n $FILE
head "$FILE"

# Look for a001, b203, and j151
pl " Results, warning message expected:"
xml2 < "$FILE" |
tee f1 |
grep -E '(a001|b203|j151)=' |
tee f2 |
awk -F/ '{print $NF}'|
tee f3 |
awk -F= '{print $2}'|
tee f4 |
tr '\n' '\t' ; echo ""

rm -f f?
exit 0

producing, by using your file name for data1 here:
Code:
$ ./s2 data1

Environment: LC_ALL = C, LANG = C
(Versions displayed with local utility "version")
OS, ker|rel, machine: Linux, 3.16.0-4-amd64, x86_64
Distribution        : Debian 8.7 (jessie) 
bash GNU bash 4.3.30
specimen (local) 1.17
xml2 - ( /usr/bin/xml2, 2012-04-16 )
grep (GNU grep) 2.20
awk GNU Awk 4.1.1, API: 1.1 (GNU MPFR 3.1.2-p3, GNU MP 6.0.0)
tr (GNU coreutils) 8.23
dixf (local) 1.42

-----
 Sampled lines from data file data1:
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE ONIXmessage SYSTEM "http://www.editeur.org/onix/2.1/short/onix-international.dtd" >
<ONIXmessage release="2.1">
<header><m174>Houghton Mifflin</m174><m175>Catherine Toolan 978-465-7755</m175><m283>eloquence@firebrandtech.com</m283><m182>20170201</m182><m183>Title information from Houghton Mifflin</m183><m184>eng</m184><m185>01</m185><m186>USD</m186><m187>in</m187><m193>General Trade</m193></header>
  <product>
    <a001>9781328740472</a001>
    <a002>02</a002>
    <a197>HMH</a197>
    <productidentifier>
      <b221>02</b221>

-----
 Results, warning message expected:
error: Extra content at the end of the document
9781328740472   Peepers 7.99    10.99

Best wishes ... cheers, drl

PS:
It looks like brew, fink, and port each have some version of xml2, even for an old system like:
Code:
OS, ker|rel, machine: Apple/BSD, Darwin 9.8.0, Power Macintosh
Distribution        : Mac OS X 10.5.8 (leopard, workstation)


Last edited by drl; 03-13-2017 at 06:37 PM. Reason: Add stuff for Mac
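
To see what each stage of the s2 pipeline does without having xml2 installed, the flattened output can be simulated. This is only a sketch: the `flattened` function is a hypothetical stand-in for `xml2 < data1`, and the element paths in it (`title`, `supplydetail/price`) are invented for illustration, not taken from the real file. The stages themselves are the ones from s2, minus the tee debugging files.

```shell
#!/usr/bin/env bash

# Hypothetical stand-in for `xml2 < data1`: xml2 prints one path=value
# line per element. The paths below are assumptions for illustration.
flattened() {
  printf '%s\n' \
    '/ONIXmessage/product/a001=9781328740472' \
    '/ONIXmessage/product/a002=02' \
    '/ONIXmessage/product/title/b203=Peepers' \
    '/ONIXmessage/product/supplydetail/price/j151=7.99'
}

# The same stages as s2, without the tee files.
extract() {
  grep -E '(a001|b203|j151)=' |   # keep only the wanted elements
  awk -F/ '{print $NF}' |         # drop the path, leaving name=value
  awk -F= '{print $2}' |          # drop the name, leaving the value
  tr '\n' '\t'; echo ""           # join the values with tabs, end the line
}

flattened | extract
```

Note that the two awk stages would cut short any value that itself contains `/` or `=`, so this approach assumes titles without those characters.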
# 9  
drysdalk - It will be a weekly task, and there are about 2.5M lines per file, so Excel would not be practical, unfortunately. Thanks, though!

---------- Post updated 03-14-17 at 07:13 AM ---------- Previous update was 03-13-17 at 09:10 PM ----------

drl - Thanks again. Yes, I still have that issue with xml2 (line 24). Also, would there be an issue with commenting out

Code:
pl() { pe;pe "-----" ;pe "$*"; }

to eliminate the line break?

Code:
-----
 Sampled lines from data file zzz:
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE ONIXmessage SYSTEM "http://www.editeur.org/onix/2.1/short/onix-international.dtd" >
<ONIXmessage release="2.1">
<header><m174>Houghton Mifflin</m174><m175>Catherine Toolan 978-465-7755</m175><m283>eloquence@firebrandtech.com</m283><m182>20170201</m182><m183>Title information from Houghton Mifflin</m183><m184>eng</m184><m185>01</m185><m186>USD</m186><m187>in</m187><m193>General Trade</m193></header>
  <product>
    <a001>9781328740472</a001>
    <a002>02</a002>
    <a197>HMH</a197>
    <productidentifier>
      <b221>02</b221>

-----
 Results, warning message expected:
./z: line 24: xml2: command not found
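
On the pl question: the script calls pl further down, so commenting the definition out entirely would make those calls fail with "command not found". A possible alternative, sketched here and not taken from the thread, is to keep pl but drop the leading empty `pe` call that produces the blank line.

```shell
#!/usr/bin/env bash

# pe is unchanged from s2: print all arguments, then a newline.
pe() { for _i; do printf "%s" "$_i"; done; printf "\n"; }

# pl without the leading "pe": no blank line before the "-----" rule.
pl() { pe "-----"; pe "$*"; }

pl " Sampled lines from data file data1:"
```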

# 10  
As you didn't specify any restrictions on either input (e.g. pattern repetitions) or output structure (field ordering, multiple lines), this easy approach might be of some interest:
Code:
awk '/a001|b203|j151/ {gsub (/ *<[^>]*> */, _); printf "%s\t", $0} END {printf RS}' file
9781328740472    Peepers    7.99    10.99

# 11  
RudiC - Yes, thanks so much, that is essentially what I need. The output line would be one of many with the same structure, so I think I need a \n somewhere, but couldn't quite get it to work.
# 12  
Where do you need the <line feed>? If after the j151, be aware that there can be several in one record. If you can be sure there's just one, try
Code:
awk '/a001|b203|j151/ {TRS=/j151/?RS:"\t"; gsub (/ *<[^>]*> */, _); printf "%s%s", $0, TRS}' file
9781328740472    Peepers    7.99

# 13  
Hi, palex.

You need to have xml2 installed on your system. As I wrote, it is available for installing on at least the version of Mac OS X that I have, albeit from third parties.

If the solution from RudiC works for you, then use it -- it is simpler than xml2.

Best wishes ... cheers, drl
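
A script that depends on xml2 could test for it up front instead of failing in the middle of the pipeline. This is a minimal sketch; the helper name `have` is made up for illustration, and a real script would probably exit at this point.

```shell
#!/usr/bin/env bash

# Succeed only if the named program can be found on PATH.
have() { command -v "$1" >/dev/null 2>&1; }

if ! have xml2; then
  printf '%s\n' "xml2 not found: install it, or use the awk-only approach" >&2
fi
```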
# 14  
Very close... The following is the first four records (eight lines) of output when I run the command on the entire data file:

Code:
9781328740472   Peepers 7.99
10.99
9780544503205   Curious George Fire Dog Rescue (CGTV reader)    3.99
5.99
9780544574786   Mistakes Were Made (but Not by Me)      15.95
22.50
9781328683786   Tools of Titans 28.00
40.00

I'm not sure where that extra field is coming from. Desired output:

Code:
9781328740472   Peepers 7.99
9780544503205   Curious George Fire Dog Rescue (CGTV reader)    3.99
9780544574786   Mistakes Were Made (but Not by Me)      15.95
9781328683786   Tools of Titans 28.00

Sorry if I was unclear! Thanks again!
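
The extra value is consistent with RudiC's warning that a record can hold several j151 elements: each one ends a line, so the second price lands on a line of its own. One way to keep only the first is to collect the fields per product and print when the record closes. This is a sketch under assumptions: every wanted element sits on its own line, as in the sample, and each record is wrapped in `<product>` ... `</product>`. The function name `extract_first_price` is made up for illustration.

```shell
#!/usr/bin/env bash

# Print one tab-separated line per <product>: a001, b203, first j151.
# Reads the files named as arguments, or standard input if none.
extract_first_price() {
  awk '
    # Remove every tag, plus any spaces around it, from a line.
    function strip(s) { gsub(/ *<[^>]*> */, "", s); return s }

    /<product>/             { isbn = title = price = "" }  # new record
    /<a001>/ && isbn  == "" { isbn  = strip($0) }
    /<b203>/ && title == "" { title = strip($0) }
    /<j151>/ && price == "" { price = strip($0) }          # first price only
    /<\/product>/           { printf "%s\t%s\t%s\n", isbn, title, price }
  ' "$@"
}

# Small demonstration on a record with two j151 elements.
printf '%s\n' \
  '  <product>' \
  '    <a001>9781328740472</a001>' \
  '    <b203>Peepers</b203>' \
  '    <j151>7.99</j151>' \
  '    <j151>10.99</j151>' \
  '  </product>' | extract_first_price
```

On the real data it would be run as `extract_first_price file`. Later occurrences of a001 and b203 within a record are ignored as well, which may or may not be what is wanted.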