Extract multiple xml tag value into CSV format


 
# 1  
Old 04-01-2011

Hi All,

I need your assistance with another XML tag issue. I have an XML file as below:
HTML Code:
<INVOICES>
<INVOICE>
<BILL>
<BILL_NO>1234</BILL_NO>
<BILL_DATE>01 JAN 2011</BILL_DATE>
</BILL>
<NAMEINFO>
<NAME>ABC</NAME>
</NAMEINFO>
</INVOICE>
<INVOICE>
<BILL>
<BILL_NO>5678</BILL_NO>
<BILL_DATE>01 JAN 2011</BILL_DATE>
</BILL>
<NAMEINFO>
<NAME>BCA</NAME>
</NAMEINFO>
</INVOICE>
<INVOICE>
<BILL>
<BILL_NO>1256</BILL_NO>
<BILL_DATE>01 JAN 2011</BILL_DATE>
</BILL>
<NAMEINFO>
<NAME></NAME>
</NAMEINFO>
</INVOICE>
<INVOICE>
<BILL>
<BILL_NO>345</BILL_NO>
<BILL_DATE>01 JAN 2011</BILL_DATE>
</BILL>
<NAMEINFO>
<NAME/>
</NAMEINFO>
</INVOICE>
<INVOICE>
<BILL>
<BILL_NO>8934</BILL_NO>
<BILL_DATE>01 JAN 2011</BILL_DATE>
</BILL>
<NAMEINFO>
<NAME>PKL</NAME>
</NAMEINFO>
</INVOICE>
</INVOICES>
I need the CSV file in the following format:

HTML Code:
1234,ABC
5678,BCA
1256,NA
345,NA
8934,PKL
The XML tag for NAME is not consistent. Is this achievable? Your help is highly appreciated.

Thanks
Angshuman

Last edited by angshuman; 04-01-2011 at 02:24 AM.. Reason: spelling mistake
# 2  
Old 04-01-2011
Code:
awk -F'>|<' '/BILL_NO/{printf $3}/NAME\>/{print NF==3?",NA":","$3}'

# 3  
Old 04-01-2011
Hi Yinyuemi,

I tried your code, but I am not sure where to put the file name. Also, I am running it on HP-UX.

Thanks
Angshuman
# 4  
Old 04-01-2011
Please try:

Code:
awk -F'>|<' '/BILL_NO/{printf $3}/NAME\>/{print NF==3?",NA":","$3}' urfile

# 5  
Old 04-01-2011
Hi Yinyuemi,

I tried and got the following error:

Code:
syntax error The source line is 1.
 The error context is
                /BILL_NO/{printf $3}/NAME\>/{print >>>  NF== <<<
 awk: The statement cannot be correctly parsed.
 The source line is 1.

# 6  
Old 04-01-2011
How about this?

Code:
awk -F'>|<' '/BILL_NO/{printf $3}/NAME\>/{if(NF==3) {print ",NA"} else{print ","$3}}' file

or:
Code:
awk -F'>|<' '/BILL_NO/{printf $3}/NAME\>/{if($2~/\//) {print ",NA"} else{print ","$3}}' file

# 7  
Old 04-01-2011
Hi Yinyuemi,

Both modified commands run without any error, but the text NA does not appear for two of the bills.

Code:
awk -F'>|<' '/BILL_NO/{printf $3}/NAME\>/{if(NF==3) {print ",NA"} else{print ","$3}}' myfile

The output is:

Code:
1234,ABC
5678,BCA
1256,
3458934,PKL

The expected was:

Code:
1234,ABC
5678,BCA
1256,NA
345,NA
8934,PKL

If you look at the XML file, you will see that for bill number 1256 the name tag is "<NAME></NAME>", whereas for bill number 345 it is "<NAME/>".
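
One possible way to handle both empty forms is to compare the tag name as a whole field instead of matching it with a regular expression. The sketch below is not from the thread: it assumes a POSIX-compatible awk, assumes each tag sits on its own line as in the sample, and uses a hypothetical file name (invoices_sample.xml) holding a shortened copy of the input. Splitting on `<` and `>` puts the tag name in $2 and its text in $3, so "<NAME></NAME>" gives $2 equal to "NAME" with an empty $3, and "<NAME/>" gives $2 equal to "NAME/" with an empty $3.

```shell
# Hypothetical sample file: a shortened copy of the input from the first post.
cat > invoices_sample.xml <<'EOF'
<INVOICES>
<INVOICE>
<BILL>
<BILL_NO>1234</BILL_NO>
<BILL_DATE>01 JAN 2011</BILL_DATE>
</BILL>
<NAMEINFO>
<NAME>ABC</NAME>
</NAMEINFO>
</INVOICE>
<INVOICE>
<BILL>
<BILL_NO>1256</BILL_NO>
<BILL_DATE>01 JAN 2011</BILL_DATE>
</BILL>
<NAMEINFO>
<NAME></NAME>
</NAMEINFO>
</INVOICE>
<INVOICE>
<BILL>
<BILL_NO>345</BILL_NO>
<BILL_DATE>01 JAN 2011</BILL_DATE>
</BILL>
<NAMEINFO>
<NAME/>
</NAMEINFO>
</INVOICE>
<INVOICE>
<BILL>
<BILL_NO>8934</BILL_NO>
<BILL_DATE>01 JAN 2011</BILL_DATE>
</BILL>
<NAMEINFO>
<NAME>PKL</NAME>
</NAMEINFO>
</INVOICE>
</INVOICES>
EOF

# Split every line on "<" or ">"; $2 is then the tag name and $3 its text.
out=$(awk -F'[<>]' '
# Remember the bill number until the matching NAME line is seen.
$2 == "BILL_NO" { bill = $3 }
# "NAME" covers <NAME>..</NAME>; "NAME/" covers the self-closing <NAME/>.
# NAMEINFO is not matched because the whole field is compared.
$2 == "NAME" || $2 == "NAME/" {
    if ($3 == "") {
        name = "NA"
    } else {
        name = $3
    }
    print bill "," name
}' invoices_sample.xml)

printf '%s\n' "$out"
```

The if/else form is used rather than a ternary inside print because the if/else variants earlier in the thread were the ones reported to run without a parse error. This remains a line-oriented shortcut, not an XML parser: it would need changes if several tags shared one line or if a NAME value contained "<" or ">".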