Help with shell script to extract data from XML file


 
# 1  
Old 09-03-2008

Hello Scripting Gurus,

I need help extracting data from an XML file using a shell script.
The data is in a large XML file, and I need to extract the id values of all completedworkflows. Here is a sample. The input and output data are also in the attached text files.

<wfregistry>
<completedworkflows>
<id v="3381"/>
<id v="3399"/>
<id v="3415"/>
<id v="3431"/>
<id v="3447"/>
<id v="3463"/>
<id v="3479"/>
<id v="3495"/>
<id v="3511"/>
<id v="3527"/>
<id v="3543"/>
<id v="3559"/>
<id v="3575"/>
<id v="3591"/>
<id v="3607"/>
</completedworkflows>
<completedtasks>
<id v="3383"/>
<id v="3389"/>
<id v="3390"/>
<id v="3401"/>
<id v="3407"/>
<id v="3408"/>
<id v="3417"/>
<id v="3423"/>
<id v="3424"/>
<id v="3433"/>
<id v="3439"/>
<id v="3440"/>
<id v="3449"/>
<id v="3455"/>
<id v="3456"/>
<id v="3465"/>
<id v="3471"/>
</completedtasks>
</wfregistry>

The output has to be a list of all completed tasks, i.e.:

3381
3399
3415
3431
3447
3463
3479
3495
3511
3527
3543
3559
3575
3591
3607

Your help is highly appreciated.

Thank you,
Ajay.
# 2  
Old 09-03-2008
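You have to print the second field, using the double quote ( " ) as the field separator.

Try this:
nawk 'BEGIN { FS = "\"" } /<id v=/ { print $2 }' input_file

The field separator has to be a single double-quote character, written as "\"" inside the awk program. Note that this prints the id values from every section of the file, not only from completedworkflows.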
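
To keep only the completedworkflows ids, the quote-as-separator idea can be combined with a flag that tracks which section the current line belongs to. This is a minimal sketch, assuming one element per line as in the sample; extract_workflow_ids is a hypothetical name, and plain awk is assumed in place of nawk.

```shell
# Hypothetical helper (not from the thread): print the v="..." values of
# the <id> elements found inside <completedworkflows> only.
# Assumes each tag sits on its own line, as in the sample data.
extract_workflow_ids() {
    awk -F'"' '
        /<completedworkflows>/   { inside = 1; next }  # entering the section
        /<\/completedworkflows>/ { inside = 0 }        # leaving the section
        inside && /<id v=/       { print $2 }          # text between the quotes
    ' "$1"
}
```

It would be called as: extract_workflow_ids input_file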


Regards
# 3  
Old 09-03-2008
Rather than using grep, sed or awk to transform the XML into the required output, it is better to use an XSLT processor.

If you have the (GNOME) libxslt library installed, it comes with xsltproc, a command-line interface to an XSLT 1.0 compliant processor.

Here is a stylesheet which performs the required transformation using xsltproc:
Code:
$ cat file.xsl
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:output method="text" omit-xml-declaration="yes" />

<xsl:template match="id">
    <xsl:value-of select="@v"/><xsl:text>&#xa;</xsl:text>
</xsl:template>

<xsl:template match="/">
  <xsl:apply-templates select="/wfregistry/completedworkflows/id" />
</xsl:template>

</xsl:stylesheet>

$ xsltproc file.xsl file.xml
3381
3399
3415
3431
3447
3463
3479
3495
3511
3527
3543
3559
3575
3591
3607
$

# 4  
Old 09-04-2008
Use this command; it works quickly:
awk -F "\"" '/id/ {print $2}' test
# 5  
Old 09-04-2008
Another one :

Code:
awk '/id v=/ { print }' filename  | sed 's!<id v=\"\(.*\)\"/>!\1!'
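
The awk filter and the sed substitution above can also be folded into a single sed call. The following is only a sketch, assuming each id element sits on its own line as in the sample; workflow_ids_sed is a hypothetical helper name, and the range address additionally limits the output to the completedworkflows block, which the pipeline above does not do.

```shell
# Hypothetical helper (not from the thread): print the v="..." values of
# <id> elements that lie between <completedworkflows> and its closing tag.
# Assumes one <id .../> element per line, as in the sample data.
workflow_ids_sed() {
    # -n suppresses default output; the trailing p prints only lines
    # inside the address range where the substitution succeeded.
    sed -n '/<completedworkflows>/,/<\/completedworkflows>/s!.*<id v="\([^"]*\)"/>.*!\1!p' "$1"
}
```

It would be called as: workflow_ids_sed filename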

# 6  
Old 09-04-2008
Your request is not very clear.
Quote:
Originally Posted by yajaykumar
The data is in a large XML and I need to extract the id values of all completedworkflows.
........

The output has to be a list of all completed tasks, i.e.:
....
Based on your required output, this should work for you.
Code:
awk -F'"' '/complet/ && !f {f=1; next} /complet/ && f {exit} f {print $2}' file


Last edited by danmero; 09-04-2008 at 08:27 AM.. Reason: fix typo