data extraction from xml file


 
# 1  
Old 08-26-2010
Question data extraction from xml file

I have an XML file as shown below:
Code:
<?xml version='1.0' encoding='ASCII' standalone='yes' ?>
<Station Index="10264" >
   <Number Value="237895890" />
   <Position Lat="-29.5" Lon="3.5" />
   <MaxDepth Value="-4939" />
   <VeloLines Count="24">
      <VeloLine Index="0" >
         <Depth Value="0" />
         <Temperature Count="12" >
            2183
            2200
            2253
            2135
            2028
            1859
            1831
            1751
            1740
            1762
            1869
            1996
         </Temperature>
         <Salinity Count="12" >
            3577
            3586
            3583
            3582
            3575
            3580
            3576
            3575
            3566
            3567
            3561
            3569
         </Salinity>
      </VeloLine>
      <VeloLine Index="1" >
         <Depth Value="10" />
         <Temperature Count="12" >
            2155
            2188
            2254
            2128
            2020
            1854
            1810
            1739
            1732
            1749
            1850
            1964
         </Temperature>
         <Salinity Count="12" >
            3576
            3583
            3573
            3581
            3575
            3580
            3575
            3574
            3567
            3567
            3562
            3577
         </Salinity>

The Temperature element gives the temperature for 12 months and the Salinity element gives the salinity for 12 months. The file runs to 752 lines.

Question: I want to extract specific data (depth, temperature and salinity) from the file, i.e. the temperature of the first month (the first value in Temperature) and the corresponding salinity for that month (the first value in Salinity), in this form:

Code:
depth     temp    salinity
    0          2183     3577
    10        xxxxxx   xxxxx
    20        yyyyyy   yyyyyy
etc

The result should be written to a file.

My idea: the temperature values start at line 10 for depth 0 and at line 41 for depth 10, and so on, so there are 31 lines between them; the same holds for salinity and for depth. I would use awk with a for loop to extract the data. Will this work? Or are there other ways to implement this?

I have attached the XML file for reference.
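
The line-offset idea can work as long as every VeloLine block has exactly the same number of lines. Assuming all 24 blocks look like the two shown, 6 header lines plus 24 blocks of 31 lines plus 2 closing lines gives 752, which agrees with the stated file length. The following is only a minimal sketch of that approach on a shortened sample with two values per element, so the offsets used here (8, 10, 14, step 11) belong to the sample; for the layout described above they would be 8, 10, 24 and 31. The file name sample_offsets.xml and the variable names d0, t0, s0 and step are made up for illustration.

```shell
#!/bin/sh
# Shortened sample in the same layout as the thread's file
# (2 values per element instead of 12; hypothetical file name).
cat > sample_offsets.xml <<'EOF'
<?xml version='1.0' encoding='ASCII' standalone='yes' ?>
<Station Index="10264" >
   <Number Value="237895890" />
   <Position Lat="-29.5" Lon="3.5" />
   <MaxDepth Value="-4939" />
   <VeloLines Count="2">
      <VeloLine Index="0" >
         <Depth Value="0" />
         <Temperature Count="2" >
            2183
            2200
         </Temperature>
         <Salinity Count="2" >
            3577
            3586
         </Salinity>
      </VeloLine>
      <VeloLine Index="1" >
         <Depth Value="10" />
         <Temperature Count="2" >
            2155
            2188
         </Temperature>
         <Salinity Count="2" >
            3576
            3583
         </Salinity>
      </VeloLine>
   </VeloLines>
</Station>
EOF

# d0/t0/s0 = line numbers of the first Depth line, the first temperature
# value and the first salinity value; step = lines per VeloLine block.
out=$(awk -v d0=8 -v t0=10 -v s0=14 -v step=11 '
BEGIN { print "depth\ttemp\tsalinity" }
# Depth line: $2 is Value="N"; split on double quotes to get N
NR >= d0 && (NR - d0) % step == 0 { split($2, a, "\""); depth = a[2] }
# First temperature value of the block
NR >= t0 && (NR - t0) % step == 0 { temp = $1 }
# First salinity value of the block: the row is complete, print it
NR >= s0 && (NR - s0) % step == 0 { printf "%s\t%s\t%s\n", depth, temp, $1 }
' sample_offsets.xml)

printf '%s\n' "$out"
```

Fixed offsets pick the wrong lines silently if any block gains or loses a line, so matching on the tag names is usually the more robust choice.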
# 2  
Old 08-26-2010
Code:
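awk '
# Print the header once before reading any input
BEGIN{print "depth\ttemp\tsalinity"}
# Depth line: $2 is Value="N"; split on double quotes to get N
/<Depth/ {split($2,a,"\"");depth=a[2]}
# Opening tag: the next line holds the value for the first month
/<Temperature/ {getline;temp=$1}
# Same for salinity, then print the row for this depth
/<Salinity/ {getline;salinity=$1;printf "%d\t%d\t%d\n",depth,temp,salinity}
' lat_060_lon_003.xml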

depth   temp    salinity
0       2183    3577
10      2155    3576
20      2132    3575
30      2103    3574
50      1942    3574
75      1755    3567
100     1671    3561
125     1622    3558
150     1584    3555
200     1478    3543
250     1373    3530
300     1288    3520
400     1109    3498
500     920     3475
600     735     3457
700     576     3443
800     491     3435
900     425     3435
1000    384     3438
1100    350     3444
1200    322     3450
1300    311     3457
1400    303     3462
1500    301     3467

This User Gave Thanks to rdcwayx For This Post:
# 3  
Old 08-26-2010
The program works perfectly. I have one more question.
Accessing the first month works fine, but if I want the second or third month (the second or third number in the Temperature or Salinity element), is there a way to specify a variable along with "getline" that I can change to choose the month I want?

Right now I am calling getline 2, 3 or n times to shift the month:
Code:
awk '
BEGIN{print "depth\ttemp\tsalinity"}
/<Depth/ {split($2,a,"\"");depth=a[2]}
/<Temperature/ {getline;getline;temp=$1} 
/<Salinity/ {getline;getline;salinity=$1;printf "%s\t%d\t%d\n",depth,temp,salinity}
' lat_060_lon_003.xml

Is it possible to assign a variable that controls how many lines it should jump each time?

Last edited by shashi792; 08-26-2010 at 12:16 PM..
# 4  
Old 08-26-2010
Something like this:
you can assign the month to a variable and loop over getline.
Code:
awk -v mon=3 '
BEGIN{print "depth\ttemp\tsalinity"}
/<Depth/ {split($2,a,"\"");depth=a[2]}
/<Temperature/ {for (i=1;i<=mon;i++) {getline};temp=$1}
/<Salinity/ {for (i=1;i<=mon;i++) {getline};salinity=$1;printf "%d\t%d\t%d\n",depth,temp,salinity}
' lat_060_lon_003.xml
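
As a possible variant, the month can also be selected without calling getline in a loop, by remembering which element is currently open and counting its value lines. This is only a sketch under the assumption that each value sits on its own line, as in the sample above; the file name sample_station.xml, the function name extract_month and the shortened data (3 values per element) are made up for illustration.

```shell
#!/bin/sh
# Small sample in the same layout as the thread's XML
# (3 values per element instead of 12; hypothetical file name).
cat > sample_station.xml <<'EOF'
<?xml version='1.0' encoding='ASCII' standalone='yes' ?>
<Station Index="10264" >
   <VeloLines Count="2">
      <VeloLine Index="0" >
         <Depth Value="0" />
         <Temperature Count="3" >
            2183
            2200
            2253
         </Temperature>
         <Salinity Count="3" >
            3577
            3586
            3583
         </Salinity>
      </VeloLine>
      <VeloLine Index="1" >
         <Depth Value="10" />
         <Temperature Count="3" >
            2155
            2188
            2254
         </Temperature>
         <Salinity Count="3" >
            3576
            3583
            3573
         </Salinity>
      </VeloLine>
   </VeloLines>
</Station>
EOF

# extract_month MONTH FILE: print depth, temp and salinity for that month.
extract_month() {
  awk -v mon="$1" '
  BEGIN { print "depth\ttemp\tsalinity" }
  /<Depth/         { split($2, a, "\""); depth = a[2] }
  # Opening tags: remember which element is open and restart the counter
  /<Temperature/   { field = "temp"; temp = ""; n = 0; next }
  /<Salinity/      { field = "sal";  sal = "";  n = 0; next }
  # Closing tags: the salinity close completes one row
  /<\/Temperature/ { field = ""; next }
  /<\/Salinity/    { field = ""; printf "%s\t%s\t%s\n", depth, temp, sal; next }
  # Value line inside an open element: keep only the requested month
  field != "" && NF {
    n++
    if (n == mon) { if (field == "temp") temp = $1; else sal = $1 }
  }
  ' "$2"
}

result=$(extract_month 2 sample_station.xml)
printf '%s\n' "$result"
```

If the requested month is larger than the number of values in an element, this version leaves the column empty instead of reading past the closing tag.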
