Manipulating xml data with awk


 
# 1  
Old 02-22-2013

Hi everyone,

I have a somewhat complicated task to do with awk.
I have a data file in an XML-like format that looks like this:
Code:
<data>
a1 a2 a3 a4 a5
b1 b2 b3 b4 b5
c1 c2 c3 c4 c5
d1 d2 d3 d4 d5
e1 e2 e3 e4 e5
</data>

Let's say each data block contains 5 rows and 5 columns. I have a condition,
and I need to find the row that satisfies it. Then I need to append an extra
field to every row. Its value is calculated from the columns of the row that
satisfies the condition together with the columns of the row being extended.
For example, if row "c" satisfies my condition, the data with the extra field
will look like this:
Code:
<data>
a1 a2 a3 a4 a5 a6
b1 b2 b3 b4 b5 b6
c1 c2 c3 c4 c5 c6
d1 d2 d3 d4 d5 d6
e1 e2 e3 e4 e5 e6
</data>

where the last fields are calculated as follows:
Code:
a6 = c2*a2 + c3*a3 + c4*a4 + c5*a5
b6 = c2*b2 + c3*b3 + c4*b4 + c5*b5
c6 = c2*c2 + c3*c3 + c4*c4 + c5*c5
d6 = c2*d2 + c3*d3 + c4*d4 + c5*d5
e6 = c2*e2 + c3*e3 + c4*e4 + c5*e5

The real calculation may not be as simple as this.

Thanks for any help.
# 2  
Old 02-22-2013
Please post a sample of the input and the desired output.

Mention the condition to be tested for.

Show what the real data looks like and how the "data" blocks are separated from one another.

Also, mention the "real" calculations.

Do not oversimplify. This often leads to frequent changes in requirements with subsequent confusion.
# 3  
Old 02-23-2013
Here are two data blocks from my real data file:
Code:
<data>
21    0.00  0.00  0.57  0.57
21    0.00  0.00 -0.19  0.19
6    -0.63  0.12  0.31  0.37
24   -0.44  0.15  0.25  0.30
-13  -0.23  0.37  0.13  0.14
</data>
<data>
1     0.00  0.00  0.10  0.10
-1    0.00  0.00 -0.66  0.66
6    -0.17  0.40  0.27  0.32
24   -0.48 -0.24  0.12  0.15
-13   0.17 -0.44  0.33  0.18
</data>

Take the row where $1==6 and use its fields (c2 to c5 below) in the
following calculations:
Code:
a6 = c2*a2 - c3*a3 - c4*a4 - c5*a5
b6 = c2*b2 - c3*b3 - c4*b4 - c5*b5
c6 = c2*c2 - c3*c3 - c4*c4 - c5*c5
d6 = c2*d2 - c3*d3 - c4*d4 - c5*d5
e6 = c2*e2 - c3*e3 - c4*e4 - c5*e5

After this operation, these two data blocks will look like this:
Code:
<data>
21    0.00  0.00  0.57  0.57  a6
21    0.00  0.00 -0.19  0.19  b6
6    -0.63  0.12  0.31  0.37  c6
24   -0.44  0.15  0.25  0.30  d6
-13  -0.23  0.37  0.13  0.14  e6
</data>
<data>
1     0.00  0.00  0.10  0.10  a6
-1    0.00  0.00 -0.66  0.66  b6
6    -0.17  0.40  0.27  0.32  c6
24   -0.48 -0.24  0.12  0.15  d6
-13   0.17 -0.44  0.33  0.18  e6
</data>

where the numerical values of the last column are (taking the 2nd data block as an example):
Code:
a6 = (-0.17)*(0.00) -(0.40)*(0.00) -(0.27)*(0.10) -(0.32)*(0.10)   = -0.0590
b6 = (-0.17)*(0.00) -(0.40)*(0.00) -(0.27)*(-0.66)-(0.32)*(0.66)   = -0.0330
c6 = (-0.17)*(-0.17)-(0.40)*(0.40) -(0.27)*(0.27) -(0.32)*(0.32)   = -0.3064
d6 = (-0.17)*(-0.48)-(0.40)*(-0.24)-(0.27)*(0.12) -(0.32)*(0.15)   =  0.0972 
e6 = (-0.17)*(0.17) -(0.40)*(-0.44)-(0.27)*(0.33) -(0.32)*(0.18)   =  0.0004
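
The calculation described above can be sketched in awk by buffering the rows of each block until the closing tag, because the row with $1==6 may come after rows that need its values. This is only a minimal sketch under assumptions that are not stated in the thread: the file name data.xml and the helper name add_col6 are hypothetical, every `<data>` and `</data>` tag sits on its own line, each row has exactly 5 fields, and if several rows in a block have $1==6 the last one is used. A block without such a row is printed unchanged.

```shell
# Hypothetical input file holding the two sample blocks from post #3.
cat > data.xml <<'EOF'
<data>
21    0.00  0.00  0.57  0.57
21    0.00  0.00 -0.19  0.19
6    -0.63  0.12  0.31  0.37
24   -0.44  0.15  0.25  0.30
-13  -0.23  0.37  0.13  0.14
</data>
<data>
1     0.00  0.00  0.10  0.10
-1    0.00  0.00 -0.66  0.66
6    -0.17  0.40  0.27  0.32
24   -0.48 -0.24  0.12  0.15
-13   0.17 -0.44  0.33  0.18
</data>
EOF

# add_col6 FILE: hypothetical helper name, not from the thread.
add_col6() {
    awk '
    /<data>/ {                      # block start: reset the row buffer
        inblock = 1; n = 0; found = 0
        print; next
    }
    /<\/data>/ {                    # block end: compute, print, flush
        for (r = 1; r <= n; r++) {
            if (found) {
                split(row[r], x)    # x[1..5] are the fields of buffered row r
                v = c[2]*x[2] - c[3]*x[3] - c[4]*x[4] - c[5]*x[5]
                printf "%s  %.4f\n", row[r], v
            } else {
                print row[r]        # no row with $1 == 6: print unchanged
            }
        }
        inblock = 0
        print; next
    }
    inblock {
        row[++n] = $0               # buffer the row until the block ends
        if ($1 == 6) {              # remember fields 2..5 of the condition row
            found = 1
            for (i = 2; i <= 5; i++) c[i] = $i
        }
        next
    }
    { print }                       # anything outside a block passes through
    ' "$1"
}

add_col6 data.xml
```

For the second block this should append -0.0590, -0.0330, -0.3064, 0.0972 and 0.0004, matching the hand calculation above. A different formula only requires changing the line that assigns `v`.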

# 4  
Old 02-23-2013
Try this code:
Code:
awk ' /<data>/ {
        f = 1;
        print $0
        next;
} /<\/data>/ && s {
        f = 0;
        m = 0;
        k = 12;
        while(m < j)
        {
                for(i=1;i<=nf;i++)
                {
                        printf "%s\t", a[i,++m];
                        if(i >= 2)
                        {
                                a6 -= (a[i,k] * a[i,m]);
                                ++k;
                        }
                }
                printf "%.4f", a6;
                printf "\n"
                if(k > nf) k=12;
        }
        j = 0;
        print $0;
} f == 1 {
        for(i=1;i<=NF;i++)
        {
                a[i,++j] = $i;
        }
        if(a[1,11] == 6 )
                s = 1;
        if(a[1,11] != 6 )
                s = 0;
        nf = NF;
}' file

# 5  
Old 02-23-2013
Hi bipi,
thanks for your efforts, but this code does not give the desired answers.
Could you please check it again?
# 6  
Old 02-23-2013
Oops, I forgot to reinitialize a6.

Modified code:
Code:
awk ' /<data>/ {
        f = 1;
        print $0
        next;
} /<\/data>/ && s {
        f = 0;
        m = 0;
        k = 12;
        while(m < j)
        {
                for(i=1;i<=nf;i++)
                {
                        printf "%s\t", a[i,++m];
                        if(i >= 2)
                        {
                                a6 -= (a[i,k] * a[i,m]);
                                ++k;
                        }
                }
                printf "%.4f", a6;
                a6 = 0;
                printf "\n"
                if(k > nf) k=12;
        }
        j = 0;
        print $0;
} f == 1 {
        for(i=1;i<=NF;i++)
        {
                a[i,++j] = $i;
        }
        if(a[1,11] == 6 )
                s = 1;
        if(a[1,11] != 6 )
                s = 0;
        nf = NF;
}' file

Here is the output that I am getting:
Code:
$ ./hayreter
<data>
21      0.00    0.00    0.57    0.57    -0.3876
21      0.00    0.00    -0.19   0.19    -0.0114
6       -0.63   0.12    0.31    0.37    -0.6443
24      -0.44   0.15    0.25    0.30    -0.4837
-13     -0.23   0.37    0.13    0.14    -0.2814
</data>
<data>
1       0.00    0.00    0.10    0.10    -0.0590
-1      0.00    0.00    -0.66   0.66    -0.0330
6       -0.17   0.40    0.27    0.32    -0.3642
24      -0.48   -0.24   0.12    0.15    -0.0660
-13     0.17    -0.44   0.33    0.18    0.0582

# 7  
Old 02-23-2013
This is getting better :)
The first 2 rows are correct, but starting with the 3rd row (the one that
satisfies the condition) the answers are wrong.

To check your results, look at the table that I posted yesterday above.
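
A possible explanation, inferred only from the code and numbers posted in this thread: the loop in post #6 runs `a6 -= ...` for every field from 2 to 5, so the first product c2*x2 is subtracted, whereas the formula in post #3 adds it. Rows 1 and 2 still match because their second field is 0.00. The small check below uses the $1==6 row of the second block. The helper name sign_check is hypothetical.

```shell
# sign_check: hypothetical helper name, used only for this illustration.
sign_check() {
    awk 'BEGIN {
        # Fields 2..5 of the row with $1 == 6 in the second block of post #3.
        c2 = -0.17; c3 = 0.40; c4 = 0.27; c5 = 0.32
        # Formula from post #3: the first product is added.
        printf "added      %.4f\n",  c2*c2 - c3*c3 - c4*c4 - c5*c5
        # What the loop in post #6 computes: every product is subtracted.
        printf "subtracted %.4f\n", -c2*c2 - c3*c3 - c4*c4 - c5*c5
    }'
}

sign_check
```

The first line should print -0.3064, the expected value from post #3, and the second -0.3642, the value shown in the output of post #6.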