Manipulating xml data with awk


 
# 8  
Old 02-23-2013
OK, fixed it, though I don't know why the previous code didn't work!
Code:
awk ' /<data>/ {
        f = 1;
        print $0
        next;
} /<\/data>/ && s {
        f = 0;
        m = 0;
        k = 12;
        while(m < j)
        {
                for(i=1;i<=nf;i++)
                {
                        printf "%s\t", a[i,++m];
                        if(i >= 2)
                        {
                                b[i] = (a[i,k] * a[i,m]);
                                ++k;
                        }
                }
                a6 = b[2] - b[3] - b[4] - b[5];
                printf "%.4f", a6;
                a6 = 0;
                printf "\n"
                if(k > nf) k=12;
        }
        j = 0;
        print $0;
} f == 1 {
        for(i=1;i<=NF;i++)
        {
                a[i,++j] = $i;
        }
        if(a[1,11] == 6 )
                s = 1;
        if(a[1,11] != 6 )
                s = 0;
        nf = NF;
}' file

Current O/P:
Code:
$ ./hayreter
<data>
21      0.00    0.00    0.57    0.57    -0.3876
21      0.00    0.00    -0.19   0.19    -0.0114
6       -0.63   0.12    0.31    0.37    0.1495
24      -0.44   0.15    0.25    0.30    0.0707
-13     -0.23   0.37    0.13    0.14    0.0084
</data>
<data>
1       0.00    0.00    0.10    0.10    -0.0590
-1      0.00    0.00    -0.66   0.66    -0.0330
6       -0.17   0.40    0.27    0.32    -0.3064
24      -0.48   -0.24   0.12    0.15    0.0972
-13     0.17    -0.44   0.33    0.18    0.0004
</data>
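
To read the last column: for each row, the second field is multiplied by the second field of the row starting with 6, and the products of fields 3 to 5 are then subtracted from it. Here is a minimal hand check of the first row of the second block, with the values copied from the output above. The function name row_check is only an illustration and is not part of the script.

```shell
# Hypothetical helper: recompute the last column for one row by hand.
row_check() {
    awk 'BEGIN {
        # row starting with 6: 6 -0.17 0.40 0.27 0.32
        # first row:           1  0.00 0.00 0.10 0.10
        printf "%.4f\n", (-0.17 * 0.00) - (0.40 * 0.00) - (0.27 * 0.10) - (0.32 * 0.10)
    }'
}
row_check    # prints -0.0590, matching the first row of the second block
```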

# 9  
Old 02-23-2013
Try this (with assumptions about your data):
Code:
awk '/^[ \t]*<data>/{data=1;countrow=0;rec[countrow++]=$0;next}
/^[ \t]*<\/data>/{
 if(output) {
  print rec[0]
  n=split(rec[foundat],pattline)
  for(i=1;i<countrow;i++) {
   split(rec[i],otherline)
   c=0
   for(j=2;j<=n;j++) {
    if(j==2) {c=pattline[j]*otherline[j];continue}
    c-=(pattline[j]*otherline[j])
   }
   print rec[i],c
  }
  print
 } 
 else
  for(i in rec) delete rec[i]
 output=foundat=data=0;next
}
data{
 if(!output && $1=="6") { output=1; foundat=countrow }
 rec[countrow++]=$0
}' file

# 10  
Old 02-23-2013
Both scripts work quite well; thank you both so much.
Would you mind explaining your code line by line? I would appreciate it a lot.
Thanks again.
# 11  
Old 02-24-2013
Here is a brief explanation:
Code:
awk ' /<data>/ {                                                # Search for pattern: <data>
        f = 1;                                                  # If found set f = 1
        print $0                                                # Print <data> tag
        next;                                                   # Stop processing current record
} /<\/data>/ && s {                                             # On </data>, but only if s is set (row c starts with 6)
        f = 0;                                                  # If found set f = 0
        m = 0;                                                  # m = 0  (counter for fetching all array elements)
        k = 12;                                                 # k = 12 (index of 2nd element in row c)
        while(m < j)                                            # While m < j
        {
                for(i=1;i<=nf;i++)                              # For each field of the current row
                {
                        printf "%s\t", a[i,++m];                # Print the field
                        if(i >= 2)                              # If i >= 2 (from the second field on)
                        {
                                b[i] = (a[i,k] * a[i,m]);       # Multiply the current field by the matching row c field
                                ++k;
                        }
                }
                a6 = b[2] - b[3] - b[4] - b[5];                 # Subtract elements in array b and assign to a6
                printf "%.4f", a6;                              # Print a6
                a6 = 0;                                         # Reinitialize a6 = 0
                printf "\n"
                if(k > nf) k=12;                                # k is 16 here, so it is always reset to 12 (start of row c)
        }
        j = 0;
        print $0;                                               # Print </data> tag
} f == 1 {                                                      # If f == 1
        for(i=1;i<=NF;i++)                                      # Creating a 2D array with elements in <data> tag.
        {
                a[i,++j] = $i;
        }
        if(a[1,11] == 6 )                                       # If row c first element == 6
                s = 1;                                          # Set s = 1
        if(a[1,11] != 6 )                                       # If row c first element != 6
                s = 0;                                          # Set s = 0
        nf = NF;                                                # Set nf = number of fields in the line (NF)
}' file

# 12  
Old 02-24-2013
Thanks again for this brief (but quite detailed) explanation.
I have just one question: how did you identify row c?
And if the condition were satisfied in row d instead of c,
would this script still work, or is it not that generic?
# 13  
Old 02-24-2013
Here are the indices (i,j) of each data element when stored in the array:
Code:
1   (1,1)       0.00  (2,2)     0.00  (3,3)     0.10  (4,4)     0.10 (5,5)
-1  (1,6)       0.00  (2,7)     0.00  (3,8)     -0.66 (4,9)     0.66 (5,10)
6   (1,11)      -0.17 (2,12)    0.40  (3,13)    0.27  (4,14)    0.32 (5,15)
24  (1,16)      -0.48 (2,17)    -0.24 (3,18)    0.12  (4,19)    0.15 (5,20)
-13 (1,21)      0.17  (2,22)    -0.44 (3,23)    0.33  (4,24)    0.18 (5,25)

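This is why the code uses j indices 11 to 15 to identify the row c elements: a[1,11] is the first field of row c, and k runs from 12 to 15 over its remaining fields. If the condition is met in a different row, you have to change these j indices accordingly. I hope that makes sense.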
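
If you would rather not edit the indices by hand, one option is to store the values by (row, field) and remember which row starts with 6. The following is only a sketch of that idea, not the script posted above: the function name calc_rows and the variables rows, ref and opentag are made up for the example, the input is assumed to be whitespace-separated with one row per line, and blocks without a row starting with 6 are dropped entirely (as the second script in this thread does).

```shell
# Sketch: same arithmetic, but the reference row is found by its first field
# instead of the hard-coded indices 11-15. All names here are hypothetical.
calc_rows() {
    awk '
    /<data>/   { inblock = 1; rows = 0; ref = 0; opentag = $0; next }
    /<\/data>/ {
        if (ref) {                                  # block has a row starting with 6
            print opentag
            for (r = 1; r <= rows; r++) {
                sum = a[ref, 2] * a[r, 2]           # first product keeps its sign
                for (i = 3; i <= nf; i++)
                    sum -= a[ref, i] * a[r, i]      # remaining products are subtracted
                line = a[r, 1]
                for (i = 2; i <= nf; i++)
                    line = line "\t" a[r, i]
                printf "%s\t%.4f\n", line, sum
            }
            print                                   # the </data> tag
        }
        inblock = 0
        next
    }
    inblock {
        rows++
        for (i = 1; i <= NF; i++) a[rows, i] = $i   # index by (row, field)
        if (!ref && $1 == 6) ref = rows             # remember the reference row
        nf = NF
    }' "$@"
}
```

Call it as `calc_rows file`. With this layout it does not matter whether the row starting with 6 is row c, row d or any other row.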
# 14  
Old 02-24-2013
Don't worry, I understood perfectly what you mean :)
Thanks a lot, bipi.