Post 302965474 by VasuKukkapalli on Friday 29th of January 2016 01:21:19 PM
XML Fields comparison using awk script

Hello All,

I have many zipped XML files (tgz format; example file name: file_rec.trx.2016-01-23.000123.exc.85sesdzd45wsds5299c8f2994f7.tgz) that look like the sample below. In each record I need to verify two numbers: RecordNumber and the sequence number inside EnrolData (only the sequence number, NOT the whole EnrolData value).
For all records the two should be equal, but because of an error, in some records the RecordNumber is NOT the same as the EnrolData sequence number. I need to find out which records are affected and which files they are in. Could someone please help me? I have been trying to do this with the following awk script.

XML Format:
Code:
<XXXXXXXXXXXXX>
    <RecordNumber>12345</RecordNumber>
    <XXXXXX>XXXXXX</XXXXXX>
    <XXXXXX>XXXXXX</XXXXXX>
    <XXXXXX>XXXXXX</XXXXXX>
    <XXXXXXXXXXXXX><![CDATA[XXXXXXXXXXXXXX:XXXXXXXXXXXXX XXXX XXXXXX]]></XXXXXXXXXXXXX>
    <EnrolData><![CDATA[E0000003350000000012345Part1              XXXXXX
	XXXXXXXXXXXXXXXX                                            XXXXXXXXXXXXXXX:XXXXXXXXXXXXXXXXXXXXXXXXXXX.XXXXXXXXXXXXXXXXXXXXXXXXXXX.XXX   
	XXXXXXXXXXXXXXX  
	XXXX                                                                                                                                                      
	
XXXXXXXXXXXXXXXXX                    XXXX                                XXXXXXXXXXXXX.XXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXX.XXXXXX                                                        XXXXXXXXXXXXXX                                      
        XXXX                          XXXX                          XXXXXXXXXXXXX                XXXX                          XXXXXXXXXXXXX       
		XXXX                          XXXXXXXXXXXXX                X                            
		XXXXXXXXXXXXX                                                                                             		
		XXXXXXXXXXXXXXXXXXX.XXXXXXXXXXXXXXXXXXXX.XXX                           XXX
]]></EnrolData>
</XXXXXXXXXXXXX>
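
As a minimal sketch of the extraction this sample calls for, the digits sitting right before Part1 can be pulled out with awk's match() and then checked against the record number. The helper names seq_digits and ends_with_record are made up for illustration, and the check only tests that the digit run ends with the record number, which assumes the sequence number is the trailing part of that run.

```shell
#!/bin/sh
# Print the run of digits that sits immediately before "Part1" on the
# EnrolData line read from standard input.
seq_digits() {
    awk '/<EnrolData><!\[CDATA\[/ && match($0, /[0-9]+Part1/) {
        print substr($0, RSTART, RLENGTH - 5)   # drop the 5 characters of "Part1"
    }'
}

# Succeed when DIGITS ends with RECORDNUMBER (hypothetical helper).
ends_with_record() {   # usage: ends_with_record DIGITS RECORDNUMBER
    case $1 in
        *"$2") return 0 ;;
        *)     return 1 ;;
    esac
}

line='    <EnrolData><![CDATA[E0000003350000000012345Part1              XXXXXX'
digits=$(printf '%s\n' "$line" | seq_digits)
if ends_with_record "$digits" 12345; then
    echo "match: $digits"
else
    echo "mismatch: $digits"
fi
```

A suffix test like this cannot tell 12345 from 2345, so it is only a first filter; a stricter check needs to know where the sequence number starts.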

Script that I am trying:
Code:
#!/bin/sh
# A shell glob cannot use regex quantifiers such as {4}(\d), so spell the date out.
for file in file_rec.trx.[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9].*.exc.*.tgz
do
    [ -e "$file" ] || continue    # the glob matched nothing
    # awk cannot read a .tgz directly, so stream the archive members to stdout.
    tar -xzOf "$file" | awk -v file="$file" '
    /<RecordNumber>/ {
        # The number is on the same line as its tags, so no getline loop is needed.
        rNumber = $0
        gsub(/[^0-9]/, "", rNumber)
        next
    }

    /<EnrolData><!\[CDATA\[/ {
        # The sequence number is always right before "Part1", but neither it nor
        # the digits between "E" and it have a fixed width.
        # Assumption: only the last length(rNumber) digits before "Part1" are compared.
        if (match($0, /[0-9]+Part1/)) {
            digits = substr($0, RSTART, RLENGTH - 5)
            eData = substr(digits, length(digits) - length(rNumber) + 1)
            # Report the records where the two numbers differ.
            # Format: <filename> : <RecordNumber> - <EnrolData sequence number>
            if (rNumber != eData)
                print file " : " rNumber " - " eData
        }
    }'
done
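
To try a script like the one above without touching real data, one option is to build a small throwaway archive that holds one consistent and one inconsistent record. This is only a sketch: the function name make_sample_archive, the Record wrapper tag, the record values and the archive name are all invented for illustration.

```shell
#!/bin/sh
# Build a throwaway .tgz holding two records: one consistent, one not.
# All names and values here are invented for illustration.
make_sample_archive() {   # usage: make_sample_archive DIRECTORY
    dir=$1
    cat > "$dir/sample.xml" <<'EOF'
<Record>
    <RecordNumber>12345</RecordNumber>
    <EnrolData><![CDATA[E0000003350000000012345Part1              XXXXXX
]]></EnrolData>
</Record>
<Record>
    <RecordNumber>12346</RecordNumber>
    <EnrolData><![CDATA[E0000003350000000099999Part1              XXXXXX
]]></EnrolData>
</Record>
EOF
    archive=$dir/file_rec.trx.2016-01-23.000001.exc.sample.tgz
    tar -czf "$archive" -C "$dir" sample.xml
    printf '%s\n' "$archive"
}

tmpdir=$(mktemp -d)
archive=$(make_sample_archive "$tmpdir")
# Stream the archive to stdout (-O) instead of unpacking it to disk,
# then count the records it contains.
tar -xzOf "$archive" | grep -c '<RecordNumber>'
rm -rf "$tmpdir"
```

Running a checker from the directory that holds this archive should flag only the second record, since 12346 does not appear before Part1 there.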


Last edited by VasuKukkapalli; 01-29-2016 at 02:44 PM..
 
