Shell Programming and Scripting: Post 302916298 by Ariean on Monday 8th of September 2014 03:41:23 PM
How to find differences between 2 XML files?

Hello All,

The requirement is to compare two XML files and see whether there are any differences. However, some of our providers send us UTF-16 encoded XML files with no end-of-line characters, as shown below.

Excerpt of data file:
Code:
ÿþ<^@?^@x^@m^@l^@ ^@v^@e^@r^@s^@i^@o^@n^@=^@"^@1^@.^@0^@"^@ ^@e^@n^@c^@o^@d^@i^@n^@g^@=^@"^@U^@T^@F^@-^@1^@6^@"^@ ^@?^@>^@<^@P^@r^@o^@v^@i^@d^@e^@r^@ ^@x^^@l^@n^@s^@=^@"^@h^@t^@t^@p^@:^@/^@/^@w^@w^@w^@.^@f^@c^@a^@.^@g^@o^@v^@/^@F^@C^@S^@L^@o^@a^@n^@s^@"^@ ^@x^@m^@l^@n^@s^@:^@x^@s^@i^@=^@"^@h^@t^@t^@p^@:^@/^@/^@w^@w^@w^@.^@w^@3^@.^@o^@r^@g^@/^@2^@0^@0^@1^@/^@X^@M^@L^@S^@c^@h^@e^@m^@a^@-^@i^@n^@s^@t^@a^@n^@c^@e^@"^@

2014-03-31_17_V2.5.XML [readonly][noeol][converted] 2L, 18676154C


I used the iconv command to convert this file to UTF-8.
Now I can read the data in the XML file, but everything comes out as a single line.
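
For reference, the conversion step can be sketched roughly as follows. This is a minimal sketch, not the exact command that was run: it assumes the input starts with a byte order mark (as the excerpt above suggests) so that `iconv -f UTF-16` can work out the byte order, and `to_utf8` is only a hypothetical helper name.

```shell
# Convert a UTF-16 XML file to UTF-8, writing <name>.utf8 next to the input.
# to_utf8 is a hypothetical helper name used for illustration only.
# Assumes the file begins with a byte order mark, as in the excerpt above.
to_utf8() {
    iconv -f UTF-16 -t UTF-8 "$1" > "$1.utf8"
}

# Usage: to_utf8 2014-03-31_17_V2.5.XML
```

Note that the converted file still declares encoding="UTF-16" in its XML header, which may matter to tools that parse the XML rather than treat it as plain text.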

Code:
wc -l 2014-03-31_17_V2.5.XML.utf8
	1 2014-03-31_17_V2.5.XML.utf8

How can I put an end of line after each XML tag?

Once the XML tags in my data file are separated by end-of-line characters, I want to run diff on the two XML files to find the differences. Please help.
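
To make the goal concrete, here is one possible sketch of the two steps described above. It is an assumption-laden illustration rather than a settled solution: it breaks the line only where one tag closes and the next opens (`><`), it assumes whitespace between tags is not significant in these files, and the names `split_tags`, `xml_diff`, and the `.lines` suffix are hypothetical.

```shell
# Break a one-line XML file into many lines by inserting a newline
# wherever a closing '>' is immediately followed by an opening '<'.
# split_tags is a hypothetical helper name; text inside elements is untouched.
split_tags() {
    awk '{ gsub(/></, ">\n<"); print }' "$1"
}

# Split both (already UTF-8) files, then compare them line by line.
# diff exits with status 1 when the files differ, which is not an error here.
xml_diff() {
    split_tags "$1" > "$1.lines"
    split_tags "$2" > "$2.lines"
    diff "$1.lines" "$2.lines"
}

# Usage (file names are placeholders): xml_diff old.XML.utf8 new.XML.utf8
```

This treats the XML as plain text, so elements that are equal but differ in attribute order or spacing would still be reported as differences.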


Thank you.
 
