Sponsored Content
Top Forums Shell Programming and Scripting Parse XML file encoded in ISO-8859-1 Post 302160645 by madhavim on Tuesday 22nd of January 2008 08:36:40 AM
Old 01-22-2008
Question Parse XML file encoded in ISO-8859-1

Dear Friends,

I have an XML file that's encoded in ISO-8859-1. I have some European characters coming in from 2 fields (Name, Comments) in the XML file. Can anyone suggest if there are any functions in Unix to read those characters? Using shell programming, can I parse this xml file?

Please suggest. I have not done this before.

Regards,
Madhavi.
 

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

How to parse a XML file using PERL and XML::DOm

I need to know the way. I have got parsing down some nodes. But I was unable to get the child node perfectly. If you have code please send it. It will be very useful for me. (0 Replies)
Discussion started by: girigopal
0 Replies

2. Shell Programming and Scripting

Parse XML file

Hi, I need to parse the following XML data enclosed in <a> </a> XML tag using shell script. <X> ..... </X> <a> <b> <c>data1</c> <c>data2</c> </b> <d> <c>data3</c> </d> </a> <XX> ... </XX> (5 Replies)
Discussion started by: viki
5 Replies

3. Shell Programming and Scripting

How can I parse xml file?

How can I parse file containing xml ? I am sure that its best to use perl - but my perl is not very good - can someone help? Example below contents of file containing the xml - I basically want to parse the file and have each field contained in a variable.. ie. I want to store the account... (14 Replies)
Discussion started by: frustrated1
14 Replies

4. Emergency UNIX and Linux Support

How to parse the following xml file

Hi, I have the following file Example.xml <?xml version="1.0" encoding="iso-8859-1"?> <html><set label="09/07/29" value="1241.90"/> </html> Can any one help me in parsing this xml file I want to retrive the attribute values of the tag set Example I want to... (3 Replies)
Discussion started by: Raji_gadam
3 Replies

5. Shell Programming and Scripting

parse xml file

Hello all, Given the following extract from a xml file with multiple <JOB> .... </JOB> entries <JOB APPLICATION="APP" APR="0" AUG="0" AUTHOR="AUT" AUTOARCH="0" CMDLINE="/tmp/test1 %%var" CONFIRM="1" CREATION_DATE="20100430" CREATION_TIME="130739" ... (2 Replies)
Discussion started by: cabrao
2 Replies

6. Programming

Parse XML file

How do I get the field info for tags ID, NAME, DESCRIPTION. Below is my current code put I can't get beyond the first_child of the file. use strict; use warnings; use XML::Simplehttp://images.intellitxt.com/ast/adTypes/icon1.png; use... (1 Reply)
Discussion started by: leemalloy
1 Replies

7. UNIX for Dummies Questions & Answers

Parse xml file

HI Guys, Input .XML <xn:MeContext id="L0307"> <xn:ManagedElement id="1"> <xn:VsDataContainer id="1"> <xn:attributes> <xn:vsDataType>vsDataENodeBFunction</xn:vsDataType> ... (3 Replies)
Discussion started by: pareshkp
3 Replies

8. Shell Programming and Scripting

Parse XML File.

HI Guys I have Below XML File : <xn:SubNetwork id="XYZ"> <xn:SubNetwork id="C01"> <xn:MeContext id="CO1"> <xn:ManagedElement id="1"> <un:RncFunction id="1"> <un:UtranCell id="NY431"> ... (2 Replies)
Discussion started by: pareshkp
2 Replies

9. Shell Programming and Scripting

Parse xml file

I am trying to create a shell script that will parse an xml file (file attached). awk '/Id v=/ { print }' Test.xml | sed 's!<Id v=\"\(.*\)\"/>!\1!' > output.txt An output.txt file is created but it is empty. It should contain the value 222159 in it. Thanks. (7 Replies)
Discussion started by: cmccabe
7 Replies

10. Shell Programming and Scripting

Create .nfo file in ISO-8859-1 or UTF-8

Hey guys, I have a little problem, Let's say I create this script : #!/bin/sh nfo_file="/home/admin/info.nfo" echo "▒▒█ Hello █▒▒" > $nfo_fileIt seems to be okay : cat /home/admin/info.nfo ▒▒█ Hello █▒▒file -bi /home/admin/info.nfo text/plain; charset=utf-8But when I open it in a... (7 Replies)
Discussion started by: antoinelomb
7 Replies
XML::UM(3)						User Contributed Perl Documentation						XML::UM(3)

NAME
XML::UM - Convert UTF-8 strings to any encoding supported by XML::Encoding SYNOPSIS
use XML::UM; # Set directory with .xml files that comes with XML::Encoding distribution # Always include the trailing slash! $XML::UM::ENCDIR = '/home1/enno/perlModules/XML-Encoding-1.01/maps/'; # Create the encoding routine my $encode = XML::UM::get_encode ( Encoding => 'ISO-8859-2', EncodeUnmapped => &XML::UM::encode_unmapped_dec); # Convert a string from UTF-8 to the specified Encoding my $encoded_str = $encode->($utf8_str); # Remove circular references for garbage collection XML::UM::dispose_encoding ('ISO-8859-2'); DESCRIPTION
This module provides methods to convert UTF-8 strings to any XML encoding that XML::Encoding supports. It creates mapping routines from the .xml files that can be found in the maps/ directory in the XML::Encoding distribution. Note that the XML::Encoding distribution does install the .enc files in your perl directory, but not the.xml files they were created from. That's why you have to specify $ENCDIR as in the SYNOPSIS. This implementation uses the XML::Encoding class to parse the .xml file and creates a hash that maps UTF-8 characters (each consisting of up to 4 bytes) to their equivalent byte sequence in the specified encoding. Note that large mappings may consume a lot of memory! Future implementations may parse the .enc files directly, or do the conversions entirely in XS (i.e. C code.) get_encode (Encoding => STRING, EncodeUnmapped => SUB) The central entry point to this module is the XML::UM::get_encode() method. It forwards the call to the global $XML::UM::FACTORY, which is defined as an instance of XML::UM::SlowMapperFactory by default. Override this variable to plug in your own mapper factory. The XML::UM::SlowMapperFactory creates an instance of XML::UM::SlowMapper (and caches it for subsequent use) that reads in the .xml encod- ing file and creates a hash that maps UTF-8 characters to encoded characters. The get_encode() method of XML::UM::SlowMapper is called, finally, which generates an anonimous subroutine that uses the hash to convert multi-character UTF-8 blocks to the proper encoding. dispose_encoding ($encoding_name) Call this to free the memory used by the SlowMapper for a specific encoding. Note that in order to free the big conversion hash, the user should no longer have references to the subroutines generated by get_encode(). The parameters to the get_encode() method (defined as name/value pairs) are: o Encoding The name of the desired encoding, e.g. 'ISO-8859-2' o EncodeUnmapped (Default: &XML::UM::encode_unmapped_dec) Defines how Unicode characters not found in the mapping file (of the specified encoding) are printed. By default, they are converted to decimal entity references, like '&#123;' Use &XML::UM::encode_unmapped_hex for hexadecimal constants, like '&#xAB;' CAVEATS
I'm not exactly sure about which Unicode characters in the range (0 .. 127) should be mapped to themselves. See comments in XML/UM.pm near %DEFAULT_ASCII_MAPPINGS. The encodings that expat supports by default are currently not supported, (e.g. UTF-16, ISO-8859-1), because there are no .enc files avail- able for these encodings. This module needs some more work. If you have the time, please help! AUTHOR
Send bug reports, hints, tips, suggestions to Enno Derksen at <enno@att.com>. perl v5.8.0 2000-02-17 XML::UM(3)
All times are GMT -4. The time now is 09:27 PM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy