CentOS 7.0 - man page for locale::recode (centos section 3)
Locale::Recode(3)	       User Contributed Perl Documentation		Locale::Recode(3)

NAME
       Locale::Recode - Object-Oriented Portable Charset Conversion

SYNOPSIS
	 use Locale::Recode;

	 $cd = Locale::Recode->new (from => 'UTF-8',
				    to	 => 'ISO-8859-1');

	 die $cd->getError if $cd->getError;

	 $cd->recode ($text) or die $cd->getError;

	 $mime_name = Locale::Recode->resolveAlias ('latin-1');

	 $supported = Locale::Recode->getSupported;

	 $complete = Locale::Recode->getCharsets;

DESCRIPTION
       This module provides routines that convert textual data from one codeset to another in a
       portable way.  The module was started before Encode(3) was written.  Its main
       purpose today is to provide charset conversion even when Encode(3) is not available on the
       system.  It should also work for older Perl versions without Unicode support.

       Internally Locale::Recode(3) will use Encode(3) whenever possible, to allow for a faster
       conversion and for a wider range of supported charsets, and will only fall back to the
       Perl implementation when Encode(3) is not available or does not support a particular
       charset that Locale::Recode(3) does.

       Locale::Recode(3) is part of libintl-perl, and its main purpose is actually to implement
       a portable charset conversion framework for the message translation facilities described
       in Locale::TextDomain(3).

CONSTRUCTOR
       The constructor "new()" requires two named arguments:

       from
	   The encoding of the original data.  Case doesn't matter, aliases are resolved.

       to  The target encoding.  Again, case doesn't matter, and aliases are resolved.

       The constructor will never fail.  In case of an error, the object's internal state is set
       to bad and it will refuse to do any conversions.  You can query the reason for the
       failure with the method getError().
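
       A minimal sketch of the error check described above.  The charset name
       'X-NO-SUCH-CHARSET' is made up for illustration and is assumed not to be known to the
       module or to Encode(3) on your system.

```perl
use strict;
use warnings;
use Locale::Recode;

# 'X-NO-SUCH-CHARSET' is a hypothetical name, used only to provoke an error.
my $cd = Locale::Recode->new (from => 'UTF-8',
                              to   => 'X-NO-SUCH-CHARSET');

# new() still hands back an object; the failure is reported by getError().
my $error = $cd->getError;
print "cannot set up conversion: $error\n" if $error;
```

       Checking getError() right after construction keeps a bad converter from being used
       further down in the program.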

OBJECT METHODS
       The following object methods are available.

       recode (STRING)
	   Converts STRING from the source encoding into the destination encoding.  Returns a
	   true value on success and false otherwise.  You can query the reason for a failure
	   with the method getError().

       getError
	   Returns false if the object is not in an error state, and an error message otherwise.
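
       The two object methods are typically used together.  The following sketch follows the
       calling pattern of the SYNOPSIS, in which the converted data is read back from the
       variable that was passed to recode().  The sample bytes are an assumption chosen for
       illustration.

```perl
use strict;
use warnings;
use Locale::Recode;

my $cd = Locale::Recode->new (from => 'UTF-8',
                              to   => 'ISO-8859-1');
die $cd->getError if $cd->getError;

# UTF-8 bytes for "Gr", U+00FC, U+00DF, "e" (7 bytes in total).
my $text = "Gr\xc3\xbc\xc3\x9fe";

# recode() signals success through its return value; the result is
# then found in $text itself.
$cd->recode ($text) or die $cd->getError;

printf "%d bytes after conversion\n", length $text;
```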

CLASS METHODS
       The module provides some additional class methods:

       getSupported
	   Returns a reference to a list of all supported charsets.  This may implicitly load
	   additional Encode(3) conversions like Encode::HanExtra(3), which may produce
	   considerable load on your system.

	   The method is therefore not intended for regular use, but rather for obtaining or
	   displaying a list of available encodings once.

	   The members of the list are all converted to uppercase!

       getCharsets
	   Like getSupported() but also returns all available aliases.
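
       Because getSupported() can be expensive, one reasonable pattern is to call it once and
       keep the result in a lookup table.  This is only a sketch; the hash %is_supported and
       the two names checked are choices made for this example.

```perl
use strict;
use warnings;
use Locale::Recode;

# Potentially expensive: call once and keep the result.
my $supported = Locale::Recode->getSupported;

# The names come back in uppercase, so look them up in uppercase.
my %is_supported = map { $_ => 1 } @$supported;

for my $name ('UTF-8', 'ISO-8859-1') {
    printf "%-12s %s\n", $name, $is_supported{$name} ? 'yes' : 'no';
}
```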

SUPPORTED CHARSETS
       The range of supported charsets is system-dependent.  The following somewhat special
       charsets are always available:

       UTF-8
	   UTF-8 is available independently of your Perl version.  With Perl 5.6 or better, or in
	   the presence of Encode(3), conversions are not done in Perl but with the interfaces
	   provided by these facilities, which are written in C and hence much faster.

	   Encoding data into UTF-8 is fast, even if it is done in Perl.  Decoding it in Perl may
	   become quite slow.  If you frequently have to decode UTF-8 with Locale::Recode you
	   will probably want to make sure that you do that with Perl 5.6 or better, or install
	   Encode(3) to speed things up.

       INTERNAL
	   UTF-8 is fast to write but hard for applications to read.  It is therefore not the
	   worst choice for an internal string representation, but not far from it.
	   Locale::Recode(3) stores strings internally as a reference to an array of integer
	   values, as most programming languages do (Perl is an exception), trading memory for
	   performance.

	   The integer values are the UCS-4 codes of the characters in host byte order.

	   The encoding INTERNAL is directly available via Locale::Recode(3), but of course you
	   should not really use it for data exchange unless you know what you are doing.
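
       The shape of the INTERNAL representation can be illustrated with core Perl alone.  The
       following is not the module's own code, only a sketch of "a reference to an array of
       integer code points"; it assumes a Perl with Unicode string support (5.6 or better).

```perl
use strict;
use warnings;

# A character string; "\x{20ac}" is the euro sign U+20AC.
my $string = "A\x{e4}\x{20ac}";

# One integer per character, held behind an array reference.
my $internal = [ map { ord } split //, $string ];

# Access to the n-th character is a plain array lookup.
printf "U+%04X\n", $internal->[2];

# Turning the code points back into a character string.
my $roundtrip = join '', map { chr } @$internal;
```

       Each character costs a full Perl scalar here, which is the memory-for-performance trade
       mentioned above.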

       Locale::Recode(3) has native support for a plethora of other encodings, most of them 8-bit
       encodings that are fast to decode.  These include most encodings used on popular micros, like
       the ISO-8859-* series of encodings, most Windows-* encodings (also known as CP*),
       Macintosh, Atari, etc.

NAMES AND ALIASES
       Each charset or encoding is available internally under a unique name.  Whenever the
       information was available, the preferred MIME name (see
       <http://www.iana.org/assignments/character-sets/>) was chosen as the internal name.

       Alias handling is quite strict.  The module does not make wild guesses at what you mean
       (in Encode(3), "What's the meaning of the acronym JIS" is a valid alias for "7bit-jis")
       but aims at providing common aliases only.  The same applies to so-called aliases
       that are really mistakes, like "utf8" for UTF-8.

       The module knows all aliases that are listed with the IANA character set registry
       (<http://www.iana.org/assignments/character-sets/>), plus those known to libiconv version
       1.8, and a bunch of additional ones.
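
       Alias resolution can be tried out with the class method resolveAlias() shown in the
       SYNOPSIS.  The aliases below are sample inputs chosen for this sketch; which of them are
       recognized, and what they resolve to, depends on the installed version, so no output is
       shown.

```perl
use strict;
use warnings;
use Locale::Recode;

# 'latin1' and 'L1' are listed in the IANA registry; 'utf8' is the
# kind of sloppy spelling discussed above.
for my $alias ('latin1', 'L1', 'utf8') {
    my $name = Locale::Recode->resolveAlias ($alias);
    printf "%-8s => %s\n", $alias, $name ? $name : '(not recognized)';
}
```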

CONVERSION TABLES
       The conversion tables have been taken either from official sources like the IANA or the
       Unicode Consortium, from Bruno Haible's libiconv, or from the sources of the GNU libc.
       The regression tests for libintl-perl check for conformance with these tables.  For some
       encodings this data differs from Encode(3)'s data, which would cause these tests to fail.
       In these cases, the module will not invoke the Encode(3) methods, but will fall back to
       the internal implementation for the sake of consistency.

       The few encodings that are affected are so simple that you will not experience any real
       performance penalty unless you convert large chunks of data.  But the package is not
       really intended for such use anyway, and since Encode(3) is relatively new, I rather think
       that the differences are bugs in Encode which will be fixed soon.

BUGS
       The module should provide fallback conversions for other Unicode encoding schemes like
       UCS-2, UCS-4 (big- and little-endian).

       The pure Perl UTF-8 decoder will not always handle corrupt UTF-8 correctly, especially at
       the end and at the beginning of the string.  This is not likely to be fixed, since the
       module's intention is not to be a consistency checker for UTF-8 data.

AUTHOR
       Copyright (C) 2002-2009, Guido Flohr <guido@imperia.net>, all rights reserved.  See the
       source code for details.

       This software is contributed to the Perl community by Imperia (<http://www.imperia.net/>).

SEE ALSO
       Encode(3), iconv(3), iconv(1), recode(1), perl(1)

perl v5.16.3				    2014-06-10				Locale::Recode(3)