tex::encode(3pm) [debian man page]

TeX::Encode(3pm)					User Contributed Perl Documentation					  TeX::Encode(3pm)

NAME
TeX::Encode - Encode/decode Perl utf-8 strings into TeX

SYNOPSIS
  use TeX::Encode;
  use Encode;

  $tex = encode('latex', "This will encode an e-acute (".chr(0xe9).") as \\'e");
  $str = decode('latex', $tex); # Will decode the \'e too!

DESCRIPTION
This module provides encoding to LaTeX escapes from utf8 using mapping tables in Pod::LaTeX and HTML::Entities. This covers only a subset of the Unicode character table (undef warnings will occur for non-mapped chars). This module is intentionally vague about what it will handle; see CAVEATS below.

Mileage will vary when decoding (converting TeX to utf8), as TeX is in essence a programming language, and this module does not implement TeX.

I use this module to encode author names in BibTeX and to do a rough job at presenting LaTeX abstracts in HTML. Using decode, rather than seeing $\sqrt{\Omega^2\zeta_n}$, you get something that looks like the formula.

The next logical step for this module is to integrate some level of TeX grammar to improve the decoding, in particular to handle fractions and font changes (which should probably be dropped).

METHODS
TeX::Encode::encode STRING [, CHECK]
    Encodes a utf8 string into TeX. CHECK isn't implemented.

TeX::Encode::decode STRING [, CHECK]
    Decodes a TeX string into utf8. CHECK isn't implemented.

TeX::Encode::perlio_ok
    Returns 0. PerlIO isn't implemented.

CAVEATS
Proper Encode checking is not implemented.

LaTeX comments (% ...) are not stripped, because chopping a lot of text may not be what you actually want.

encode()
    Converts non-ASCII Unicode characters to their equivalent TeX symbols (unTeXable characters will result in undef warnings).

decode()
    Attempts to convert TeX symbols (e.g. \ae) to Unicode characters. As an experimental feature this also handles Math-mode TeX by inserting HTML into the resulting string (so you end up with an HTML approximation of the maths - NOT MathML).

SEE ALSO
Encode::Encoding, Pod::LaTeX, Encode

AUTHOR
Timothy D Brody, <tdb01r@ecs.soton.ac.uk>

COPYRIGHT AND LICENSE
Copyright (C) 2005-2007 by Timothy D Brody

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.8.7 or, at your option, any later version of Perl 5 you may have available.

perl v5.12.4 2011-09-21 TeX::Encode(3pm)
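
The BibTeX use case mentioned in the DESCRIPTION might look roughly like the sketch below. It is an illustration only, not the author's code: the helper name bibtex_author and the sample name are hypothetical. TeX::Encode is not a core module, so the sketch loads it defensively and prints a message when it is missing. The exact escapes produced depend on the installed version, so no output is shown.

```perl
use strict;
use warnings;
use Encode qw(encode);

# TeX::Encode is not a core module, so load it defensively.
# Loading it is what registers the 'latex' encoding with Encode.
my $have_tex = eval { require TeX::Encode; 1 };

# Hypothetical helper: build a BibTeX author field from a Perl
# character string. Returns undef when TeX::Encode is unavailable.
sub bibtex_author {
    my ($name) = @_;
    return unless $have_tex;
    my $tex = encode('latex', $name);
    return "author = {" . $tex . "},";
}

# Sample name containing an e-acute (U+00E9).
my $line = bibtex_author("Ren" . chr(0xe9) . " Descartes");

print defined $line
    ? "$line\n"
    : "TeX::Encode is not installed\n";
```

Because CHECK is not implemented, the sketch does not pass a third argument to encode(). Characters without a mapping are expected to produce warnings rather than errors, as described under CAVEATS.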

Encode::JP(3pm) 					 Perl Programmers Reference Guide					   Encode::JP(3pm)

NAME
Encode::JP - Japanese Encodings

SYNOPSIS
  use Encode qw/encode decode/;
  $euc_jp = encode("euc-jp", $utf8);   # loads Encode::JP implicitly
  $utf8   = decode("euc-jp", $euc_jp); # ditto

ABSTRACT
This module implements Japanese charset encodings. Encodings supported are as follows.

  Canonical      Alias              Description
  --------------------------------------------------------------------
  euc-jp         /euc.*jp$/i        EUC (Extended Unix Code)
                 /jp.*euc/i
                 /ujis$/i
  shiftjis       /shift.*jis$/i     Shift JIS (aka MS Kanji)
                 /sjis$/i
  7bit-jis       /jis$/i            7bit JIS
  iso-2022-jp                       ISO-2022-JP [RFC1468]
                                    = 7bit JIS with all Halfwidth Kana
                                      converted to Fullwidth
  iso-2022-jp-1                     ISO-2022-JP-1 [RFC2237]
                                    = ISO-2022-JP with JIS X 0212-1990
                                      support. See below
  MacJapanese                       Shift JIS + Apple vendor mappings
  cp932          /windows-31j$/i    Code Page 932
                                    = Shift JIS + MS/IBM vendor mappings
  jis0201-raw                       JIS0201, raw format
  jis0208-raw                       JIS0208, raw format
  jis0212-raw                       JIS0212, raw format
  --------------------------------------------------------------------

DESCRIPTION
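
The canonical names and aliases in the table above can be checked with Encode::find_encoding, which is part of the core Encode distribution. The sketch below resolves a few aliases and round-trips a short sample string. The sample text and variable names are illustrative and do not come from the original page.

```perl
use strict;
use warnings;
use Encode qw(encode decode find_encoding);

# Resolve a few aliases from the table to their canonical names.
# find_encoding() returns an encoding object, or undef if unknown.
for my $alias (qw(ujis sjis windows-31j)) {
    my $enc = find_encoding($alias);
    printf "%-12s => %s\n", $alias,
        defined $enc ? $enc->name : '(not found)';
}

# Sample character string: three kanji, U+65E5 U+672C U+8A9E.
my $text = "\x{65E5}\x{672C}\x{8A9E}";

# Round trip the sample through two of the listed encodings.
for my $name (qw(euc-jp shiftjis)) {
    my $bytes = encode($name, $text);
    my $back  = decode($name, $bytes);
    printf "%-8s %d bytes, round trip %s\n",
        $name, length($bytes), $back eq $text ? 'ok' : 'FAILED';
}
```

Encode::JP is never loaded explicitly. As the SYNOPSIS notes, Encode loads it on demand the first time one of these names is requested.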
To find out how to use this module in detail, see Encode.

Note on ISO-2022-JP(-1)?
ISO-2022-JP-1 (RFC2237) is a superset of ISO-2022-JP (RFC1468) which adds support for JIS X 0212-1990. That means you can use the same code to decode to utf8 but not vice versa.

  $utf8 = decode('iso-2022-jp-1', $stream);

and

  $utf8 = decode('iso-2022-jp',   $stream);

yield the same result but

  $with_0212    = encode('iso-2022-jp-1', $utf8);

is now different from

  $without_0212 = encode('iso-2022-jp',   $utf8 );

In the latter case, characters that map to 0212 are first converted to U+3013 (0xA2AE in EUC-JP; a white square also known as 'Tofu' or 'geta mark') then fed to the decoding engine. U+FFFD is not used, in order to preserve text layout as much as possible.

BUGS
The ASCII region (0x00-0x7f) is preserved for all encodings, even though this conflicts with mappings by the Unicode Consortium.

SEE ALSO
Encode

perl v5.18.2 2013-11-04 Encode::JP(3pm)
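
The ISO-2022-JP note and the BUGS entry above can both be checked with a short script using only the core Encode module. This is a sketch under stated assumptions: U+4E02 is assumed to be a character found in JIS X 0212 but not in JIS X 0208, and the helper hex_dump is hypothetical. The script only reports whether the two encoders differ. It does not assert which replacement character the plain iso-2022-jp encoder emits, since that may vary between Encode versions.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Hypothetical helper: render a byte string as space-separated hex.
sub hex_dump {
    my ($bytes) = @_;
    return join ' ', map { sprintf '%02X', ord($_) } split //, $bytes;
}

# Part 1: ISO-2022-JP versus ISO-2022-JP-1.
# Assumption: U+4E02 is in JIS X 0212 but not in JIS X 0208.
my $utf8 = "\x{4E02}";

my $with_0212    = encode('iso-2022-jp-1', $utf8);
my $without_0212 = encode('iso-2022-jp',   $utf8);

print "iso-2022-jp-1: ", hex_dump($with_0212),    "\n";
print "iso-2022-jp:   ", hex_dump($without_0212), "\n";
print "encoders differ: ",
    ($with_0212 ne $without_0212 ? 'yes' : 'no'), "\n";

# A stream holding only JIS X 0208 characters should decode
# identically under either name.
my $stream = encode('iso-2022-jp', "\x{65E5}\x{672C}");
my $same   = decode('iso-2022-jp-1', $stream)
          eq decode('iso-2022-jp',   $stream);
print "decoders agree:  ", ($same ? 'yes' : 'no'), "\n";

# Part 2: the ASCII region stays ASCII.
# Bytes 0x5C and 0x7E should decode to backslash and tilde, even
# though the Unicode Consortium's Shift JIS table maps them to
# U+00A5 (yen sign) and U+203E (overline).
for my $name (qw(euc-jp shiftjis cp932)) {
    my $chars = decode($name, "\x5C\x7E");
    printf "%-8s 0x5C 0x7E -> U+%04X U+%04X\n",
        $name, map { ord($_) } split //, $chars;
}
```

If the assumption about U+4E02 does not hold on a given installation, the "encoders differ" line will say "no". Any other character from JIS X 0212 can be substituted.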