utf8_encode(3) [php man page]

UTF8_ENCODE(3)								 1							    UTF8_ENCODE(3)

utf8_encode - Encodes an ISO-8859-1 string to UTF-8

SYNOPSIS
string utf8_encode (string $data)

DESCRIPTION
This function encodes the string $data to UTF-8, and returns the encoded version.

UTF-8 is a standard mechanism used by Unicode for encoding wide character values into a byte stream. UTF-8 is transparent to plain ASCII characters, is self-synchronized (meaning it is possible for a program to figure out where in the bytestream characters start) and can be used with normal string comparison functions for sorting and such.

PHP encodes UTF-8 characters in up to four bytes, like this:

UTF-8 encoding
+-------+------+-------------------------------------+
| bytes | bits | representation                      |
+-------+------+-------------------------------------+
|   1   |   7  | 0bbbbbbb                            |
|   2   |  11  | 110bbbbb 10bbbbbb                   |
|   3   |  16  | 1110bbbb 10bbbbbb 10bbbbbb          |
|   4   |  21  | 11110bbb 10bbbbbb 10bbbbbb 10bbbbbb |
+-------+------+-------------------------------------+

Each b represents a bit that can be used to store character data.

PARAMETERS
o $data - An ISO-8859-1 string.

RETURN VALUES
Returns the UTF-8 translation of $data.

SEE ALSO
utf8_decode(3).

PHP Documentation Group							    UTF8_ENCODE(3)
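
The byte layout in the UTF-8 encoding table of utf8_encode(3) can be sketched by hand. The functions below are hypothetical helpers written for illustration, not part of PHP: codepoint_to_utf8_sketch() applies the four rows of the table to one code point, and latin1_to_utf8_sketch() treats every input byte as an ISO-8859-1 code point, which is what the description above implies. The sketch deliberately does not call utf8_encode() itself, since that function is deprecated as of PHP 8.2.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: encode one Unicode code point as UTF-8,
 * following the 1- to 4-byte rows of the table above.
 */
function codepoint_to_utf8_sketch(int $cp): string
{
    // Reject values outside Unicode and the UTF-16 surrogate range.
    if ($cp < 0 || $cp > 0x10FFFF || ($cp >= 0xD800 && $cp <= 0xDFFF)) {
        throw new InvalidArgumentException("Invalid code point: $cp");
    }
    if ($cp < 0x80) {           // 7 bits:  0bbbbbbb
        return chr($cp);
    }
    if ($cp < 0x800) {          // 11 bits: 110bbbbb 10bbbbbb
        return chr(0xC0 | ($cp >> 6))
             . chr(0x80 | ($cp & 0x3F));
    }
    if ($cp < 0x10000) {        // 16 bits: 1110bbbb 10bbbbbb 10bbbbbb
        return chr(0xE0 | ($cp >> 12))
             . chr(0x80 | (($cp >> 6) & 0x3F))
             . chr(0x80 | ($cp & 0x3F));
    }
    // 21 bits: 11110bbb 10bbbbbb 10bbbbbb 10bbbbbb
    return chr(0xF0 | ($cp >> 18))
         . chr(0x80 | (($cp >> 12) & 0x3F))
         . chr(0x80 | (($cp >> 6) & 0x3F))
         . chr(0x80 | ($cp & 0x3F));
}

/**
 * Hypothetical helper: convert an ISO-8859-1 string to UTF-8.
 * Each ISO-8859-1 byte value equals its Unicode code point,
 * so only the 1- and 2-byte rows are ever used here.
 */
function latin1_to_utf8_sketch(string $data): string
{
    $out = '';
    for ($i = 0, $n = strlen($data); $i < $n; $i++) {
        $out .= codepoint_to_utf8_sketch(ord($data[$i]));
    }
    return $out;
}

// "caf\xE9" is "café" in ISO-8859-1; 0xE9 becomes the two bytes C3 A9.
echo bin2hex(latin1_to_utf8_sketch("caf\xE9")), "\n";   // 636166c3a9
// U+20AC (euro sign) needs the 3-byte row.
echo bin2hex(codepoint_to_utf8_sketch(0x20AC)), "\n";   // e282ac
```

Because an ISO-8859-1 string never contains a code point above 0xFF, its UTF-8 translation is at most twice as long as the input.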

MB_DETECT_ORDER(3)							 1							MB_DETECT_ORDER(3)

mb_detect_order - Set/Get character encoding detection order

SYNOPSIS
mixed mb_detect_order ([mixed $encoding_list = mb_detect_order()])

DESCRIPTION
Sets the automatic character encoding detection order to $encoding_list.

PARAMETERS
o $encoding_list - An array or comma-separated list of character encodings. See supported encodings. If $encoding_list is omitted, the function returns the current character encoding detection order as an array. This setting affects mb_detect_encoding(3) and mb_send_mail(3).

mbstring currently implements the following encoding detection filters. If the input contains a byte sequence that is invalid for one of these encodings, detection fails for that encoding:

UTF-8, UTF-7, ASCII, EUC-JP, SJIS, eucJP-win, SJIS-win, JIS, ISO-2022-JP

For ISO-8859-*, mbstring always detects as ISO-8859-*.

For UTF-16, UTF-32, UCS2 and UCS4, encoding detection always fails.

RETURN VALUES
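When setting the detection order, returns TRUE on success or FALSE on failure. When $encoding_list is omitted, returns the current detection order as an array.

EXAMPLES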
Example #1 mb_detect_order(3) examples

<?php
/* Set detection order by enumerated list */
mb_detect_order("eucjp-win,sjis-win,UTF-8");

/* Set detection order by array */
$ary[] = "ASCII";
$ary[] = "JIS";
$ary[] = "EUC-JP";
mb_detect_order($ary);

/* Display current detection order */
echo implode(", ", mb_detect_order());
?>

Example #2 Example showing useless detect orders

; Always detect as ISO-8859-1
detect_order = ISO-8859-1, UTF-8

; Always detect as UTF-8, since ASCII/UTF-7 values are
; valid for UTF-8
detect_order = UTF-8, ASCII, UTF-7

SEE ALSO
mb_internal_encoding(3), mb_http_input(3), mb_http_output(3), mb_send_mail(3).

PHP Documentation Group							MB_DETECT_ORDER(3)
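
Because mb_detect_order(3) changes a process-wide setting that also affects mb_detect_encoding(3) and mb_send_mail(3), it can be useful to change the order only temporarily. The following is a minimal sketch of that pattern, assuming the mbstring extension is loaded; with_detect_order_sketch() is a hypothetical helper written for illustration, not part of mbstring. It relies only on the two documented behaviours: calling the function without arguments returns the current order as an array, and calling it with a list sets the order.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: run $fn with a temporary detection order,
 * then restore whatever order was active before.
 */
function with_detect_order_sketch(array $order, callable $fn)
{
    $previous = mb_detect_order();      // no argument: read the current order
    mb_detect_order($order);            // with argument: set a new order
    try {
        return $fn();
    } finally {
        mb_detect_order($previous);     // always restore, even if $fn throws
    }
}

// Guarded so the sketch is a no-op when mbstring is not available.
if (function_exists('mb_detect_order')) {
    $seen = with_detect_order_sketch(['ASCII', 'UTF-8'], function () {
        return mb_detect_order();       // order as seen inside the callback
    });
    echo implode(', ', $seen), "\n";
}
```

Restoring the saved order in a finally block keeps later calls to mb_detect_encoding(3) from being affected by a temporary change.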