utf8_encode(3) [php man page]

UTF8_ENCODE(3)								 1							    UTF8_ENCODE(3)

utf8_encode - Encodes an ISO-8859-1 string to UTF-8

SYNOPSIS
string utf8_encode (string $data)

DESCRIPTION
This function encodes the string $data to UTF-8, and returns the encoded version. UTF-8 is a standard mechanism used by Unicode for encoding wide character values into a byte stream. UTF-8 is transparent to plain ASCII characters, is self-synchronized (meaning it is possible for a program to figure out where in the bytestream characters start) and can be used with normal string comparison functions for sorting and such. PHP encodes UTF-8 characters in up to four bytes, like this:

UTF-8 encoding
+------+------+-------------------------------------+
|bytes | bits | representation                      |
+------+------+-------------------------------------+
|  1   |   7  | 0bbbbbbb                            |
|  2   |  11  | 110bbbbb 10bbbbbb                   |
|  3   |  16  | 1110bbbb 10bbbbbb 10bbbbbb          |
|  4   |  21  | 11110bbb 10bbbbbb 10bbbbbb 10bbbbbb |
+------+------+-------------------------------------+

Each b represents a bit that can be used to store character data.

PARAMETERS
o $data - An ISO-8859-1 string.

RETURN VALUES
Returns the UTF-8 translation of $data.

SEE ALSO
utf8_decode(3).

PHP Documentation Group UTF8_ENCODE(3)
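
The bit layout described above can be sketched in a few lines. The following is a minimal illustration in Python, not PHP's actual implementation; the helper names latin1_to_utf8 and char_starts are hypothetical. It assumes only what the page states plus the fact that every ISO-8859-1 byte value equals its Unicode code point (0 to 255), so only the 1-byte and 2-byte forms are ever needed. The second helper shows the self-synchronizing property: a byte of the form 10bbbbbb is always a continuation byte, so any other byte marks the start of a character.

```python
def latin1_to_utf8(data):
    """Encode ISO-8859-1 bytes as UTF-8 (hypothetical helper, illustration only)."""
    out = bytearray()
    for byte in data:
        if byte < 0x80:
            # 1-byte form 0bbbbbbb: plain ASCII passes through unchanged.
            out.append(byte)
        else:
            # 2-byte form 110bbbbb 10bbbbbb: the top 2 bits of the value go
            # into the lead byte, the low 6 bits into the continuation byte.
            out.append(0xC0 | (byte >> 6))
            out.append(0x80 | (byte & 0x3F))
    return bytes(out)


def char_starts(utf8):
    """Return the offsets at which characters begin in a UTF-8 byte string.

    Continuation bytes always match 10bbbbbb, so every other byte is a
    character start. This is what makes UTF-8 self-synchronizing.
    """
    return [i for i, byte in enumerate(utf8) if (byte & 0xC0) != 0x80]


if __name__ == "__main__":
    encoded = latin1_to_utf8(b"caf\xe9")  # "café" in ISO-8859-1
    print(encoded)               # b'caf\xc3\xa9'
    print(char_starts(encoded))  # [0, 1, 2, 3]
```

Because the input is limited to code points below 256, the 3-byte and 4-byte rows of the table never occur in the output of an ISO-8859-1 conversion; they apply only to other Unicode characters.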

Check Out this Related Man Page

SQLITE_LIBENCODING(3)													     SQLITE_LIBENCODING(3)

sqlite_libencoding - Returns the encoding of the linked SQLite library

SYNOPSIS
string sqlite_libencoding (void)

DESCRIPTION
The SQLite library may be compiled in either ISO-8859-1 or UTF-8 compatible modes. This function allows you to determine which encoding scheme is used by your version of the library.

Warning
The default PHP distribution builds libsqlite in ISO-8859-1 encoding mode. However, this is a misnomer; rather than handling ISO-8859-1, it operates according to your current locale settings for string comparisons and sort ordering. So, rather than ISO-8859-1, you should think of it as being '8-bit' instead.

When compiled with UTF-8 support, sqlite handles encoding and decoding of UTF-8 multi-byte character sequences, but does not yet do a complete job when working with the data (no normalization is performed for example), and some comparison operations may still not be carried out correctly.

Warning
It is not recommended that you use PHP in a web-server configuration with a version of the SQLite library compiled with UTF-8 support, since libsqlite will abort the process if it detects a problem with the UTF-8 encoding.

RETURN VALUES
Returns the library encoding.

SEE ALSO
sqlite_lib_version(3).

PHP Documentation Group SQLITE_LIBENCODING(3)
