03-30-2011
Sybase says en_GB is like Sybase iso_1:
New Functionality in Adaptive Server Enterprise Version 12.5.x
and similar to ISO 8859-1, often called Latin-1: a single-byte code page like ASCII, but with the upper 128 positions given over to Western European support, A-umlaut (Ä), o-diaeresis (ö) and such.
UTF-8 is a variable-width encoding of Unicode in which each ASCII character is encoded as the same single byte.
So, until the high bit comes on, you are pretty good, perhaps give or take a few symbol swaps such as # versus the pound sterling sign (£).
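To see the high-bit point concretely, you can encode a few characters both ways and compare the bytes. This is only an illustrative sketch in Python, not anything from the Sybase documentation; the helper name byte_widths is made up for the example.

```python
def byte_widths(text):
    """Return (char, ISO 8859-1 bytes, UTF-8 bytes) for each character."""
    rows = []
    for ch in text:
        latin1 = ch.encode("iso-8859-1")  # always one byte per character
        utf8 = ch.encode("utf-8")         # one byte for ASCII, more otherwise
        rows.append((ch, latin1, utf8))
    return rows


if __name__ == "__main__":
    # A and # are ASCII; U+00C4, U+00F6 and U+00A3 have the high bit set in Latin-1.
    for ch, latin1, utf8 in byte_widths("A#\u00c4\u00f6\u00a3"):
        verdict = "same" if latin1 == utf8 else "differs"
        print(f"U+{ord(ch):04X} latin-1={latin1.hex()} utf-8={utf8.hex()} {verdict}")
```

The ASCII characters come out byte-for-byte identical in both encodings, while each high-bit character is one byte in ISO 8859-1 and two bytes in UTF-8.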
10 More Discussions You Might Find Interesting
1. Shell Programming and Scripting
Hi,
I need to split a string, using awk, cut, or other basic unix commands (no programming), with a multibyte character as the delimiter.
Ex:
abcd-efgh-ijkl
split by -efgh- to get two segments abcd & ijkl
Is it possible?
Thanks
A.H.S (1 Reply)
Discussion started by: azmathshaikh
1 Replies
2. Shell Programming and Scripting
Hello,
Is there any UNIX utility/command/executable that will convert multibyte characters to standard single-byte ASCII characters in a given file?
and
Is there any UNIX utility/command/executable that will recognize multibyte characters in a given file name?
The typical multibyte... (8 Replies)
Discussion started by: jerardfjay
8 Replies
3. Shell Programming and Scripting
Hi! I'm trying to separate text into sentences, like this:
$pattern = "/(|]|,)**/";
preg_match_all($pattern, $text, $matches);
This works fine unless the text contains multibyte characters, like "åäö". How can I make this work with these exotic characters? (2 Replies)
Discussion started by: Ilja
2 Replies
4. Shell Programming and Scripting
Hi! I'm trying to separate text into sentences, like this:
$pattern = "/(|]|,)**/";
preg_match_all($pattern, $text, $matches);
This works fine unless the text contains multibyte characters, like "åäö". How can I make this work with these exotic characters?
An example phrase that doesn't match:... (1 Reply)
Discussion started by: Ilja
1 Replies
5. Shell Programming and Scripting
Hi,
I have a question about sftp.
I am transferring a file from server1 to server2 using sftp.
The file size shows as different for file1 and file2 after sftp, even though it shows the same number of bytes transferred. I don't understand the problem.
For example:
I have file1 having size... (3 Replies)
Discussion started by: siba.s.nayak
3 Replies
6. Programming
Compiling xpp (The X Printing Panel) on SL6 (RHEL6 essentially):
xpp.cxx: In constructor ‘printFiles::printFiles(int, char**, int&)’:
xpp.cxx:200: error: invalid conversion from ‘const char*’ to ‘char*’
The same error with all c++ constructors - gcc 4.4.4.
If anyone can throw any light on... (8 Replies)
Discussion started by: GSO
8 Replies
7. Shell Programming and Scripting
Hello all
Can anyone help me to solve the below issue
I want to take space-separated user input. The number of inputs can vary,
e.g. if the user inputs
1 2 3 4
the output will be stored in an array a where i=4 and I can retrieve the value like a =3
Any thoughts on how to do it?
... (2 Replies)
Discussion started by: Pratik4891
2 Replies
8. Shell Programming and Scripting
Hello,
I have two directories, prev and current. They both have the same multiple subdirectories and files. The current directory can have some updated files and some new files that are not in prev. I want to find the list of file names that differ. I am doing this because I can not... (2 Replies)
Discussion started by: jakSun8
2 Replies
9. Shell Programming and Scripting
I am using Solaris8, userA's shell '/usr/ace/prog/sdshell', AppuserB's shell '/bin/ksh'.
serverT:/home/userA>LC_CTYPE=iso_8859_1; export LC_CTYPE; vtemp='userA variable'; export vtemp
serverT:/home/userA>echo "LC_CTYPE=$LC_CTYPE, vtemp=$vtemp";
LC_CTYPE=iso_8859_1, vtemp=userA... (4 Replies)
Discussion started by: kchinnam
4 Replies
10. Shell Programming and Scripting
Hi
I have a requirement to insert a dot "." after a position in each line, say 110th position.
For which, I have written the below command.
cat filename | sed 's/./&\./110' > new_filename
The code is working fine, but when we have multi byte (2 or 3) characters in the input file, the... (3 Replies)
Discussion started by: tostay2003
3 Replies
utf8_encode
UTF8_ENCODE(3) 1 UTF8_ENCODE(3)
utf8_encode - Encodes an ISO-8859-1 string to UTF-8
SYNOPSIS
string utf8_encode (string $data)
DESCRIPTION
This function encodes the string $data to UTF-8, and returns the encoded version. UTF-8 is a standard mechanism used by Unicode for
encoding wide character values into a byte stream. UTF-8 is transparent to plain ASCII characters, is self-synchronized (meaning it is
possible for a program to figure out where in the bytestream characters start) and can be used with normal string comparison functions for
sorting and such. PHP encodes UTF-8 characters in up to four bytes, like this:
UTF-8 encoding
+------+------+-------------------------------------+
|bytes | bits | representation                      |
+------+------+-------------------------------------+
|  1   |   7  | 0bbbbbbb                            |
|  2   |  11  | 110bbbbb 10bbbbbb                   |
|  3   |  16  | 1110bbbb 10bbbbbb 10bbbbbb          |
|  4   |  21  | 11110bbb 10bbbbbb 10bbbbbb 10bbbbbb |
+------+------+-------------------------------------+
Each b represents a bit that can be used to store character data.
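The bit layouts in the table can be sketched as a small encoder. This is a minimal Python illustration of the layout, not PHP's actual implementation; the function name utf8_encode_code_point is hypothetical, and rejecting surrogates is an assumption that matches standard UTF-8 rules.

```python
def utf8_encode_code_point(cp):
    """Encode one Unicode code point using the bit layouts from the table."""
    if cp < 0 or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError(f"not an encodable code point: {cp:#x}")
    if cp < 0x80:        # up to 7 bits:  0bbbbbbb
        return bytes([cp])
    if cp < 0x800:       # up to 11 bits: 110bbbbb 10bbbbbb
        return bytes([0xC0 | (cp >> 6),
                      0x80 | (cp & 0x3F)])
    if cp < 0x10000:     # up to 16 bits: 1110bbbb 10bbbbbb 10bbbbbb
        return bytes([0xE0 | (cp >> 12),
                      0x80 | ((cp >> 6) & 0x3F),
                      0x80 | (cp & 0x3F)])
    # up to 21 bits: 11110bbb 10bbbbbb 10bbbbbb 10bbbbbb
    return bytes([0xF0 | (cp >> 18),
                  0x80 | ((cp >> 12) & 0x3F),
                  0x80 | ((cp >> 6) & 0x3F),
                  0x80 | (cp & 0x3F)])
```

Each branch picks the shortest form that can hold the value, then fills the b positions from the most significant bits down, six bits per continuation byte.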
PARAMETERS
o $data
- An ISO-8859-1 string.
RETURN VALUES
Returns the UTF-8 translation of $data.
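For comparison, the same ISO-8859-1 to UTF-8 translation can be sketched outside PHP. The Python below is an analogue under stated assumptions, not the PHP implementation; the name latin1_to_utf8 is made up for the example.

```python
def latin1_to_utf8(data: bytes) -> bytes:
    """Reinterpret ISO-8859-1 bytes as text and re-encode them as UTF-8."""
    # Every byte value 0x00-0xFF maps to a code point in ISO-8859-1,
    # so the decode step cannot fail on any input.
    return data.decode("iso-8859-1").encode("utf-8")


if __name__ == "__main__":
    raw = b"\xe5\xe4\xf6"            # three Latin-1 bytes with the high bit set
    converted = latin1_to_utf8(raw)
    print(len(raw), len(converted))  # each high-bit byte becomes two UTF-8 bytes
```

Input that is already UTF-8 gets double-encoded by a conversion like this, so it only makes sense when the source really is ISO-8859-1.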
SEE ALSO
utf8_decode(3).
PHP Documentation Group UTF8_ENCODE(3)