utf2(5) [osx man page]

UTF2(5) 						      BSD File Formats Manual							   UTF2(5)

NAME
     utf2 -- Universal character set Transformation Format encoding of runes

SYNOPSIS
     ENCODING "UTF2"

DESCRIPTION
     The UTF2 encoding has been deprecated in favour of UTF-8.  New applications should not use UTF2.

     The UTF2 encoding is based on a proposed X-Open multibyte FSS-UCS-TF (File System Safe Universal Character Set Transformation Format) encoding as used in Plan 9 from Bell Labs.  Although it is capable of representing more than 16 bits, the current implementation is limited to 16 bits as defined by the Unicode Standard.

     UTF2 representation is backwards compatible with ASCII, so 0x00-0x7f refer to the ASCII character set.  The multibyte encoding of runes between 0x0080 and 0xffff consists entirely of bytes whose high order bit is set.  The actual encoding is represented by the following table:

     [0x0000 - 0x007f] [00000000.0bbbbbbb] -> 0bbbbbbb
     [0x0080 - 0x07ff] [00000bbb.bbbbbbbb] -> 110bbbbb, 10bbbbbb
     [0x0800 - 0xffff] [bbbbbbbb.bbbbbbbb] -> 1110bbbb, 10bbbbbb, 10bbbbbb

     If more than a single representation of a value exists (for example, 0x00; 0xC0 0x80; 0xE0 0x80 0x80) the shortest representation is always used (but the longer ones will be correctly decoded).

     The final three encodings provided by X-Open:

     [00000000.000bbbbb.bbbbbbbb.bbbbbbbb] -> 11110bbb, 10bbbbbb, 10bbbbbb, 10bbbbbb
     [000000bb.bbbbbbbb.bbbbbbbb.bbbbbbbb] -> 111110bb, 10bbbbbb, 10bbbbbb, 10bbbbbb, 10bbbbbb
     [0bbbbbbb.bbbbbbbb.bbbbbbbb.bbbbbbbb] -> 1111110b, 10bbbbbb, 10bbbbbb, 10bbbbbb, 10bbbbbb, 10bbbbbb

     which provide for the entire proposed ISO-10646 31 bit standard, are currently not implemented.

SEE ALSO
     mklocale(1), setlocale(3), utf8(5)

BSD				 October 11, 2002				   BSD
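
The encoding table in DESCRIPTION can be illustrated with a short sketch. This is not the BSD libc implementation; the function names `utf2_encode` and `utf2_decode` are hypothetical, and the code assumes only the three implemented rows of the table (runes up to 0xffff). The encoder always emits the shortest form, while the decoder also accepts the longer forms, as the text describes.

```python
def utf2_encode(rune):
    """Encode one rune (0x0000-0xffff) using the shortest form in the table."""
    if not 0 <= rune <= 0xFFFF:
        raise ValueError("rune outside the 16-bit range covered by the table")
    if rune <= 0x7F:
        # [0x0000 - 0x007f] -> 0bbbbbbb
        return bytes([rune])
    if rune <= 0x7FF:
        # [0x0080 - 0x07ff] -> 110bbbbb, 10bbbbbb
        return bytes([0xC0 | (rune >> 6), 0x80 | (rune & 0x3F)])
    # [0x0800 - 0xffff] -> 1110bbbb, 10bbbbbb, 10bbbbbb
    return bytes([
        0xE0 | (rune >> 12),
        0x80 | ((rune >> 6) & 0x3F),
        0x80 | (rune & 0x3F),
    ])


def utf2_decode(data):
    """Decode a byte string into a list of runes.

    Longer-than-necessary representations (such as 0xC0 0x80 for 0x00)
    are decoded rather than rejected, matching the behaviour described
    in the text.
    """
    runes = []
    i = 0
    while i < len(data):
        lead = data[i]
        if lead < 0x80:
            value, extra = lead, 0
        elif (lead & 0xE0) == 0xC0:
            value, extra = lead & 0x1F, 1
        elif (lead & 0xF0) == 0xE0:
            value, extra = lead & 0x0F, 2
        else:
            # Stray continuation byte, or one of the unimplemented
            # four- to six-byte X-Open forms.
            raise ValueError("unsupported lead byte 0x%02x" % lead)
        tail = data[i + 1:i + 1 + extra]
        if len(tail) != extra or any((b & 0xC0) != 0x80 for b in tail):
            raise ValueError("truncated or malformed sequence at offset %d" % i)
        for b in tail:
            # Each continuation byte contributes its low six bits.
            value = (value << 6) | (b & 0x3F)
        runes.append(value)
        i += 1 + extra
    return runes
```

For example, `utf2_encode(0x20AC)` yields the three bytes 0xE2 0x82 0xAC, and `utf2_decode` maps each of 0x00, 0xC0 0x80, and 0xE0 0x80 0x80 back to the rune 0x0000.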

