08-18-2011
Quote:
Originally Posted by chipperuga
Interesting. When I open Notepad in Windows and click on save as, there are 4 options for encoding: ANSI, Unicode, Unicode big endian, UTF-8.
Crosspost.
What Windows calls "Unicode" is UTF-16 (little-endian); "Unicode big endian" is UTF-16 with the byte order reversed.
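To make the four Notepad options concrete, here is a rough Python sketch that encodes the same two-character string each way and prints the bytes. The mapping from Notepad labels to Python codec names is my assumption, not something Notepad documents in these terms: "ANSI" is shown as cp1252, which is only the default code page on Western European systems, and "UTF-8" is shown with a leading byte order mark, which is what older Notepad versions wrote.

```python
import codecs

text = "h\u00e9"  # 'h' followed by e-acute, so one character is non-ASCII

# Assumed mapping of Notepad's "Save As" labels to Python codecs.
# "ANSI" really means "the system code page"; cp1252 is just one example.
notepad_encodings = {
    "ANSI": text.encode("cp1252"),
    "Unicode": codecs.BOM_UTF16_LE + text.encode("utf-16-le"),
    "Unicode big endian": codecs.BOM_UTF16_BE + text.encode("utf-16-be"),
    "UTF-8": text.encode("utf-8-sig"),  # utf-8-sig prepends the BOM EF BB BF
}

for label, data in notepad_encodings.items():
    print(f"{label:20} {data.hex(' ')}")
```

The two UTF-16 variants differ only in the order of the two bytes in each 16-bit unit, including the byte order mark itself (FF FE versus FE FF).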
UTF8(5) BSD File Formats Manual UTF8(5)
NAME
utf8 -- UTF-8, a transformation format of ISO 10646
SYNOPSIS
ENCODING "UTF-8"
DESCRIPTION
The UTF-8 encoding represents UCS-4 characters as a sequence of octets, using between 1 and 6 octets for each character. It is backwards
compatible with ASCII, so 0x00-0x7f refer to the ASCII character set. The multibyte encoding of a non-ASCII character consists entirely of
bytes whose high order bit is set. The actual encoding is represented by the following table:
[0x00000000 - 0x0000007f] [00000000.0bbbbbbb] -> 0bbbbbbb
[0x00000080 - 0x000007ff] [00000bbb.bbbbbbbb] -> 110bbbbb, 10bbbbbb
[0x00000800 - 0x0000ffff] [bbbbbbbb.bbbbbbbb] ->
        1110bbbb, 10bbbbbb, 10bbbbbb
[0x00010000 - 0x001fffff] [00000000.000bbbbb.bbbbbbbb.bbbbbbbb] ->
        11110bbb, 10bbbbbb, 10bbbbbb, 10bbbbbb
[0x00200000 - 0x03ffffff] [000000bb.bbbbbbbb.bbbbbbbb.bbbbbbbb] ->
        111110bb, 10bbbbbb, 10bbbbbb, 10bbbbbb, 10bbbbbb
[0x04000000 - 0x7fffffff] [0bbbbbbb.bbbbbbbb.bbbbbbbb.bbbbbbbb] ->
        1111110b, 10bbbbbb, 10bbbbbb, 10bbbbbb, 10bbbbbb, 10bbbbbb
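The table above can be turned into a small encoder. The following Python function is an illustrative sketch, not code from any BSD library; the name utf8_encode is hypothetical. It follows the 1-to-6 octet scheme exactly as tabulated, so it accepts values up to 0x7fffffff and does not reject surrogates. Note that the later RFC 3629 limits UTF-8 to 4 octets and code points up to 0x10ffff, so the 5- and 6-octet rows are of historical interest.

```python
def utf8_encode(cp: int) -> bytes:
    """Encode one code point using the 1-to-6 octet scheme in the table."""
    if cp < 0 or cp > 0x7FFFFFFF:
        raise ValueError("code point out of range")
    if cp <= 0x7F:
        return bytes([cp])  # ASCII is stored unchanged
    # (upper limit of the row, lead-octet marker, number of continuation octets)
    rows = (
        (0x7FF, 0xC0, 1),
        (0xFFFF, 0xE0, 2),
        (0x1FFFFF, 0xF0, 3),
        (0x3FFFFFF, 0xF8, 4),
        (0x7FFFFFFF, 0xFC, 5),
    )
    for limit, lead, ncont in rows:
        if cp <= limit:
            # The lead octet carries the marker plus the topmost payload bits.
            out = [lead | (cp >> (6 * ncont))]
            # Each continuation octet is 10bbbbbb with the next 6 payload bits.
            for i in range(ncont - 1, -1, -1):
                out.append(0x80 | ((cp >> (6 * i)) & 0x3F))
            return bytes(out)
    raise AssertionError("unreachable: range was checked above")


print(utf8_encode(0x20AC).hex(" "))  # EURO SIGN, a three-octet character
```

For values inside the modern Unicode range the result should agree with Python's built-in codec, for example utf8_encode(0xE9) == "\u00e9".encode("utf-8").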
If more than a single representation of a value exists (for example, 0x00; 0xC0 0x80; 0xE0 0x80 0x80) the shortest representation is always
used. Longer ones are detected as an error as they pose a potential security risk, and destroy the 1:1 character:octet sequence mapping.
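The shortest-form rule can be checked by comparing the decoded value against the smallest value that actually needs that many octets. The sketch below is a hypothetical helper (utf8_decode_one is not a real library function) that decodes a byte string holding exactly one character and raises an error for overlong forms such as 0xC0 0x80. It assumes the 1-to-6 octet scheme described here and makes no attempt to reject surrogates.

```python
def utf8_decode_one(data: bytes) -> int:
    """Decode a sequence holding exactly one character, rejecting overlong forms."""
    if not data:
        raise ValueError("empty input")
    first = data[0]
    if first < 0x80:
        length, value, minimum = 1, first, 0
    else:
        # (mask, expected lead bits, total length, smallest value needing that length)
        for mask, lead, length, minimum in (
            (0xE0, 0xC0, 2, 0x80),
            (0xF0, 0xE0, 3, 0x800),
            (0xF8, 0xF0, 4, 0x10000),
            (0xFC, 0xF8, 5, 0x200000),
            (0xFE, 0xFC, 6, 0x4000000),
        ):
            if first & mask == lead:
                value = first & ~mask & 0xFF  # payload bits of the lead octet
                break
        else:
            raise ValueError("invalid lead octet")
    if len(data) != length:
        raise ValueError("wrong number of octets")
    for octet in data[1:]:
        if octet & 0xC0 != 0x80:
            raise ValueError("invalid continuation octet")
        value = (value << 6) | (octet & 0x3F)
    if value < minimum:
        # A shorter sequence could have represented this value.
        raise ValueError("overlong encoding")
    return value
```

With this check, b"\x00" decodes to 0, while b"\xc0\x80" and b"\xe0\x80\x80" are both rejected even though their payload bits also spell 0. Rejecting them matters because a filter that scans for a particular byte (such as NUL or "/") would otherwise miss the same character hidden in a longer form.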
SEE ALSO
euc(5)
Rob Pike and Ken Thompson, "Hello World", Proceedings of the Winter 1993 USENIX Technical Conference, USENIX Association, January 1993.
F. Yergeau, UTF-8, a transformation format of ISO 10646, January 1998, RFC 2279.
The Unicode Standard, Version 3.0, The Unicode Consortium, 2000, as amended by the Unicode Standard Annex #27: Unicode 3.1 and by the Unicode
Standard Annex #28: Unicode 3.2.
STANDARDS
The utf8 encoding is compatible with RFC 2279 and Unicode 3.2.
BSD April 7, 2004 BSD