07-25-2016
Btw: I tried this myself over the past weekend and ran into the UTF-8 multibyte problem.
10 More Discussions You Might Find Interesting
1. UNIX for Advanced & Expert Users
I would like to arrange/sort filenames ending with a suffix like ".00XXXX", where X is a digit. However, the order of arrangement is given in a text file and is haphazard, e.g. the text file may look like
002345
009807
001145
I wanted to avoid doing this using sql and exporting the text file back to... (4 Replies)
Discussion started by: samudimu
4 Replies
2. Shell Programming and Scripting
I have a file with log entries... I want to sort it so that the last line in the file is first and the first line is last.
eg.
Sample file
1
h
a
f
8
6
After sort should look like
6
8
f
a
h
1 (11 Replies)
Discussion started by: frustrated1
11 Replies
3. UNIX for Dummies Questions & Answers
How do you sort text in order using sed?
For example
01 B D A C
to 01 ABCD (3 Replies)
Discussion started by: evoGage
3 Replies
4. Shell Programming and Scripting
Hi everyone. I have an html file with lines like so:
link href="localFolder/...">
link href="htp://...">
img src="localFolder/...">
img src="htp://...">
I want to remove the links with http in their href and the imgs with http in their src. I'm having trouble removing them because there... (4 Replies)
Discussion started by: CowCow339
4 Replies
5. Shell Programming and Scripting
Hi Guys,
I have data like this
HOS05 23/12/2008 10AM
HOS06 15/12/2008 2PM
HOS62 29/12/2008 10AM
HOS64 23/12/2008 2PM
HOS70 26/12/2008 10AM
ZFT01 06/12/2008 10AM
HOS73 11/12/2008 2PM
MHOS0 05/12/2008 10AM
MHOS0 20/12/2008 2PM
MHOS0 27/12/2010 2PM
MHOS0 11/12/2008 10AM
MHOS0 30/12/2009... (1 Reply)
Discussion started by: ckarunprakash
1 Reply
6. Shell Programming and Scripting
Hello,
I have a large database of words and would like them sorted in reverse order i.e. from the end up.
An example will make this clear:
I have tried to write a program in Perl which basically takes the string from the end and tries to sort from that end but it does not seem... (5 Replies)
Discussion started by: gimley
5 Replies
7. UNIX for Dummies Questions & Answers
Hi,
I am trying to sort the following file in descending order of its fourth column.
2 1 363828 -2.423225e-03
3 1 363828 4.132763e-03
3 2 363828 8.150133e-03
4 1 363828 4.126890e-03
I use
sort -k4,4g -r input.txt > output.txt ... (1 Reply)
Discussion started by: evelibertine
1 Reply
8. Web Development
Hi Friends,
I have an HTML file with 10 columns.
I found a script online that can sort any single column in an HTML file.
But, I would like to sort on multiple columns at once.
Could you please show some pointers?
Thanks (6 Replies)
Discussion started by: jacobs.smith
6 Replies
9. UNIX for Dummies Questions & Answers
Hello all,
I have a list of file names in a text document where each file name consists of 4 letters and 3 numbers (for example MACR119). There are 48 file names in the document (they are not in alphabetical or numerical order). I would like to reorder the list of names so that the 48th name is... (3 Replies)
Discussion started by: MDeBiasse
3 Replies
10. Shell Programming and Scripting
Output from zipdiff GNU EAR comparison tool produces output in html divided into three sections "Added, Removed, Changed". I want the output to be sorted by jar or war file.
<html>
<body>
<table>
<tr>
<td class="diffs" colspan="2">Added </td>
</tr>
<tr><td>
<ul>... (5 Replies)
Discussion started by: kchinnam
5 Replies
UTF(6) Games Manual UTF(6)
NAME
UTF, Unicode, ASCII, rune - character set and format
DESCRIPTION
The Plan 9 character set and representation are based on the Unicode Standard and on the ISO multibyte UTF-8 encoding (Universal Character
Set Transformation Format, 8 bits wide). The Unicode Standard represents its characters in 16 bits; UTF-8 represents such values in an
8-bit byte stream. Throughout this manual, UTF-8 is shortened to UTF.
In Plan 9, a rune is a 16-bit quantity representing a Unicode character. Internally, programs may store characters as runes. However, any
external manifestation of textual information, in files or at the interface between programs, uses a machine-independent, byte-stream
encoding called UTF.
UTF is designed so that the 7-bit ASCII set (values hexadecimal 00 to 7F) appears only as itself in the encoding. Runes with values above
7F appear as sequences of two or more bytes with values only from 80 to FF.
The UTF encoding of the Unicode Standard is backward compatible with ASCII: programs presented only with ASCII work on Plan 9 even if not
written to deal with UTF, as do programs that deal with uninterpreted byte streams. However, programs that perform semantic processing on
ASCII graphic characters must convert from UTF to runes in order to work properly with non-ASCII input. See rune(2).
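
The distinction between uninterpreted byte streams and semantic processing can be illustrated with a small sketch. It is written in Python rather than Plan 9 C, and the function names are hypothetical, chosen only for this example. It assumes that Python's built-in "utf-8" codec stands in for the UTF-to-rune conversion described here.

```python
# Illustrative only: Python's str plays the role of a rune array,
# and bytes plays the role of the external UTF byte stream.

def byte_and_rune_counts(text):
    """Return (number of UTF bytes, number of characters) for text."""
    encoded = text.encode("utf-8")
    return len(encoded), len(text)


def reverse_bytes_naively(data):
    """Reverse the raw bytes, ignoring multibyte sequences.

    Safe for pure ASCII, but it scrambles any multibyte sequence.
    """
    return data[::-1]


def reverse_runes(data):
    """Decode to characters first, reverse, then encode again."""
    return data.decode("utf-8")[::-1].encode("utf-8")


if __name__ == "__main__":
    sample = "h\u00e9llo"  # 'e' with acute accent needs two bytes
    print(byte_and_rune_counts(sample))
    print(reverse_runes(sample.encode("utf-8")).decode("utf-8"))
```

For pure ASCII input both reversal functions give the same result, which is the backward compatibility described above. With non-ASCII input only the version that decodes first produces valid UTF.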
Letting numbers be binary, a rune x is converted to a multibyte UTF sequence as follows:
01. x in [00000000.0bbbbbbb] -> 0bbbbbbb
10. x in [00000bbb.bbbbbbbb] -> 110bbbbb, 10bbbbbb
11. x in [bbbbbbbb.bbbbbbbb] -> 1110bbbb, 10bbbbbb, 10bbbbbb
Conversion 01 provides a one-byte sequence that spans the ASCII character set in a compatible way. Conversions 10 and 11 represent
higher-valued characters as sequences of two or three bytes with the high bit set. Plan 9 does not support the 4, 5, and 6 byte sequences
proposed by X-Open. When there are multiple ways to encode a value, for example rune 0, the shortest encoding is used.
In the inverse mapping, any sequence except those described above is incorrect and is converted to rune hexadecimal 0080.
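
The three conversions and the inverse mapping can be sketched as follows. This is a minimal Python illustration, not the Plan 9 implementation (see rune(2) for that). The names RUNE_ERROR, rune_to_utf and utf_to_rune are hypothetical. One detail is an assumption rather than something stated on this page: on an incorrect sequence the decoder below consumes exactly one byte.

```python
RUNE_ERROR = 0x0080  # rune returned for incorrect sequences, per this page


def rune_to_utf(x):
    """Encode a 16-bit rune using conversions 01, 10 and 11."""
    if not 0 <= x <= 0xFFFF:
        raise ValueError("rune must fit in 16 bits")
    if x <= 0x7F:    # 01: 0bbbbbbb
        return bytes([x])
    if x <= 0x7FF:   # 10: 110bbbbb, 10bbbbbb
        return bytes([0xC0 | (x >> 6), 0x80 | (x & 0x3F)])
    # 11: 1110bbbb, 10bbbbbb, 10bbbbbb
    return bytes([0xE0 | (x >> 12),
                  0x80 | ((x >> 6) & 0x3F),
                  0x80 | (x & 0x3F)])


def _is_continuation(b):
    """True for bytes of the form 10bbbbbb."""
    return (b & 0xC0) == 0x80


def utf_to_rune(data):
    """Decode one rune from the start of data.

    Returns (rune, bytes consumed). Incorrect sequences, including
    encodings longer than the shortest form, yield RUNE_ERROR.
    Consuming one byte on error is an assumption of this sketch.
    """
    if not data:
        raise ValueError("empty input")
    b0 = data[0]
    if b0 < 0x80:
        return b0, 1
    if (b0 & 0xE0) == 0xC0 and len(data) >= 2 and _is_continuation(data[1]):
        x = ((b0 & 0x1F) << 6) | (data[1] & 0x3F)
        if x >= 0x80:  # reject non-shortest encodings
            return x, 2
    elif ((b0 & 0xF0) == 0xE0 and len(data) >= 3
          and _is_continuation(data[1]) and _is_continuation(data[2])):
        x = ((b0 & 0x0F) << 12) | ((data[1] & 0x3F) << 6) | (data[2] & 0x3F)
        if x >= 0x800:  # reject non-shortest encodings
            return x, 3
    return RUNE_ERROR, 1


if __name__ == "__main__":
    for rune in (0x41, 0xE9, 0x20AC):
        encoded = rune_to_utf(rune)
        print(hex(rune), encoded.hex(), utf_to_rune(encoded))
```

The range checks in the decoder correspond to the shortest-encoding rule above: a two-byte sequence that decodes to a value below 80, or a three-byte sequence that decodes to a value below 800, is treated as incorrect.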
FILES
/lib/unicode
table of characters and descriptions, suitable for look(1).
SEE ALSO
ascii(1), tcs(1), rune(2), keyboard(6), The Unicode Standard.