In another thread (field separator in Perl) I nearly solved my sorting problem, and I finally understood the Schwartzian transform, especially thanks to KevinADC. After that, I found out that the sorting was not done the way I need. I did not notice it at first because I used only vowels as a test, but once I added consonants I saw the problem. In fact, the š (U+0161) was sorted as expected, but not the ū (U+016B): I need the latter to be treated as a separate letter that comes after all the "normal" 'u'.
I've tried to change the script to this:
I thought it would work, but from this file:
I get this result:
but I need this:
What am I doing wrong?
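One possible way to get "ū after every plain u" is to rank each character against an explicit alphabet and sort on that rank inside a Schwartzian transform. This is only a sketch under assumptions: the alphabet string, the helper names `sort_key` and `custom_sort`, and the sample words are all made up for illustration, and š is placed as its own letter after s, which may not match the original script.

```perl
use strict;
use warnings;
use utf8;   # this source file contains literal UTF-8 characters

# Hypothetical alphabet: every letter in the order it should sort.
# "ū" is listed right after "u", so it ranks as a separate letter.
my @alphabet = split //, 'abcdefghijklmnopqrsštuūvwxyz';
my %rank;
@rank{@alphabet} = (1 .. @alphabet);

# Build a sort key: one 16-bit big-endian rank per character.
# Characters missing from the alphabet get rank 0 and sort first.
sub sort_key {
    my ($word) = @_;
    return pack 'n*', map { $rank{$_} // 0 } split //, lc $word;
}

# Schwartzian transform: decorate, sort on the key, undecorate.
sub custom_sort {
    return map  { $_->[1] }
           sort { $a->[0] cmp $b->[0] }
           map  { [ sort_key($_), $_ ] } @_;
}

binmode STDOUT, ':encoding(UTF-8)';
print "$_\n" for custom_sort(qw(ūla uza šu su uba));
```

With this key, "uza" sorts before "ūla" because every plain u ranks below ū, regardless of the letters that follow.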
I have a stream of characters like "\u8BBE\u5907\u7BA1"
and I want to display it.
I have already tried the following things without any luck.
1) printf("%s",L("\u8BBE\u5907\u7BA1"));
2) printf("%lc",0x8BBE);
3) setlocale followed by fwide followed by wprintf
4) also changed the locale manually... (3 Replies)
Hi,
Can I know how to grep for lines with non-ascii characters in a file?
If not grep, at least can we do it with command-line perl or awk? I tried the functionality of perl, but still could not get the result. Any help??
PS: I was sure that someone should have asked this question... (9 Replies)
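For the non-ASCII question above, a minimal Perl sketch could look like this. It treats each line as raw bytes and keeps the lines that contain any byte above 0x7F; the function name `non_ascii_lines` is just an illustrative choice.

```perl
use strict;
use warnings;

# Return the lines that contain at least one byte outside 0x00-0x7F.
# Lines are treated as raw bytes, so no decoding layer is needed.
sub non_ascii_lines {
    return grep { /[^\x00-\x7F]/ } @_;
}

# Usage: perl script.pl file.txt
# Only runs when a file name is given on the command line.
print non_ascii_lines(<>) if @ARGV;
```

The same test fits on the command line as `perl -ne 'print if /[^\x00-\x7F]/' file.txt`.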
Hello,
I am not a programmer, please be patient.
Actually, I have started to look into Perl because it seems to be able to solve all the problems (or most of them) I happen to meet using my computer. These problems are generally all text-manipulation-related.
Although I started to study, I cannot... (6 Replies)
Hi gurus,
I have a file in unix with ascii values. I need to convert all the ascii values in the file to ascii characters. File contains nearly 20000 records with ascii values. (10 Replies)
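The post is cut off, so the exact file layout is unknown. Assuming each record holds decimal character codes separated by whitespace, a sketch using Perl's built-in `chr` might be (the name `codes_to_text` is hypothetical):

```perl
use strict;
use warnings;

# Assumed input format: decimal codes separated by whitespace, e.g. "72 105".
# Tokens that are not plain digits are skipped.
sub codes_to_text {
    my ($line) = @_;
    return join '', map { chr($_) } grep { /^\d+$/ } split ' ', $line;
}

print codes_to_text("72 101 108 108 111"), "\n";   # prints "Hello"
```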
Hello,
I wrote a CGI script with a textarea to save some words from the web.
I grab and write words like this:
$cgiparams{'CONTENTS'} =~ s/\r//g;
#$cgiparams{'CONTENTS'} =~ s/á/á/g;
open(TM, ">$editedfilename");
#binmode(TM,... (1 Reply)
Hello,
My Perl script reads input from stdin and prints it out to stdout. After I read input I use BACKSPACE to erase characters. However BACKSPACE does not work with Unicode characters that are multi-bytes. On screen the character is erased but underneath only one byte is deleted instead of all... (3 Replies)
Hello,
When I run this UNIX code without the -t option it gives me the desired results.
The code keeps the record with the greatest datetime based on the key columns.
I sort it first then sort it again with the -u option, that's it.
I need to have a variable to specify an ASCII character... (2 Replies)
Hello,
I have a large file in UTF8 format with around 200 thousand plus strings which have a large number of scripts (code-blocks/code-pages).
I need to extract from the file only the following:
All strings having basic Latin characters: 0021-007E
All strings in the Devanagari range: 0900 to... (3 Replies)
I have a file on my Unix (Solaris) system in EBCDIC format. I want to read this file as ASCII or Unicode. Is it possible in Unix to convert this file from EBCDIC to ASCII or Unicode?
I searched the web and found only a conversion table :(
Below is... (16 Replies)
Hi All,
I have an ASCII file in which a few columns have hex values that I need to convert into ASCII. Kindly suggest what command can be used in Unix shell scripting.
Thanks in Advance (2 Replies)
Discussion started by: HemaV
2 Replies
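For the hex-to-ASCII question above, Perl (callable from a shell script) can do the conversion with `pack`. This is a sketch under assumptions: the record layout, the comma separator, and the name `hex_to_text` are invented for illustration, since the post does not show the file.

```perl
use strict;
use warnings;

# Assumption: the value is an even-length string of hex digits.
sub hex_to_text {
    my ($hex) = @_;
    return pack 'H*', $hex;
}

# Convert the (hypothetical) second comma-separated column of a record.
my @fields = split /,/, 'id42,48656C6C6F,ok';
$fields[1] = hex_to_text($fields[1]);
print join(',', @fields), "\n";   # prints "id42,Hello,ok"
```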
locale
locale(3pm) Perl Programmers Reference Guide locale(3pm)
NAME
locale - Perl pragma to use or avoid POSIX locales for built-in operations
SYNOPSIS
    @x = sort @y;      # Unicode sorting order
    {
        use locale;
        @x = sort @y;  # Locale-defined sorting order
    }
    @x = sort @y;      # Unicode sorting order again
DESCRIPTION
This pragma tells the compiler to enable (or disable) the use of POSIX locales for built-in operations (for example, LC_CTYPE for regular
expressions, LC_COLLATE for string comparison, and LC_NUMERIC for number formatting). Each "use locale" or "no locale" affects statements
to the end of the enclosing BLOCK.
Starting in Perl 5.16, a hybrid mode for this pragma is available,
use locale ':not_characters';
which enables only the portions of locales that don't affect the character set (that is, all except LC_COLLATE and LC_CTYPE). This is
useful when mixing Unicode and locales, including UTF-8 locales.
    use locale ':not_characters';
    use open ":locale";           # Convert I/O to/from Unicode
    use POSIX qw(locale_h);       # Import the LC_ALL constant
    setlocale(LC_ALL, "");        # Required for the next statement
                                  # to take effect
    printf "%.2f\n", 12345.67;    # Locale-defined formatting
    @x = sort @y;                 # Unicode-defined sorting order.
                                  # (Note that you will get better
                                  # results using Unicode::Collate.)
See perllocale for more detailed information on how Perl supports locales.
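As an illustration of the Unicode::Collate note above, the following sketch sorts a few sample words with the module's default collation instead of the built-in code-point order. The word list is made up; no tailoring or locale is applied here.

```perl
use strict;
use warnings;
use utf8;                 # the word list contains literal UTF-8 characters
use Unicode::Collate;     # core module implementing the Unicode Collation Algorithm

my $collator = Unicode::Collate->new;

# Default collation compares base letters first, so accented and
# uppercase words are interleaved with their plain counterparts.
my @sorted = $collator->sort(qw(zebra Émile apple émigré Zoo));

binmode STDOUT, ':encoding(UTF-8)';
print "$_\n" for @sorted;
```

A plain `sort` on the same list would put "Zoo" first and the accented words last, because it compares code points rather than letters.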
perl v5.16.3 2013-03-04 locale(3pm)