Post 302550042 by mregine on Wednesday 24th of August 2011 12:13:58 PM
UTF-8 in xterm

I need to use sort, uniq, grep, wc and similar tools on lists of words in UTF-8 (the "words" are phonetic transcriptions in the IPA). I have searched Google extensively and even found at least one previous post on this topic, but it didn't help.

I tried following the instructions in:
UTF-8 and Unicode FAQ
* I set the locale in my xterm with
Code:
export LC_ALL=fr_FR.UTF-8

(which is installed as per locale -a)
* Then I started a new xterm from within the old one with
Code:
xterm -fn '-adobe-courier-medium-r-normal--10-100-75-75-m-60-iso10646-1'


which I found using
Code:
xlsfonts | grep iso10646-1 | less

* Then I tested using some of the example files found on
UTF-8 and Unicode FAQ
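
A quick way to check that the locale set in the steps above actually reached the new xterm is to ask which character map is active. This is only a sketch: it assumes the standard `locale charmap` command is available, and `charmap_is_utf8` is a hypothetical helper name, not part of xterm or any other tool.

```shell
# Hypothetical helper: succeeds when the given charmap name means UTF-8.
charmap_is_utf8() {
    case "$1" in
        UTF-8|utf-8|UTF8|utf8) return 0 ;;
        *) return 1 ;;
    esac
}

# Query the active locale; the result is empty if `locale` is unavailable.
current_charmap=$(locale charmap 2>/dev/null || true)

if charmap_is_utf8 "$current_charmap"; then
    echo "locale charmap is UTF-8"
else
    echo "locale charmap is '${current_charmap}', not UTF-8"
fi
```

Run inside the new xterm, this shows whether the exported LC_ALL was inherited. It says nothing about whether the chosen font has the glyphs.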

Unfortunately, the Unicode characters are displayed as boxes when I view the file with less (after answering "y" to the message warning me that "UTF-8-demo.txt may be a binary file...").
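
To separate a problem in the file from a problem in the terminal or font, one option is to test whether the file decodes cleanly as UTF-8. The sketch below assumes `iconv` is installed; `is_valid_utf8` is a hypothetical helper name, and the sample file is a throwaway created only for the demonstration.

```shell
# Hypothetical helper: succeeds when the file decodes cleanly as UTF-8.
# Converting UTF-8 to UTF-8 changes nothing, but iconv exits non-zero
# as soon as it meets a byte sequence that is not valid UTF-8.
is_valid_utf8() {
    iconv -f UTF-8 -t UTF-8 "$1" >/dev/null 2>&1
}

# Self-contained demonstration with a throwaway file.
sample="${TMPDIR:-/tmp}/utf8-sample.$$"
printf 'caf\303\251\n' > "$sample"      # "café" encoded as UTF-8
if is_valid_utf8 "$sample"; then
    echo "valid UTF-8"
else
    echo "not valid UTF-8"
fi
rm -f "$sample"
```

Calling `is_valid_utf8 UTF-8-demo.txt` on the downloaded file would work the same way. If the file passes, the boxes more likely come from the terminal or font than from the file itself, though this check alone cannot confirm that.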

I also tried setting LESSCHARSET=utf-8, but it didn't help either.

Can anyone help?

I am using the latest version of X11.app on Mac OS X (XQuartz 2.6.3). less is version 394, xterm version 269.
 
