Multibyte characters to ASCII
Post 89494 by vino on Monday 14th of November 2005 12:33:00 AM
Do you want the conversion to be done within a script?

Why do you want to do that? That kind of conversion is usually handled within the application.

If you just want to view those ZH*/JA locale characters, you might as well change the locale settings on your machine and then view the files. Rather than changing it machine-wide, you can also set the locale for just your xterm.

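Specifically, set the LC_ALL and LANG variables.

It would go something like this (the exact locale name varies by system; run locale -a to see which ones are installed):

export LC_ALL=ja_JP.eucJP
export LANG=ja_JP.eucJP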
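
As a rough sketch of the suggestion above: the same locale is spelled differently from one system to the next, so it can help to look the name up with locale -a before setting anything. The helper names first_matching_locale and run_in_locale are made up for illustration, and the commented-out xterm line is only one example of limiting the change to a single terminal.

```shell
#!/bin/sh
# Sketch only: both function names below are hypothetical helpers.

# Read locale names on stdin and print the first one matching the
# case-insensitive extended regex given as $1 (prints nothing if none match).
first_matching_locale() {
    grep -i -E -- "$1" | head -n 1
}

# Run a command with LC_ALL and LANG set for that command only,
# leaving the calling shell's locale untouched.
run_in_locale() {
    ril_locale=$1
    shift
    env LC_ALL="$ril_locale" LANG="$ril_locale" "$@"
}

# Look for an installed Japanese EUC locale, whatever its exact spelling.
ja_locale=$(locale -a 2>/dev/null | first_matching_locale '^ja_JP\.euc')

if [ -n "$ja_locale" ]; then
    echo "Found locale: $ja_locale"
    # Example use (uncomment on a machine with X11):
    # run_in_locale "$ja_locale" xterm &
else
    echo "No ja_JP EUC locale installed; check the output of 'locale -a'." >&2
fi
```

Using env for the one-off command means the locale change ends when that command exits, which matches the idea of changing the xterm rather than the whole machine.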

Vino
 

uxterm(1)                   General Commands Manual                   uxterm(1)

NAME
    uxterm - X terminal emulator for Unicode (UTF-8) environments

SYNOPSIS
    uxterm [ xterm-options ]

DESCRIPTION
    uxterm is a wrapper around the xterm(1) program that invokes the latter
    program with the 'UXTerm' X resource class set. All arguments to uxterm
    are passed to xterm without processing; the -class and -u8 options should
    not be specified because they are used by the wrapper. See the xterm
    manual page for more information on xterm-options.

    The environment's locale settings (see "ENVIRONMENT" below) are used to
    discern the locale's character set. If no current locale can be
    determined, the locale 'en_US' (the English language as used in the
    territory of the United States) is assumed. The locale(1) utility is used
    to determine whether the system supports the selected locale. If it does
    not, uxterm will exit with an error and report the output of locale.

    Note: uxterm may produce unexpected results if the current locale is set
    to one in which the UTF-8 character encoding is not supported, or if fonts
    using the ISO 10646-1 character set are not available. In the Debian
    system, the 'xfonts-base' package provides the fonts that uxterm uses by
    default. To change the fonts uxterm uses, edit the
    /etc/X11/app-defaults/UXTerm file.

    A similar wrapper, koi8rxterm(1), is available for KOI8-R environments.

ENVIRONMENT
    LC_ALL, LC_CTYPE, LANG
        The values of these variables are checked, in order, to determine
        the character set used by the current locale.

AUTHOR
    Thomas Dickey

SEE ALSO
    locale(1), locale(7), koi8rxterm(1), xterm(1)

Debian Project                    2004-12-19                         uxterm(1)
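
The lookup order described under ENVIRONMENT can be sketched in shell roughly as follows. This only illustrates the documented behaviour (first non-empty of LC_ALL, LC_CTYPE, LANG, with en_US as the fallback); it is not uxterm's actual source, and the function name effective_ctype_locale is made up here.

```shell
#!/bin/sh
# Sketch only: effective_ctype_locale is a hypothetical name, not part of uxterm.

# Print the first non-empty value among LC_ALL, LC_CTYPE and LANG,
# falling back to en_US when none of them is set.
effective_ctype_locale() {
    for ecl_value in "${LC_ALL:-}" "${LC_CTYPE:-}" "${LANG:-}"; do
        if [ -n "$ecl_value" ]; then
            printf '%s\n' "$ecl_value"
            return 0
        fi
    done
    printf '%s\n' "en_US"
}

echo "Locale used for the character set: $(effective_ctype_locale)"
```

This is also why setting only LANG can appear to do nothing: a non-empty LC_ALL or LC_CTYPE is consulted first.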