Post 302406863 by jossojjos on Wednesday 24th of March 2010 04:50:37 AM
shell: reconcile language and sort behaviour

Hi

Don't know if this is a dummy question, but let's give it a try.

Yesterday I had a problem with inconsistent behaviour of the sort command (I'm using bash): it produced different sort orders for no apparent reason. I resolved this by typing

Code:
export LC_ALL="C"
export LC_COLLATE="C"
export LC_CTYPE="C"

and adding this to my .bash_profile as well.

The language is still set to English:

Code:
[jos@faba ~]$ echo $LANG
en_US.UTF-8

But now the shell no longer recognizes special characters such as accented ones (I have lots of them: although my system is in English, I'm in France, so many of my filenames contain accents).
Even in English text, unrecognized characters appear (for instance, "man rpm" shows "<E2><80><99>" sequences).

My question: is it "either-or", i.e. either correct sorting behaviour or correct character handling, or is there a way to have both behave correctly?

Thanks in advance
jos
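
It does not have to be either-or. LC_ALL takes precedence over LANG and over every individual LC_* variable, so with LC_ALL="C" the character type is forced to "C" as well, which would explain the unrecognized accents. A minimal sketch of one way to keep both behaviours, assuming the en_US.UTF-8 locale shown above is installed (the sample input and the variable name `sorted` are only for illustration):

```shell
# Sketch only: assumes the en_US.UTF-8 locale from the post is available.
unset LC_ALL LC_CTYPE          # LC_ALL would override every setting below
export LANG="en_US.UTF-8"      # character handling and messages stay UTF-8
export LC_COLLATE="C"          # only the sort order falls back to byte order

# In byte order, uppercase ASCII letters sort before lowercase ones.
sorted=$(printf 'b\nA\na\nB\n' | sort | tr -d '\n')
echo "$sorted"                 # prints ABab
```

If this works for you, the two `unset`/`export` settings can replace the three export lines in .bash_profile. Running `locale` afterwards should show LC_COLLATE="C" while LC_CTYPE still follows LANG.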
 
