03-10-2011
Locales and conversion tables confusion - Some characters don't get translated
Hello,
My program running on AIX receives a message (from MQ) in CCSID 870 (Latin-2 EBCDIC) and needs to convert it to CCSID 912 (ISO-8859-2). For this conversion it uses WebSphere MQ functionality, which, according to the manual, relies on AIX's system conversion tables. (By the way, it only works like this on AIX. On Linux or Windows, WebSphere MQ uses its own conversion logic, ignoring the system conversion tables, and the conversion works there.)
The problem is that some characters, such as š, č, ť and ž, get translated to chr(26).
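chr(26) is byte 0x1A, the ASCII SUB control character, which converters commonly emit when a source character has no mapping in the target table. To see how many characters were lost in a given message, something like the following could be used. This is only a sketch: `count_sub_bytes` is a hypothetical helper name, and it assumes the converted message has been saved to a file.

```shell
# Hypothetical helper: count SUB (0x1A) bytes in the file given as $1.
count_sub_bytes() {
    # tr -cd deletes every byte except octal 032 (0x1A); wc -c counts what remains.
    # The trailing tr strips the padding some wc implementations add.
    LC_ALL=C tr -cd '\032' < "$1" | wc -c | tr -d ' '
}

# Usage (the file name is an assumption):
#   count_sub_bytes converted_message.bin
```

A non-zero count on the AIX output, compared with zero for the same message converted on Linux, would narrow the problem down to the conversion step itself.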
My checklist was:
1. The LANG environment variable is set to sk_SK
2. Conversion tables seem to be there
(root):/usr/lib/nls/loc/iconvTable# ls -l
...
-r--r--r-- 1 bin bin 288 Aug 12 2010 IBM-1250_IBM-5346
-r--r--r-- 1 bin bin 288 Aug 12 2010 IBM-1282_IBM-5346
-r--r--r-- 1 bin bin 288 Aug 12 2010 IBM-1284_IBM-5346
-r--r--r-- 1 bin bin 288 Aug 12 2010 IBM-1285_IBM-5346
-r--r--r-- 1 bin bin 288 Aug 12 2010 IBM-5346_IBM-1250
-r--r--r-- 1 bin bin 288 Aug 12 2010 IBM-5346_IBM-1282
-r--r--r-- 1 bin bin 288 Aug 12 2010 IBM-5346_IBM-1284
-r--r--r-- 1 bin bin 288 Aug 12 2010 IBM-5346_IBM-1285
-r--r--r-- 1 bin bin 288 Aug 12 2010 IBM-5346_IBM-852
-r--r--r-- 1 bin bin 288 Aug 12 2010 IBM-5346_IBM-870
-r--r--r-- 1 bin bin 288 Aug 12 2010 IBM-5346_ISO8859-2
-r--r--r-- 1 bin bin 288 Aug 12 2010 IBM-852_IBM-5346
-r--r--r-- 1 bin bin 288 Aug 12 2010 IBM-870_IBM-5346
-r--r--r-- 1 bin bin 288 Aug 12 2010 ISO8859-2_IBM-5346
Can you please help me find out what I am forgetting? Thanks.
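The files in the listing appear to be named FROM_TO, and the part shown only covers pairs involving IBM-5346. A quick way to check whether a table for the pair actually needed here exists in that directory is sketched below. The function name `check_pair` and the `ICONV_TABLE_DIR` variable are made up for illustration, and whether MQ looks only in this directory is an assumption.

```shell
# Directory taken from the listing above; can be overridden from the environment.
ICONV_TABLE_DIR=${ICONV_TABLE_DIR:-/usr/lib/nls/loc/iconvTable}

# Hypothetical helper: report whether a table file exists for each direction
# of a codeset pair, assuming the FROM_TO naming seen in the listing.
check_pair() {
    for name in "${1}_${2}" "${2}_${1}"; do
        if [ -f "$ICONV_TABLE_DIR/$name" ]; then
            echo "$name: present"
        else
            echo "$name: missing"
        fi
    done
}

# Usage:
#   check_pair IBM-870 ISO8859-2
```

If both directions are present, running the same bytes through `iconv -f IBM-870 -t ISO8859-2` would exercise the system tables outside MQ, assuming those are the codeset names AIX expects.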
DD(1) General Commands Manual DD(1)
NAME
dd - convert and copy a file
SYNOPSIS
dd [option=value] ...
DESCRIPTION
Dd copies the specified input file to the specified output with possible conversions. The standard input and output are used by default.
The input and output block size may be specified to take advantage of raw physical I/O.
option         values
if=            input file name; standard input is default
of=            output file name; standard output is default
ibs=n          input block size n bytes (default 512)
obs=n          output block size (default 512)
bs=n           set both input and output block size, superseding ibs and obs; also, if no conversion is specified, it is particularly efficient since no copy need be done
cbs=n          conversion buffer size
skip=n         skip n input records before starting copy
files=n        copy n files from (tape) input
seek=n         seek n records from beginning of output file before copying
count=n        copy only n input records
conv=ascii     convert EBCDIC to ASCII
     ebcdic    convert ASCII to EBCDIC
     ibm       slightly different map of ASCII to EBCDIC
     lcase     map alphabetics to lower case
     ucase     map alphabetics to upper case
     swab      swap every pair of bytes
     noerror   do not stop processing on an error
     sync      pad every input record to ibs
     ... , ... several comma-separated conversions
Where sizes are specified, a number of bytes is expected. A number may end with k, b or w to specify multiplication by 1024, 512, or 2
respectively; a pair of numbers may be separated by x to indicate a product.
Cbs is used only if ascii or ebcdic conversion is specified. In the former case cbs characters are placed into the conversion buffer, converted to ASCII, and trailing blanks trimmed and new-line added before sending the line to the output. In the latter case ASCII characters are read into the conversion buffer, converted to EBCDIC, and blanks added to make up an output record of size cbs.
After completion, dd reports the number of whole and partial input and output blocks.
For example, to read an EBCDIC tape blocked ten 80-byte EBCDIC card images per record into the ASCII file x:
dd if=/dev/rmt0 of=x ibs=800 cbs=80 conv=ascii,lcase
Note the use of raw magtape. Dd is especially suited to I/O on the raw physical devices because it allows reading and writing in arbitrary
record sizes.
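The same conversion can be tried on an ordinary file instead of a tape. The sketch below is an illustration only: the helper names `make_sample` and `ebcdic_records_to_ascii` are hypothetical, the sample bytes are hand-encoded EBCDIC, and the behaviour described is that of a dd that supports conv=ascii and cbs as documented above (GNU dd is assumed).

```shell
# Hypothetical helper: write two 8-byte EBCDIC records, "HI" and "THERE",
# each padded with EBCDIC blanks (0x40, octal 100), to the file given as $1.
make_sample() {
    printf '\310\311\100\100\100\100\100\100\343\310\305\331\305\100\100\100' > "$1"
}

# Hypothetical helper: $1 = input file, $2 = record length in bytes.
ebcdic_records_to_ascii() {
    # conv=ascii converts each cbs-sized record to ASCII, trims trailing
    # blanks and adds a new-line; lcase maps the letters to lower case.
    # The block-count report goes to stderr and is discarded here.
    dd if="$1" cbs="$2" conv=ascii,lcase 2>/dev/null
}

# Usage:
#   make_sample sample.ebc
#   ebcdic_records_to_ascii sample.ebc 8
```

With cbs=8 each record should come out as one text line with its padding removed.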
To skip over a file before copying from magnetic tape do
(dd of=/dev/null; dd of=x) </dev/rmt0
SEE ALSO
cp(1), tr(1)
DIAGNOSTICS
f+p records in(out): numbers of full and partial records read(written)
BUGS
The ASCII/EBCDIC conversion tables are taken from the 256 character standard in the CACM Nov, 1968. The `ibm' conversion, while less
blessed as a standard, corresponds better to certain IBM print train conventions. There is no universal solution.
Newlines are inserted only on conversion to ASCII; padding is done only on conversion to EBCDIC. These should be separate options.