Problem identifying charset of a file
Post 302295477 by sridhar_423 on Sunday 8th of March 2009 03:08:42 PM

Hi all,

My objective is to find out which charset a file is encoded in. (The OS is SunOS.)
I set NLS_LANG to AR8MSWIN1256 and spooled the file.

When I viewed the file in vi, I saw the following:
\307\341\321\355\307\326

I then inserted the line containing these codes into a table with NLS_LANG set to AL32UTF8 and saw the Arabic text:
الرياض

Now, what are these numbers (307, 341, ...)? Are they code points? If so, they should be Windows-1256 code points, since I set NLS_LANG to AR8MSWIN1256. Also, are they decimal, hex, or octal?

Can anyone tell me how I can arrive at the Arabic text from those numbers?
I tried something like this in an HTML page, without any luck:
& #307;& #341;& #321;& #355;& #307;& #326;
& #775;& #833;& #801;& #853;& #775;& #806; (I have kept a space between & and # to avoid the browser rendering them as symbols/characters)

Thanks,
Sridhar

Last edited by sridhar_423; 03-08-2009 at 04:23 PM..
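
For reference, the bytes shown by vi can be checked outside the database. Below is a minimal sketch in Python, assuming the numbers are octal byte values (the usual way vi displays bytes it cannot print) and that the file really is encoded in Windows-1256. The function names are made up for illustration, and Python's `cp1256` codec stands in for Oracle's AR8MSWIN1256.

```python
# Assumption: \307\341\321\355\307\326 are octal escapes for single bytes.
octal_escapes = r"\307\341\321\355\307\326"


def decode_octal_escapes(escaped: str, charset: str = "cp1256") -> str:
    """Convert backslash-octal escapes to raw bytes, then decode with `charset`."""
    raw = bytes(int(part, 8) for part in escaped.split("\\") if part)
    return raw.decode(charset)


def to_html_entities(text: str) -> str:
    """Write each character as a decimal numeric reference (Unicode code point)."""
    return "".join(f"&#{ord(ch)};" for ch in text)


text = decode_octal_escapes(octal_escapes)
print(text)                    # الرياض
print(to_html_entities(text))  # &#1575;&#1604;&#1585;&#1610;&#1575;&#1590;
```

HTML numeric references take Unicode code points, not Windows-1256 byte values, which would explain why neither 307 nor 775 produced Arabic text in the browser.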
 

iconv_852(5)						Standards, Environments, and Macros					      iconv_852(5)

NAME
iconv_852 - code set conversion tables for MS 852 (MS-DOS Latin 2)

DESCRIPTION
The following code set conversions are supported:

+--------------------------------------------------------------------+
|                   Code Set Conversions Supported                   |
+--------------+--------+--------------+--------+--------------------+
| Code         | Symbol | Target Code  | Symbol | Target Output      |
+--------------+--------+--------------+--------+--------------------+
| MS 852       | dos2   | ISO 8859-2   | iso2   | ISO Latin 2        |
+--------------+--------+--------------+--------+--------------------+
| MS 852       | dos2   | MS 1250      | win2   | Windows Latin 2    |
+--------------+--------+--------------+--------+--------------------+
| MS 852       | dos2   | Mazovia      | maz    | Mazovia            |
+--------------+--------+--------------+--------+--------------------+
| MS 852       | dos2   | DHN          | dhn    | Dom Handlowy Nauki |
+--------------+--------+--------------+--------+--------------------+

CONVERSIONS
The conversions are performed according to the following tables. All values in the tables are given in octal.

MS 852 to ISO 8859-2
For the conversion of MS 852 to ISO 8859-2, all characters not in the following table are mapped unchanged.

+---------------------------------------------------+
|               Conversions Performed               |
+------------+------------+------------+------------+
| MS 852     | ISO 8859-2 | MS 852     | ISO 8859-2 |
+------------+------------+------------+------------+
| 24-177     | 40         | 271-274    | 40         |
| 200        | 307        | 275        | 257        |
| 201        | 374        | 276        | 277        |
| 202        | 351        | 277-305    | 40         |
| 203        | 342        | 306        | 303        |
| 204        | 344        | 307        | 343        |
| 205        | 371        | 310-316    | 40         |
| 206        | 346        | 317        | 244        |
| 207        | 347        | 320        | 360        |
| 210        | 263        | 321        | 320        |
| 211        | 353        | 322        | 317        |
| 212        | 325        | 323        | 313        |
| 213        | 365        | 324        | 357        |
| 214        | 356        | 325        | 322        |
| 215        | 254        | 326        | 315        |
| 216        | 304        | 327        | 316        |
| 217        | 306        | 330        | 354        |
| 220        | 311        | 331-334    | 40         |
| 221        | 305        | 335        | 336        |
| 222        | 345        | 336        | 331        |
| 223        | 364        | 337        | 40         |
| 224        | 366        | 340        | 323        |
| 225        | 245        | 341        | 337        |
| 226        | 265        | 342        | 324        |
| 227        | 246        | 343        | 321        |
| 230        | 266        | 344        | 361        |
| 231        | 326        | 345        | 362        |
| 232        | 334        | 346        | 251        |
| 233        | 253        | 347        | 271        |
| 234        | 273        | 350        | 300        |
| 235        | 243        | 351        | 332        |
| 236        | 327        | 352        | 340        |
| 237        | 350        | 353        | 333        |
| 240        | 341        | 354        | 375        |
| 241        | 355        | 355        | 335        |
| 242        | 363        | 356        | 376        |
| 243        | 372        | 357        | 264        |
| 244        | 241        | 360        | 255        |
| 245        | 261        | 361        | 275        |
| 246        | 256        | 362        | 262        |
| 247        | 276        | 363        | 267        |
| 250        | 312        | 364        | 242        |
| 251        | 352        | 365        | 247        |
| 252        | 40         | 366        | 367        |
| 253        | 274        | 367        | 270        |
| 254        | 310        | 370        | 260        |
| 255        | 272        | 371        | 250        |
| 256-264    | 40         | 372        | 377        |
| 265        | 301        | 374        | 330        |
| 266        | 302        | 375        | 370        |
| 267        | 314        | 376        | 40         |
| 270        | 252        |            |            |
+------------+------------+------------+------------+

MS 852 to MS 1250
For the conversion of MS 852 to MS 1250, all characters not in the following table are mapped unchanged.

+---------------------------------------------------+
|               Conversions Performed               |
+------------+------------+------------+------------+
| MS 852     | MS 1250    | MS 852     | MS 1250    |
+------------+------------+------------+------------+
| 200        | 307        | 270        | 252        |
| 201        | 374        | 271-274    | 40         |
| 202        | 351        | 275        | 257        |
| 203        | 342        | 276        | 277        |
| 204        | 344        | 277-305    | 40         |
| 205        | 371        | 306        | 303        |
| 206        | 346        | 307        | 343        |
| 207        | 347        | 310-316    | 40         |
| 210        | 263        | 317        | 244        |
| 211        | 353        | 320        | 360        |
| 212        | 325        | 321        | 320        |
| 213        | 365        | 322        | 317        |
| 214        | 356        | 323        | 313        |
| 215        | 217        | 324        | 357        |
| 216        | 304        | 325        | 322        |
| 217        | 306        | 326        | 315        |
| 220        | 311        | 327        | 316        |
| 221        | 305        | 330        | 354        |
| 222        | 345        | 331-334    | 40         |
| 223        | 364        | 335        | 336        |
| 224        | 366        | 336        | 331        |
| 225        | 274        | 337        | 40         |
| 226        | 276        | 340        | 323        |
| 227        | 214        | 341        | 337        |
| 230        | 234        | 342        | 324        |
| 231        | 326        | 343        | 321        |
| 232        | 334        | 344        | 361        |
| 233        | 215        | 345        | 362        |
| 234        | 235        | 346        | 212        |
| 235        | 243        | 347        | 232        |
| 236        | 327        | 350        | 300        |
| 237        | 350        | 351        | 332        |
| 240        | 341        | 352        | 340        |
| 241        | 355        | 353        | 333        |
| 242        | 363        | 354        | 375        |
| 243        | 372        | 355        | 335        |
| 244        | 245        | 356        | 376        |
| 245        | 271        | 357        | 264        |
| 246        | 216        | 360        | 255        |
| 247        | 236        | 361        | 275        |
| 250        | 312        | 362        | 262        |
| 251        | 352        | 363        | 241        |
| 252        | 254        | 364        | 242        |
| 253        | 237        | 365        | 247        |
| 254        | 310        | 366        | 367        |
| 255        | 272        | 367        | 270        |
| 256        | 253        | 370        | 260        |
| 257        | 273        | 371        | 250        |
| 260-264    | 40         | 372        | 377        |
| 265        | 301        | 374        | 330        |
| 266        | 302        | 375        | 370        |
| 267        | 314        | 376        | 40         |
+------------+------------+------------+------------+

MS 852 to Mazovia
For the conversion of MS 852 to Mazovia, all characters not in the following table are mapped unchanged.

+---------------------------------------------------+
|               Conversions Performed               |
+------------+------------+------------+------------+
| MS 852     | Mazovia    | MS 852     | Mazovia    |
+------------+------------+------------+------------+
| 205        | 40         | 246-247    | 40         |
| 206        | 215        | 250        | 220        |
| 210        | 222        | 251        | 221        |
| 212-213    | 40         | 253        | 246        |
| 215        | 240        | 254-270    | 40         |
| 217        | 225        | 275        | 241        |
| 220-226    | 40         | 276        | 247        |
| 227        | 230        | 306-336    | 40         |
| 230        | 236        | 340        | 243        |
| 233-234    | 40         | 342        | 40         |
| 235        | 234        | 343        | 245        |
| 236-243    | 40         | 344        | 244        |
| 244        | 217        | 345-375    | 40         |
| 245        | 206        |            |            |
+------------+------------+------------+------------+

MS 852 to DHN
For the conversion of MS 852 to DHN, all characters not in the following table are mapped unchanged.

+---------------------------------------------------+
|               Conversions Performed               |
+------------+------------+------------+------------+
| MS 852     | DHN        | MS 852     | DHN        |
+------------+------------+------------+------------+
| 200-205    | 40         | 244        | 200        |
| 206        | 212        | 245        | 211        |
| 207        | 40         | 246-247    | 40         |
| 210        | 214        | 250        | 202        |
| 211-214    | 40         | 251        | 213        |
| 215        | 207        | 253        | 220        |
| 216        | 40         | 254-270    | 40         |
| 217        | 201        | 275        | 210        |
| 220-226    | 40         | 276        | 221        |
| 227        | 206        | 306-336    | 40         |
| 230        | 217        | 340        | 205        |
| 233-234    | 40         | 342        | 40         |
| 235        | 203        | 343        | 204        |
| 236-237    | 40         | 344        | 215        |
| 242        | 216        | 345-375    | 40         |
| 252        | 254        |            |            |
+------------+------------+------------+------------+

FILES
/usr/lib/iconv/*.so          conversion modules
/usr/lib/iconv/*.t           conversion tables
/usr/lib/iconv/iconv_data    list of conversions supported by conversion tables

SEE ALSO
iconv(1), iconv(3C), iconv(5)

SunOS 5.10                        18 Apr 1997                       iconv_852(5)
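
As a rough cross-check of the octal values in the MS 852 to ISO 8859-2 table, the same conversion can be sketched with Python's built-in `cp852` and `iso8859_2` codecs. This is only an illustration under the assumption that those codecs agree with the Solaris tables for the sampled bytes; it is not the Solaris implementation, and the function name is hypothetical. One known difference: the man page maps unconvertible characters to 40 (space), while `errors="replace"` in Python substitutes `?`.

```python
def convert_852_to_iso2(data: bytes) -> bytes:
    """Decode MS 852 bytes and re-encode them as ISO 8859-2.

    Characters with no ISO 8859-2 equivalent become '?' here,
    unlike the man page tables, which map them to octal 40 (space).
    """
    return data.decode("cp852").encode("iso8859_2", errors="replace")


# Octal 200, 201, 202 are the first three single-value rows of the table.
sample = bytes([0o200, 0o201, 0o202])
converted = convert_852_to_iso2(sample)
print(" ".join(f"{b:o}" for b in converted))  # 307 374 351
```

The printed values match the table rows 200 -> 307, 201 -> 374 and 202 -> 351.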