Try running 'od' on it to see if there is a pattern you can recognize. Is it unicode, euc, jis, ebcdic, bcdic, or just an odd code page? Hard to say! I use 'od -bc' because I was octal-raised, but there are options for hex output and for hex or decimal offsets. But yes, really, you should know!
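If 'od' is not handy, the same kind of byte listing can be sketched in a few lines of Python. This is only an illustration, not a replacement for od: the function name octal_dump and the column layout are my own choices, and it assumes a Python 3 interpreter is available on the box.

```python
def octal_dump(data: bytes, width: int = 16) -> str:
    """Return an od-like listing: octal offset, bytes in octal, ASCII view."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        # Each byte as three octal digits, similar to 'od -b'
        octal = " ".join(f"{b:03o}" for b in chunk)
        # Printable ASCII shown as-is, everything else as '.'
        text = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{offset:07o}  {octal}  {text}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical sample; in practice read the file with open(path, "rb")
    sample = "Héllo\n".encode("utf-8")
    print(octal_dump(sample))
```

Bytes at or above 200 octal (0x80) that show up as '.' in the right-hand column are the ones worth staring at.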
Often, 'C' is linked to iso-8859-1 or Latin-1, but your file is not that.
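Well, UTF-8 and the other Unicode encodings have recognizable byte patterns: a UTF-8 multibyte character is a lead byte in the range 0xC2-0xF4 followed by continuation bytes in 0x80-0xBF, and UTF-16 files often start with a byte order mark. The dd command has an EBCDIC decoder (conv=ascii) that I have used. Might it be from big blue land?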
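A rough sketch of that pattern check in Python, for what it is worth. It looks for a byte order mark, then tests for pure ASCII, then for valid UTF-8, and finally tries one EBCDIC code page. The function name guess_encoding, the choice of cp037 as the EBCDIC page, and the 90% threshold are all assumptions of mine; a real file may need a different code page, and a short file can fool any of these tests.

```python
import codecs


def guess_encoding(data: bytes) -> str:
    """Best-effort guess at the encoding of raw bytes. Heuristic only."""
    # Byte order marks. UTF-32 is checked before UTF-16 because the
    # UTF-32-LE mark begins with the same two bytes as the UTF-16-LE mark.
    boms = [
        (codecs.BOM_UTF8, "utf-8-sig"),
        (codecs.BOM_UTF32_LE, "utf-32"),
        (codecs.BOM_UTF32_BE, "utf-32"),
        (codecs.BOM_UTF16_LE, "utf-16"),
        (codecs.BOM_UTF16_BE, "utf-16"),
    ]
    for bom, name in boms:
        if data.startswith(bom):
            return name

    # No byte has the high bit set: plain ASCII (also true for empty input)
    if all(b < 0x80 for b in data):
        return "ascii"

    # Random 8-bit text is very unlikely to be valid UTF-8 by accident
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        pass

    # EBCDIC guess (assumed code page cp037): decode and see whether the
    # result is almost entirely ASCII letters, digits and spaces
    text = data.decode("cp037", errors="replace")
    plain = sum(1 for ch in text if ch.isascii() and (ch.isalnum() or ch == " "))
    if plain / len(text) > 0.9:
        return "cp037"

    return "unknown"


if __name__ == "__main__":
    print(guess_encoding("Hello World 123".encode("cp037")))
```

An "unknown" result just means none of these particular patterns matched, so it is back to od and guessing at code pages.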
Googling around the subject, one post suggests file -i, another mentions enca (http://linux.die.net/man/1/enca) and, for Solaris, auto_ef. There is also 'chardet', a Python-based tool.
Last edited by DGPickett; 01-04-2013 at 04:35 PM..
Yes, I tried file -i and it tells me it is a regular file. By big blue land, I assume you mean IBM? If so, yes, I am using an AIX machine, which is why auto_ef and enca are unrecognized commands. I have yet to try chardet... I'll have to dig deeper. Thanks though!
Yes, IBM is a world unto itself. EBCDIC is the dominant charset there, and even then you may need the right code page to print correctly. BCDIC, the Binary Coded Decimal Interchange Code, was the 6-bit code, so called because it was closely related to card codes with a decimal basis: A is 21 base 8, B is 22, I is 31 (20+9), then J is 41 through R at 51, then / is 61, and S is 62 through Z at 71. The r-x-0 rows of the card became the upper bits, and 1-9 were binary coded. EBCDIC is BCDIC extended to 8 bits.
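To make the zone-plus-digit arithmetic concrete, here is a small Python sketch that computes the 6-bit codes exactly as listed above. It covers only the letters and '/' from this post; the function name bcdic_letter_code is made up for illustration, and the BCD variants on real machines differed from one another, so treat it as a worked example rather than a reference table.

```python
def bcdic_letter_code(ch: str) -> int:
    """6-bit code for A-Z or '/', following the layout described in the post."""
    if ch == "/":
        return 0o61  # upper bits 6, digit 1
    if len(ch) != 1 or not "A" <= ch <= "Z":
        raise ValueError(f"no code defined here for {ch!r}")

    index = ord(ch) - ord("A")
    if index < 9:
        # A-I: upper bits 2, digits 1-9
        zone, digit = 0o20, index + 1
    elif index < 18:
        # J-R: upper bits 4, digits 1-9
        zone, digit = 0o40, index - 9 + 1
    else:
        # S-Z: upper bits 6, digits 2-9 (digit 1 is taken by '/')
        zone, digit = 0o60, index - 18 + 2
    return zone + digit


if __name__ == "__main__":
    for letter in "AIJRSZ/":
        print(letter, oct(bcdic_letter_code(letter)))
```

Note that the digit 9 carries into the next octal place, which is why I comes out as 31 and not 29.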
Hi all!!
I'm using the command file -i myfile.xml to validate an XML file's encoding, but it just says "regular file". I'm expecting output such as UTF8 or ANSI / ASCII.
Is there a command to display a file's encoding?
Thank you! (2 Replies)
Hi,
I am a beginner to Unix.
My requirement is to validate the encoding used in the incoming file (csv, txt). If it is encoded as UTF-8, then the file should remain as it is; otherwise I need to change the encoding to UTF-8.
Please advise me how to proceed on this. (7 Replies)
Hi,
I have got a zip (binary) file transferred from MacOS (thus it has an additional __MACOSX directory packed inside). On extracting this zip, there are a few *.xml files available. When I opened one of these *.xml files in the vim editor using Cygwin (on Windows), the editor displayed in the bottom. I tried... (4 Replies)
1. I have a shell script which creates a file using the cat command. How can I find what encoding the file follows (e.g. UTF8, ANSI)?
2. I want to convert that file to PC-ANSI format. How can I achieve that?
I am using HP-Unix. (6 Replies)