02-27-2018
If you don't know which codeset was used to encode a file, there isn't much you can do beyond guessing.
It is easy to guess that a file is plain ASCII if no byte has the high-order bit set and there are no NUL bytes. It is easy to guess that it might be UTF-16 if every other byte is a NUL byte. Guessing that some text is encoded in one of the EBCDIC codesets might not be too hard, but correctly guessing which variant is another matter. Other than that, good luck. The differences between the various ISO 8859-* character sets are usually obvious only if you already know what the text in the file is supposed to say.
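The two easy guesses above, and the reason the 8859-* family is hard, can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a real detector: the function names guess_encoding and decode_candidates are made up for this example, the UTF-16 check assumes mostly Latin text with no byte order mark, and EBCDIC is not attempted at all.

```python
def guess_encoding(data: bytes) -> str:
    """Rough guess based only on the two easy heuristics: 'ascii', 'utf-16', or 'unknown'."""
    has_high_bit = any(b >= 0x80 for b in data)
    has_nul = 0 in data

    # Plain ASCII: no byte with the high-order bit set and no NUL bytes.
    if not has_high_bit and not has_nul:
        return "ascii"

    # Possible UTF-16: every other byte is NUL (assumes mostly Latin text, no BOM).
    if len(data) >= 2 and len(data) % 2 == 0:
        even, odd = data[0::2], data[1::2]
        if all(b == 0 for b in odd) and 0 not in even:
            return "utf-16"  # looks little-endian
        if all(b == 0 for b in even) and 0 not in odd:
            return "utf-16"  # looks big-endian

    # Anything else (EBCDIC variants, 8859-*, ...) cannot be settled this simply.
    return "unknown"


def decode_candidates(data: bytes) -> dict:
    """Decode the same bytes under several ISO 8859 variants to show the ambiguity."""
    # Hypothetical candidate list; every byte value used here is valid in all three.
    candidates = ("iso8859_1", "iso8859_5", "iso8859_15")
    return {name: data.decode(name) for name in candidates}


if __name__ == "__main__":
    print(guess_encoding(b"plain text"))
    print(guess_encoding("hi".encode("utf-16-le")))
    # The single byte 0xA4 is a valid but different character in each variant.
    print(decode_candidates(b"\xa4"))
```

Every byte sequence in decode_candidates decodes without error under all three codesets, which is why only knowledge of the expected text tells you which result is right.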