Conversion from ansii to UTF 16
Post 302939291 by fpmurphy on Tuesday 24th of March 2015 12:05:27 PM
If the file is too large for iconv to handle, you are going to have to split it into two or more smaller files at a suitable boundary, convert the resultant files using iconv, and then join the files together again after the conversion.

Without knowing the structure and contents of your file, it is difficult to tell you how to split it successfully.
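
The split, convert, and join steps described above can be sketched roughly as follows. This is only an illustration, assuming the input is plain line-oriented ASCII text so that a line boundary is a suitable place to split. The function name `convert_in_chunks` and the default chunk size are made up for this example. Each chunk is converted to UTF-16LE and a single byte order mark is written at the start of the output, because converting every chunk to plain UTF-16 would usually put a byte order mark in the middle of the joined file.

```shell
#!/bin/sh
# Hypothetical helper: convert a large ASCII text file to UTF-16 in pieces.
# Usage: convert_in_chunks INPUT OUTPUT [LINES_PER_CHUNK]
convert_in_chunks() {
    in=$1
    out=$2
    lines=${3:-1000000}          # assumed default chunk size, tune as needed
    tmp=$(mktemp -d) || return 1

    # Split on line boundaries so that no line is cut in half.
    split -l "$lines" "$in" "$tmp/chunk." || { rm -rf "$tmp"; return 1; }

    # Write one little-endian byte order mark (FF FE) at the start.
    printf '\377\376' > "$out"

    # Convert each chunk without a BOM and append it in order.
    for f in "$tmp"/chunk.*; do
        [ -e "$f" ] || continue  # nothing to do for an empty input file
        iconv -f ASCII -t UTF-16LE "$f" >> "$out" || { rm -rf "$tmp"; return 1; }
    done

    rm -rf "$tmp"
}
```

For example, `convert_in_chunks big.txt big.utf16.txt 500000` would process the file 500000 lines at a time. If the file is not line oriented, the split point has to be chosen differently, which is why the structure of the file matters.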
 

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

UTF 8 and SED

Colleagues, I tried to manipulate UTF-8 data using the following script. cat $1 | sed 's/ലായി$/ലായി LAYI/g' | sed 's/ുടെ/ുടെ UTE/g' | sed 's/യില്*/യില്* YIL/g' But it says that it cannot execute binary file. Any solution? Jaganadh. Linguist (1 Reply)
Discussion started by: jaganadh
1 Replies

2. Shell Programming and Scripting

replace UTF-8 characters with tr

Hi, I try to get tr to replace multibyte characters by ASCII equivalents. For example "Je vais à l'école" ---> "Je vais a l'ecole" But my version of tr (5.97) doesn't seem to support multibyte sets. $ locale charmap; echo "Je vais à l'école" | tr éà ea UTF-8 Je vais aa l'aacole I try to... (2 Replies)
Discussion started by: ripat
2 Replies

3. AIX

en_us.utf-8

please someone provide me the link for downloading en_us.utf-8 .....i have an issue with locale for which i need this :( (1 Reply)
Discussion started by: shubhendu.pyne
1 Replies

4. UNIX Desktop Questions & Answers

How to configure Xterm for UTF-8?

hmmm... I was not sure where to post this! I want to emit non-ASCII Chinese and Cyrillic text. I'm running Windows Server 2003 with Cygwin XFree86. I know I have one font that can render Chinese and Russian: "Arial Unicode MS". How can I configure my Cygwin xterm so I can emit Russian and... (1 Reply)
Discussion started by: siegfried
1 Replies

5. UNIX for Advanced & Expert Users

UTF-8 to EBCDIC conversion in UNIX

Hi all, At present a file from AS400 system is being FTPed to an AIX system. Now, a similar file needs to be sent from our Unix box (Solaris) Is there any tool available which does the conversion in Unix from UTF-8 to EBCDIC? Any suggestions/ pointers are really appreciated. Thanks,... (4 Replies)
Discussion started by: sridhar_423
4 Replies

6. UNIX for Advanced & Expert Users

vi and UTF-8 errors

We just installed icu for UTF-8 compliance on our AIX 5.3 system. While using vi on some files we get the following error: ex: 0602-169 Incomplete or invalid multibyte character encountere yte character encountered, conversion failed.ex: 0602-169 Incomplete or invalidb ractersultibyte... (0 Replies)
Discussion started by: jlacasci
0 Replies

7. Programming

strlen for UTF-8

My OS (Debian) and gcc use the UTF-8 locale. This code says that the char size is 1 byte but the size of 'a' is really 4 bytes. int main(void) { setlocale(LC_ALL, "en_US.UTF-8"); printf("Char size: %i\nSize of char 'a': %i\nSize of Euro sign '€': %i\nLength of Euro sign: %i\n",... (8 Replies)
Discussion started by: cyler
8 Replies

8. UNIX for Dummies Questions & Answers

UTF-8 in xterm

I need to use sort, uniq, grep, wc,... and the like to work with lists of words in UTF-8 (the "words" being phonetic transcriptions using the IPA). I have been using Google a lot and I even found at least one previous post on this topic, but it didn't help. I tried following the instructions... (2 Replies)
Discussion started by: mregine
2 Replies

9. Shell Programming and Scripting

ASCII to UTF-8 conversion

I am trying to change the file encoding from ASCII to UTF-8 using the command below: iconv -f ASCII -t UTF-8 <input_file> > <output_file> But the output_file is not actually in UTF-8 format. If I use the file command to check the file encoding it still says ASCII. While converting am not... (5 Replies)
Discussion started by: Sriranga
5 Replies

10. Linux

Help to Convert file from UNIX UTF-8 to Windows UTF-16

Hi, I have tried to convert a UTF-8 file to a Windows UTF-16 format file from a Unix machine as below: unix2dos < testing.txt | iconv -f UTF-8 -t UTF-16 > out.txt and I am getting some Chinese characters when I open the converted file on a Windows machine. LANG=en_US.UTF-8... (3 Replies)
Discussion started by: phanidhar6039
3 Replies
Log::Report::Win32Locale(3pm)    User Contributed Perl Documentation    Log::Report::Win32Locale(3pm)

NAME
       Log::Report::Win32Locale - unix/windows locales

INHERITANCE
        Log::Report::Win32Locale
          is an Exporter

DESCRIPTION
       Windows uses different locales to represent languages: codepages. Programs which are written with Log::Report however, will contain ISO encoded language names; this module translates between them.

       The algorithms in this module are based on Win32::Locale and Win32::Codepage.

FUNCTIONS
       charset_encoding
           Returns the encoding name (usable with module Encode) based on the current codepage. For example, "cp1252" for iso-8859-1 (latin-1) or "cp932" for Shift-JIS Japanese. Returns undef if the encoding cannot be identified.

       codepage_to_iso(CODEPAGE)
           Translate windows CODEPAGE into ISO code. The CODEPAGE is numeric or a hex string like '0x0304'.

       iso_locale([CODEPAGE])
           Returns the ISO string for the Microsoft codepage locale. Might return "undef"/false. By default, the actual codepage is used.

       iso_to_codepage(ISO)
           Returns the numeric value of the codepage. The ISO may look like this: "xx_YY". Then, first the "xx_YY" is looked-up. If that does not exist, "xx" is tried.

       ms_codepage_id
           Returns the numeric language ID for the current codepage language. For example, the numeric value of 0x0409 for "en-US", and 0x0411 for "ja". Returns false if the codepage cannot be identified.

       ms_install_codepage_id
           Returns the numeric language ID for the installed codepage language. This is like ms_codepage_id(), but refers to the codepage that was the default when Windows was first installed.

       ms_locale
           Returns the locale setting from the control panel.

SYNOPSIS
         # Only usable on Windows
         print codepage_to_iso(0x0413);           # nl-NL
         print iso_to_codepage('nl_NL');          # 1043
         printf "%x", iso_to_codepage('nl_NL');   # 413

         my $iso = iso_locale(ms_codepage_id());
         my $iso = iso_locale;                    # same

         print charset_encoding;                  # cp1252
         print ms_codepage_id;                    # 1043
         print ms_install_codepage_id;            # 1043
         print ms_locale;                         # Dutch (Netherlands)

SEE ALSO
       This module is part of Log-Report distribution version 0.94, built on August 23, 2011. Website: http://perl.overmeer.net/log-report/

LICENSE
       Copyrights 2007-2011 by Mark Overmeer. For other contributors see ChangeLog.

       This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself. See http://www.perl.com/perl/misc/Artistic.html

perl v5.14.2                          2011-08-23                          Log::Report::Win32Locale(3pm)