Converting Unicode file to UTF8 format


 
Thread Tools Search this Thread
Top Forums Shell Programming and Scripting Converting Unicode file to UTF8 format
# 8  
Old 08-27-2009
Quote:
Originally Posted by CRGreathouse
Code:
file <input file>

should do it.
I found the command
enca
available on Linux
In my case, i installed on CentOS 4 the rpm
enca-1.9-4.el4.rf.i386.rpm
and the command gives following output
a) on a file supposed to be in UTF-8
[root@mini figari]# enca -L none ./fr/texte_titre.html
Universal transformation format 8 bits; UTF-8
b) on a file supposed to be in ISO8859-1
[root@mini figari]# enca -L none ./en/texte_accueil.html
7bit ASCII characters


(but ideally, enca should be configured with french language to help format recognition, unfortunately, french seems not to be included Smilie)
my 2 cents
Login or Register to Ask a Question

Previous Thread | Next Thread

10 More Discussions You Might Find Interesting

1. UNIX for Dummies Questions & Answers

help required in converting a file format

My file format: -------------------------------------------------- Complete Consistency Check Valid Area : VALID:VALID Started by : esanwad Started at : Thu Dec 11 16:04:46 2014 CNA version : R21H04_EC08 Check range : AREA VALID/VALID ... (4 Replies)
Discussion started by: Gautam Banerjee
4 Replies

2. Shell Programming and Scripting

Need help in converting the file format

Hi All, I need help in converting the mentioned file format into desired output format using awk. Could anyone help me in this? Below is the input.. Date Account Campaign AdGroup Keyword Conversion Revenue Var1 Var2 Var3 Var4 Var5 10 20 30 ... (8 Replies)
Discussion started by: Ravi S M
8 Replies

3. Shell Programming and Scripting

Help with Converting UTF-8 data to Unicode

How can I get an error when converting 3rd line, since it has invalid characters abcde a®cdée a�cd� Unicode for ® = é = I used "iconv -f UTF-8 -t ISO-8859-15 in.txt > out.txt" (2 Replies)
Discussion started by: arunbs
2 Replies

4. Shell Programming and Scripting

Converting windows format file to unix format using script

Hi, I am having couple of files which i used to copy from windows to Linux, so now in case of text files (CTRL^M) appears at end of line. I know i can convert this windows format file to unix format file by running dos2unix. My requirement here is that i want to do it automatically using a... (5 Replies)
Discussion started by: sarbjit
5 Replies

5. Shell Programming and Scripting

Converting file format

My input file is Pipe delimited with 10 fields, I am trying to create a tab delimited output file with 6 fields from the provided input file. Below is sample data Input file abc||2|PIN|num||||www.123.com|abc@123.com| bcd||2|PIN|num|||||abc@123.com|... (3 Replies)
Discussion started by: pasupuleti81
3 Replies

6. Shell Programming and Scripting

converting config file to csv format

Hello, For 2 days now i've been searching for a solution to this. I am now beginning to doubt this is even possible. It's even harder when you don't know how to search for it. (which keywords generate enough relevancy etc..) I need to parse a config file to generate a CSV file in return. It... (7 Replies)
Discussion started by: zer0dvide
7 Replies

7. UNIX for Dummies Questions & Answers

Convert UTF8 Format file to ANSI format

:confused: Hi i am trying to convert a file which is in UTF8 format to ANSI format i tried to use the function ICONV but it is throwing error Function i used it as $ iconv -f UTF8 -t ANSI filename Error iam getting is NOT Supported UTF8 to ANSI please some help me out on... (9 Replies)
Discussion started by: rajreddy
9 Replies

8. UNIX for Advanced & Expert Users

Convert UTF8 Format file to ANSI format

:) Hi i am trying to convert a file which is in UTF8 format to ANSI format i tried to use the function ICONV but it is throwing error Function i used it as $ iconv -f UTF8 -t ANSI filename Error iam getting is NOT Supported UTF8 to ANSI please some help me out on this.........Let me... (1 Reply)
Discussion started by: rajreddy
1 Replies

9. Shell Programming and Scripting

converting string to unicode

How can I can convert a string in a shell script that looks something like: ]] to unicode equivalent? thanks a lot, webtekie (1 Reply)
Discussion started by: webtekie
1 Replies

10. UNIX for Dummies Questions & Answers

Converting the File Creation Date to a new format

I need to capture a file's creation/modification date and time and convert this to a different format, whilst I can easily get the existing format from a ls -l | awk ' { print $......}' or a cut command I do not know how to convert it to a desired format? I should add that at present the ls -l... (1 Reply)
Discussion started by: barney_clough
1 Replies
Login or Register to Ask a Question
Tcl_UniCharIsAlpha(3)					      Tcl Library Procedures					     Tcl_UniCharIsAlpha(3)

__________________________________________________________________________________________________________________________________________________

NAME
Tcl_UniCharIsAlnum, Tcl_UniCharIsAlpha, Tcl_UniCharIsControl, Tcl_UniCharIsDigit, Tcl_UniCharIsGraph, Tcl_UniCharIsLower, Tcl_UniCharIsPrint, Tcl_UniCharIsPunct, Tcl_UniCharIsSpace, Tcl_UniCharIsUpper, Tcl_UniCharIsWordChar - routines for classification of Tcl_UniChar characters SYNOPSIS
#include <tcl.h> int Tcl_UniCharIsAlnum(ch) int Tcl_UniCharIsAlpha(ch) int Tcl_UniCharIsControl(ch) int Tcl_UniCharIsDigit(ch) int Tcl_UniCharIsGraph(ch) int Tcl_UniCharIsLower(ch) int Tcl_UniCharIsPrint(ch) int Tcl_UniCharIsPunct(ch) int Tcl_UniCharIsSpace(ch) int Tcl_UniCharIsUpper(ch) int Tcl_UniCharIsWordChar(ch) ARGUMENTS
int ch (in) The Tcl_UniChar to be examined. _________________________________________________________________ DESCRIPTION
All of the routines described examine Tcl_UniChars and return a boolean value. A non-zero return value means that the character does belong to the character class associated with the called routine. The rest of this document just describes the character classes associated with the various routines. Note: A Tcl_UniChar is a Unicode character represented as an unsigned, fixed-size quantity. CHARACTER CLASSES
Tcl_UniCharIsAlnum tests if the character is an alphanumeric Unicode character. Tcl_UniCharIsAlpha tests if the character is an alphabetic Unicode character. Tcl_UniCharIsControl tests if the character is a Unicode control character. Tcl_UniCharIsDigit tests if the character is a numeric Unicode character. Tcl_UniCharIsGraph tests if the character is any Unicode print character except space. Tcl_UniCharIsLower tests if the character is a lowercase Unicode character. Tcl_UniCharIsPrint tests if the character is a Unicode print character. Tcl_UniCharIsPunct tests if the character is a Unicode punctuation character. Tcl_UniCharIsSpace tests if the character is a whitespace Unicode character. Tcl_UniCharIsUpper tests if the character is an uppercase Unicode character. Tcl_UniCharIsWordChar tests if the character is alphanumeric or a connector punctuation mark. KEYWORDS
unicode, classification Tcl 8.1 Tcl_UniCharIsAlpha(3)

Featured Tech Videos