I know of two menus I can call up with the mouse. One of them is titled "main options", and the other "VT Fonts". I use the second one every now and then to change the font size, e.g. when using a projector, but it doesn't offer options for changing the font itself.
I have, however, achieved a partial solution using
Code:
LESSCHARSET=utf-8 less UTF-8-demo.txt
in the new xterm, but there are still a lot of boxes...
Colleagues,
I tried to manipulate UTF-8 data using the following script.
cat $1 | sed 's/ലായി$/ലായി LAYI/g' | sed 's/ുടെ/ുടെ UTE/g' | sed 's/യില്*/യില്* YIL/g'
But it says "cannot execute binary file". Is there any solution?
Jaganadh.
Linguist (1 Reply)
Hi,
I am trying to get tr to replace multibyte characters with their ASCII equivalents. For example:
"Je vais à l'école" ---> "Je vais a l'ecole"
But my version of tr (5.97) doesn't seem to support multibyte sets.
$ locale charmap; echo "Je vais à l'école" | tr éà ea
UTF-8
Je vais aa l'aacole
I try to... (2 Replies)
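The doubled "aa" in that output is consistent with tr working on bytes: in UTF-8 each accented letter is a two-byte sequence, and each byte gets mapped on its own. A minimal sketch of a character-aware alternative follows, written in Python rather than tr. The function name `to_ascii` is made up for illustration, and the approach (decompose, then drop combining marks) only handles letters that have a Unicode decomposition.

```python
import unicodedata


def to_ascii(text: str) -> str:
    """Strip accents: decompose each character (NFD), then drop combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


if __name__ == "__main__":
    print(to_ascii("Je vais à l'école"))
```

Letters without a decomposition (for example "ø" or "ß") pass through unchanged, so a lookup table would still be needed for those.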
hmmm... I was not sure where to post this! I want to emit non-ASCII Chinese and Cyrillic text. I'm running Windows Server 2003 with Cygwin XFree86.
I know I have one font that can render Chinese and Russian: "Arial Unicode MS".
How can I configure my Cygwin xterm so I can emit Russian and... (1 Reply)
We just installed ICU for UTF-8 compliance on our AIX 5.3 system. While using vi on some files we get the following error:
ex: 0602-169 Incomplete or invalid multibyte character encountered, conversion failed. ... (0 Replies)
My OS (Debian) and gcc use the UTF-8 locale. This code says that the char size is 1 byte but the size of 'a' is really 4 bytes.
int main(void)
{
setlocale(LC_ALL, "en_US.UTF-8");
printf("Char size: %i\nSize of char 'a': %i\nSize of Euro sign '€': %i\nLength of Euro sign: %i\n",... (8 Replies)
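Two different quantities are probably being mixed here. In C a character constant such as 'a' has type int, so its size is sizeof(int), which is 4 on common platforms regardless of the locale. The number of bytes a character occupies once encoded is a separate matter. A small sketch of the latter, in Python rather than C, with the helper name `utf8_len` invented for illustration:

```python
def utf8_len(ch: str) -> int:
    """Number of bytes the given text occupies when encoded as UTF-8."""
    return len(ch.encode("utf-8"))


if __name__ == "__main__":
    for ch in ("a", "€"):
        # len(ch) counts code points; utf8_len(ch) counts encoded bytes.
        print(repr(ch), "code points:", len(ch), "UTF-8 bytes:", utf8_len(ch))
```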
Hello everyone!
I have a problem with printing ru_RU.UTF-8 text from AIX using the lp command.
#locale -a
C
POSIX
RU_RU.UTF-8
RU_RU
en_US.8859-15
en_US.ISO8859-1
en_US
ru_RU.ISO8859-5
ru_RU
#locale
LANG=en_US.UTF-8
LC_COLLATE=RU_RU.UTF-8
LC_CTYPE=RU_RU.UTF-8
LC_MONETARY="en_US" (3 Replies)
I am trying to change the file encoding from ASCII to UTF-8 using the command below:
iconv -f ASCII -t UTF-8 <input_file> > <output_file>
But the output_file is not actually in UTF-8 format. If I use the file command to check the file encoding it still says ASCII.
While converting am not... (5 Replies)
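ASCII is a subset of UTF-8, so a file that contains only ASCII characters is already valid UTF-8, byte for byte. A tool that guesses the encoding from the content has nothing to tell the two apart. A minimal sketch in Python that shows the conversion leaving such data unchanged (the helper name `convert` is hypothetical, not part of iconv):

```python
def convert(data: bytes, src: str = "ascii", dst: str = "utf-8") -> bytes:
    """Decode bytes from the source encoding and re-encode them in the target one."""
    return data.decode(src).encode(dst)


if __name__ == "__main__":
    original = b"plain ASCII text\n"
    converted = convert(original)
    # For pure ASCII input the converted bytes equal the original bytes.
    print(converted == original)
```

The bytes only change once the input holds characters outside ASCII, for example when converting from Latin-1.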
Hi,
I have tried to convert a UTF-8 file to a Windows UTF-16 format file from a Unix machine, as below:
unix2dos < testing.txt | iconv -f UTF-8 -t UTF-16 > out.txt
and I am getting some Chinese characters, as shown below, when I open the converted file on a Windows machine.
LANG=en_US.UTF-8... (3 Replies)
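One possible cause (an assumption, since the rest of the post is cut off) is a byte-order mismatch: UTF-16 text read with the wrong byte order often shows up as CJK-looking characters. A minimal sketch in Python that makes the byte order explicit by writing little-endian UTF-16 with a byte order mark and CRLF line endings; the function name `to_windows_utf16` is invented for illustration:

```python
import codecs


def to_windows_utf16(utf8_bytes: bytes) -> bytes:
    """Convert UTF-8 bytes to UTF-16LE with a BOM and CRLF line endings."""
    text = utf8_bytes.decode("utf-8")
    # Normalise line endings to CRLF without doubling existing ones.
    text = text.replace("\r\n", "\n").replace("\n", "\r\n")
    return codecs.BOM_UTF16_LE + text.encode("utf-16-le")


if __name__ == "__main__":
    sample = "héllo\n".encode("utf-8")
    out = to_windows_utf16(sample)
    # The first two bytes are the little-endian byte order mark.
    print(out[:2] == codecs.BOM_UTF16_LE)
```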
Hi
I have a big file which is in ANSI. I want to convert it to UTF-16. Please help me with this, as I am stuck at this point in Unix. (8 Replies)
Discussion started by: harry00514
LEARN ABOUT DEBIAN
dawg2wordlist
DAWG2WORDLIST(1)
NAME
dawg2wordlist - convert a Tesseract DAWG to a wordlist
SYNOPSIS
dawg2wordlist UNICHARSET DAWG WORDLIST
DESCRIPTION
dawg2wordlist(1) converts a Tesseract Directed Acyclic Word Graph (DAWG) to a list of words using a unicharset as key.
OPTIONS
UNICHARSET The unicharset of the language. This is the unicharset generated by mftraining(1).
DAWG The input DAWG, created by wordlist2dawg(1)
WORDLIST Plain text (output) file in UTF-8, one word per line
SEE ALSO
tesseract(1), mftraining(1), wordlist2dawg(1), unicharset(5), combine_tessdata(1)
http://code.google.com/p/tesseract-ocr/wiki/TrainingTesseract3
COPYING
Copyright (C) 2012 Google, Inc. Licensed under the Apache License, Version 2.0
AUTHOR
The Tesseract OCR engine was written by Ray Smith and his research groups at Hewlett Packard (1985-1995) and Google (2006-present).
02/09/2012 DAWG2WORDLIST(1)