03-30-2011
Quote:
I am comparing two multibyte characters on two different platforms with different LC_CTYPE settings, and they return different values.
Yes, that is quite possible. What mechanism are you using to compare them? Are you trying to see if they map to the same glyph or codepoint?
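If the goal is to check whether both values are the same character, one option is to decode each byte sequence with the encoding its platform uses and compare the resulting codepoints instead of the raw bytes. Below is a minimal Perl sketch. It assumes, purely for illustration, that one side uses ISO-8859-1 and the other UTF-8; the variable names are made up and your actual encodings may differ.

```perl
use strict;
use warnings;
use Encode qw(decode);

# The same character, a-ring (U+00E5), as raw bytes in two encodings.
my $latin1_bytes = "\xE5";        # ISO-8859-1: one byte
my $utf8_bytes   = "\xC3\xA5";    # UTF-8: two bytes

# Comparing the raw bytes fails because the encodings differ.
my $bytes_equal = ($latin1_bytes eq $utf8_bytes) ? 1 : 0;

# Decoding each side with its own encoding yields the same codepoint.
my $cp_a = ord(decode('ISO-8859-1', $latin1_bytes));
my $cp_b = ord(decode('UTF-8',      $utf8_bytes));
my $codepoints_equal = ($cp_a == $cp_b) ? 1 : 0;

printf "bytes equal: %d, codepoints equal: %d (U+%04X)\n",
    $bytes_equal, $codepoints_equal, $cp_a;
# prints "bytes equal: 0, codepoints equal: 1 (U+00E5)"
```

Whether two codepoints map to the same glyph is a separate question that depends on the font in use.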
Font::TTF::Cmap(3) User Contributed Perl Documentation Font::TTF::Cmap(3)
NAME
Font::TTF::Cmap - Character map table
DESCRIPTION
Looks after the character map. For ease of use, the actual cmap is held in a hash against codepoint. Thus for a given table:
$gid = $font->{'cmap'}{'Tables'}[0]{'val'}{$code};
Note that $code should be a true value (0x1234) rather than a string representation.
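The difference between a numeric key and its string representation can be seen without a font file. The sketch below uses a hand-built hash as a stand-in for a subtable's val hash; the hash contents and glyph ids are invented for the example.

```perl
use strict;
use warnings;

# Hypothetical stand-in for one subtable's 'val' hash (codepoint => glyph id).
# The glyph ids are invented for this example.
my %val = (0x0041 => 36, 0x1234 => 812);

my $by_number = $val{0x1234};         # numeric key 4660: found
my $by_string = $val{'0x1234'};       # literal text "0x1234": no such key
my $converted = $val{hex('0x1234')};  # hex() converts the text to 4660 first

printf "number: %s, string: %s, converted: %s\n",
    $by_number // 'undef', $by_string // 'undef', $converted // 'undef';
# prints "number: 812, string: undef, converted: 812"
```

If a codepoint arrives as text (for example from a command line), converting it with hex() or oct() before the lookup avoids the silent miss.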
INSTANCE VARIABLES
The instance variables listed here are not preceded by a space due to their emulating structural information in the font.
Num
Number of subtables in this table
Tables
An array of subtables ([0..Num-1])
Each subtable also has its own instance variables which are, again, not preceded by a space.
Platform
The platform number for this subtable
Encoding
The encoding number for this subtable
Format
Gives the stored format of this subtable
Ver
Gives the version (or language) information for this subtable
val
A hash keyed by the codepoint value (not a string) storing the glyph id
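The layout described above can be walked with ordinary hash and array access. The following sketch uses a hand-built structure that only mirrors the documented field names; a real object would come from a font file, and the describe_subtables helper and all numbers are hypothetical.

```perl
use strict;
use warnings;

# Hand-built stand-in that mirrors the documented fields. All values here
# are invented for the example.
my $cmap = {
    Num    => 2,
    Tables => [
        { Platform => 1, Encoding => 0, Format => 0, Ver => 0,
          val => { 0x41 => 36 } },
        { Platform => 3, Encoding => 1, Format => 4, Ver => 0,
          val => { 0x41 => 36, 0xE5 => 105 } },
    ],
};

# Summarise each subtable: platform, encoding, format and number of entries.
sub describe_subtables {
    my ($cmap) = @_;
    my @lines;
    for my $i (0 .. $cmap->{Num} - 1) {
        my $t = $cmap->{Tables}[$i];
        push @lines, sprintf "subtable %d: platform %d, encoding %d, format %d, %d mapping(s)",
            $i, @{$t}{qw(Platform Encoding Format)}, scalar keys %{ $t->{val} };
    }
    return @lines;
}

print "$_\n" for describe_subtables($cmap);
```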
METHODS
$t->read
Reads the cmap into memory. Format 4 subtables read the whole subtable and fill in the segmented array accordingly.
$t->ms_lookup($uni)
Finds a Unicode table, giving preference to the MS one, and looks up the given Unicode codepoint in it to find the glyph id.
$t->find_ms
Finds a Unicode table, giving preference to the Microsoft one, and sets the "mstable" instance variable to it if found. Returns the table it finds.
$t->ms_enc
Returns the encoding of the Microsoft table (0 => symbol, etc.). Returns undef if there is no Microsoft cmap.
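A typical lookup might be sketched as follows. This assumes the Font-TTF distribution is installed and that the font is opened through Font::TTF::Font; the glyph_id_for helper and the script name are hypothetical, and error handling is kept to a minimum.

```perl
use strict;
use warnings;

# Look up the glyph id for one Unicode codepoint in a font file.
# Returns undef if the font cannot be opened, has no cmap table,
# or has no mapping for $code.
sub glyph_id_for {
    my ($path, $code) = @_;
    require Font::TTF::Font;             # loaded lazily; needs Font-TTF installed
    my $font = Font::TTF::Font->open($path) or return;
    my $gid;
    if (my $cmap = $font->{'cmap'}) {
        $cmap->read;                     # load the cmap into memory
        $gid = $cmap->ms_lookup($code);  # $code must be a number, e.g. 0x0041
    }
    $font->release;                      # free the font's memory
    return $gid;
}

# Usage: perl lookup.pl /path/to/font.ttf
if (@ARGV) {
    my $gid = glyph_id_for($ARGV[0], 0x0041);
    print defined $gid ? "U+0041 -> glyph $gid\n" : "U+0041 is not mapped\n";
}
```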
$t->out($fh)
Writes out a cmap table to a filehandle. If the table has not been read, it is simply copied from the input file to the output.
$t->XML_element($context, $depth, $name, $val)
Outputs the elements of the cmap in XML. Only val needs to be processed here.
$t->update
Tidies the cmap table.
Removes MS Fmt12 cmap if it is no longer needed.
Removes from all cmaps any codepoints that map to GID=0. Note that such entries will be re-introduced as necessary depending on the cmap format.
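The GID=0 clean-up can be pictured on a plain hash. This is only an illustration of the idea on a hand-built val-style hash, not the module's own code; the drop_gid_zero helper and the glyph ids are invented.

```perl
use strict;
use warnings;

# Delete every codepoint whose glyph id is 0 from a 'val'-style hash, in place.
sub drop_gid_zero {
    my ($val) = @_;
    for my $code (keys %$val) {
        delete $val->{$code} if $val->{$code} == 0;
    }
    return $val;
}

my %val = (0x41 => 36, 0x42 => 0, 0xE5 => 105);   # invented glyph ids
drop_gid_zero(\%val);
print join(', ', map { sprintf 'U+%04X', $_ } sort { $a <=> $b } keys %val), "\n";
# prints "U+0041, U+00E5"
```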
@map = $t->reverse(%opt)
Returns a reverse map of the Unicode cmap, i.e. given a glyph, it gives the Unicode value for it. Options are:
tnum
Table number to use rather than the default Unicode table
array
Returns each element of reverse as an array since a glyph may be mapped by more than one Unicode value. The arrays are unsorted.
Otherwise stores any one Unicode value for a glyph.
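The two result shapes can be modelled on a plain hash. The sketch below is not the module's implementation: reverse_map is a hypothetical helper working on a hand-built val-style hash with invented glyph ids. Unlike the real method, it visits codepoints in ascending order, so the single-value form keeps the lowest codepoint and the arrays come out sorted.

```perl
use strict;
use warnings;

# Build a list indexed by glyph id from a codepoint => glyph id hash.
# With array => 1 each element is an array ref of all codepoints for the glyph;
# otherwise each element is a single codepoint.
sub reverse_map {
    my ($val, %opt) = @_;
    my @map;
    for my $code (sort { $a <=> $b } keys %$val) {
        my $gid = $val->{$code};
        if ($opt{array}) {
            push @{ $map[$gid] }, $code + 0;
        } elsif (!defined $map[$gid]) {
            $map[$gid] = $code + 0;
        }
    }
    return @map;
}

# Two codepoints (U+00C5 and U+212B) share glyph 37 in this invented table.
my %val    = (0x41 => 36, 0xC5 => 37, 0x212B => 37);
my @single = reverse_map(\%val);
my @multi  = reverse_map(\%val, array => 1);

printf "glyph 37: U+%04X; all: %s\n",
    $single[37], join(' ', map { sprintf 'U+%04X', $_ } @{ $multi[37] });
# prints "glyph 37: U+00C5; all: U+00C5 U+212B"
```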
is_unicode($index)
Returns whether the table at a given index is known to be a Unicode table (as specified in the specifications)
BUGS
o No support for format 2 tables (MBCS)
AUTHOR
Martin Hosken Martin_Hosken@sil.org. See Font::TTF::Font for copyright and licensing.
perl v5.16.3 2012-02-23 Font::TTF::Cmap(3)