04-18-2010
Quote:
Originally Posted by pludi
- even if you do, the changes won't show until you recompile the C library (and maybe the kernel), where the sizes of the datatypes are used.
Ah, that's why. I thought it would only apply to the program being compiled.
Quote:
Originally Posted by jim mcnamara
The size of a UTF-8 (8 bits) char is one byte.
By default, yes, because in ASCII 1 char = 1 byte, but making it 32 bits (4 bytes) would be better suited to UTF-8, where a single character can take up to 4 bytes.
One of the things I wanted to point out: if 'a' is 4 bytes, isn't it overflowing a char, which is 1 byte?
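For what it's worth, both sizes can be checked directly. In C, a character constant such as 'a' has type int, so sizeof('a') equals sizeof(int) (often 4), while sizeof(char) is always 1. Storing 'a' in a char does not overflow, because its value (97 in ASCII) fits in one byte. A UTF-8 character is simply stored as one to four consecutive chars. Below is a minimal sketch, assuming valid UTF-8 input; the helper names char_constant_size and utf8_count are made up for illustration and are not from the thread.

```c
#include <assert.h>
#include <stddef.h>

/* In C (unlike C++), the character constant 'a' has type int,
   so this returns sizeof(int), not sizeof(char). */
size_t char_constant_size(void)
{
    return sizeof('a');
}

/* Hypothetical helper: count UTF-8 characters (code points) in s by
   skipping continuation bytes, which always look like 10xxxxxx.
   Assumes s is a valid, NUL-terminated UTF-8 string. */
size_t utf8_count(const char *s)
{
    size_t n = 0;

    assert(s != NULL);
    for (; *s != '\0'; s++) {
        if (((unsigned char)*s & 0xC0) != 0x80)
            n++;
    }
    return n;
}
```

With this, a string holding only the euro sign (bytes E2 82 AC) occupies three chars but counts as one character, so char itself never needs to grow.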
Last edited by cyler; 04-18-2010 at 02:00 PM.
caca_utf32_to_utf8
libcaca character set conversions(3caca) libcaca libcaca character set conversions(3caca)
NAME
libcaca character set conversions -
Functions
__extern uint32_t caca_utf8_to_utf32 (char const *, size_t *)
Convert a UTF-8 character to UTF-32.
__extern size_t caca_utf32_to_utf8 (char *, uint32_t)
Convert a UTF-32 character to UTF-8.
__extern uint8_t caca_utf32_to_cp437 (uint32_t)
Convert a UTF-32 character to CP437.
__extern uint32_t caca_cp437_to_utf32 (uint8_t)
Convert a CP437 character to UTF-32.
__extern char caca_utf32_to_ascii (uint32_t)
Convert a UTF-32 character to ASCII.
__extern int caca_utf32_is_fullwidth (uint32_t)
Tell whether a UTF-32 character is fullwidth.
Detailed Description
These functions perform conversions between usual character sets.
Function Documentation
__extern uint32_t caca_utf8_to_utf32 (char const *s, size_t *bytes)
Convert a UTF-8 character read from a string and return its value in the UTF-32 character set. If the second argument is not null, the total number of read bytes is written in it.
If a null byte was reached before the expected end of the UTF-8 sequence, this function returns zero and the number of read bytes is set to zero.
This function never fails, but its behaviour with illegal UTF-8 sequences is undefined.
Parameters:
s A string containing the UTF-8 character.
bytes A pointer to a size_t to store the number of bytes in the character, or NULL.
Returns:
The corresponding UTF-32 character, or zero if the character is incomplete.
Referenced by caca_put_str().
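
The contract described above can be illustrated with a small stand-alone decoder. This is a sketch only, not libcaca's implementation: the name sketch_utf8_to_utf32 is hypothetical, and like the documented function it makes no attempt to validate malformed sequences.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in (not libcaca code): decode one UTF-8 character.
   Writes the number of bytes consumed to *bytes when bytes is non-NULL.
   Returns 0 and sets *bytes to 0 if the string ends mid-sequence.
   Malformed input (e.g. a stray continuation byte) is not detected. */
uint32_t sketch_utf8_to_utf32(const char *s, size_t *bytes)
{
    const unsigned char *p = (const unsigned char *)s;
    uint32_t ch;
    size_t len, i;

    assert(s != NULL);

    /* The lead byte determines the sequence length and its payload bits. */
    if (p[0] < 0x80)      { ch = p[0];        len = 1; }
    else if (p[0] < 0xE0) { ch = p[0] & 0x1F; len = 2; }
    else if (p[0] < 0xF0) { ch = p[0] & 0x0F; len = 3; }
    else                  { ch = p[0] & 0x07; len = 4; }

    /* Each continuation byte contributes its low 6 bits. */
    for (i = 1; i < len; i++) {
        if (p[i] == 0) {
            if (bytes != NULL)
                *bytes = 0;
            return 0;
        }
        ch = (ch << 6) | (uint32_t)(p[i] & 0x3F);
    }

    if (bytes != NULL)
        *bytes = len;
    return ch;
}
```

For example, decoding the three bytes E2 82 AC yields U+20AC with a byte count of 3, while a string that stops after E2 82 yields zero with a byte count of zero.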
__extern size_t caca_utf32_to_utf8 (char *buf, uint32_t ch)
Convert a UTF-32 character read from a string and write its value in the UTF-8 character set into the given buffer.
This function never fails, but its behaviour with illegal UTF-32 characters is undefined.
Parameters:
buf A pointer to a character buffer where the UTF-8 sequence will be written.
ch The UTF-32 character.
Returns:
The number of bytes written.
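
The reverse direction can be sketched the same way. Again, this is an illustration under assumptions rather than libcaca's code: the name sketch_utf32_to_utf8 is hypothetical, the caller is assumed to supply a buffer of at least 4 bytes, no terminating NUL is written, and invalid code points (surrogates, values above U+10FFFF) are not rejected.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in (not libcaca code): encode one code point as UTF-8.
   buf must have room for at least 4 bytes; no NUL terminator is added.
   Returns the number of bytes written (1 to 4). */
size_t sketch_utf32_to_utf8(char *buf, uint32_t ch)
{
    unsigned char *p = (unsigned char *)buf;

    assert(buf != NULL);

    if (ch < 0x80) {                 /* 7 bits: plain ASCII */
        p[0] = (unsigned char)ch;
        return 1;
    }
    if (ch < 0x800) {                /* 11 bits: 110xxxxx 10xxxxxx */
        p[0] = (unsigned char)(0xC0 | (ch >> 6));
        p[1] = (unsigned char)(0x80 | (ch & 0x3F));
        return 2;
    }
    if (ch < 0x10000) {              /* 16 bits: 1110xxxx + 2 continuation */
        p[0] = (unsigned char)(0xE0 | (ch >> 12));
        p[1] = (unsigned char)(0x80 | ((ch >> 6) & 0x3F));
        p[2] = (unsigned char)(0x80 | (ch & 0x3F));
        return 3;
    }
    /* 21 bits: 11110xxx + 3 continuation bytes */
    p[0] = (unsigned char)(0xF0 | ((ch >> 18) & 0x07));
    p[1] = (unsigned char)(0x80 | ((ch >> 12) & 0x3F));
    p[2] = (unsigned char)(0x80 | ((ch >> 6) & 0x3F));
    p[3] = (unsigned char)(0x80 | (ch & 0x3F));
    return 4;
}
```

Because the return value is the byte count, a caller can advance its write pointer by that amount and append the next character directly.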
__extern uint8_t caca_utf32_to_cp437 (uint32_t ch)
Convert a UTF-32 character read from a string and return its value in the CP437 character set, or '?' if the character has no equivalent.
This function never fails.
Parameters:
ch The UTF-32 character.
Returns:
The corresponding CP437 character, or '?' if not representable.
__extern uint32_t caca_cp437_to_utf32 (uint8_t ch)
Convert a CP437 character read from a string and return its value in the UTF-32 character set, or zero if the character is a CP437 control character.
This function never fails.
Parameters:
ch The CP437 character.
Returns:
The corresponding UTF-32 character, or zero if not representable.
__extern char caca_utf32_to_ascii (uint32_t ch)
Convert a UTF-32 character into an ASCII character. When no equivalent exists, a graphically close equivalent is sought.
This function never fails, but its behaviour with illegal UTF-32 characters is undefined.
Parameters:
ch The UTF-32 character.
Returns:
The corresponding ASCII character, or a graphically close equivalent if found, or '?' if not representable.
__extern int caca_utf32_is_fullwidth (uint32_t ch)
Check whether the given UTF-32 character should be printed at twice the normal width (fullwidth characters). If the character is unknown or if its status cannot be decided, it is treated as a standard-width character.
This function never fails.
Parameters:
ch The UTF-32 character.
Returns:
1 if the character is fullwidth, 0 otherwise.
Referenced by caca_put_char() and caca_put_str().
Author
Generated automatically by Doxygen for libcaca from the source code.
Version 0.99.beta18 Fri Apr 6 2012 libcaca character set conversions(3caca)