strlen for UTF-8
Post 302413932 by cyler on Sunday 18th of April 2010 12:04:50 PM

My OS (Debian) and gcc both use a UTF-8 locale. The code below reports that the char size is 1 byte, but the size of 'a' is actually 4 bytes.
Code:
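#include <locale.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    setlocale(LC_ALL, "en_US.UTF-8");

    /* sizeof and strlen yield size_t, so print them with %zu rather than %i */
    printf("Char size: %zu\nSize of char 'a': %zu\nSize of Euro sign '€': %zu\nLength of Euro sign: %zu\n", sizeof(char), sizeof('a'), sizeof('€'), strlen("€"));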

    return 0;
}

Quote:
Char size: 1
Size of char 'a': 4
Size of Euro sign '€': 4
Length of Euro sign: 3
How can I make it output the following, which seems more logical to me? I modified CHAR_BIT in limits.h to 32, but it still reports 1 byte. Did I do something wrong?
Quote:
Char size: 4
Size of char 'a': 4
Size of Euro sign '€': 4
Length of Euro sign: 1
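
For what it's worth, the first line of that wished-for output cannot happen: sizeof(char) is 1 by definition in C, and sizeof('a') is 4 here only because a character constant has type int. CHAR_BIT in limits.h describes the compiler, so editing the header does not change how gcc lays out a char. strlen counts bytes, and '€' takes three bytes in UTF-8, hence the 3. Below is a minimal sketch of counting characters instead of bytes. It assumes the string is valid UTF-8, and utf8_strlen is a hypothetical name, not a standard library function.

```c
#include <stddef.h>

/* Hypothetical helper: counts code points in a NUL-terminated UTF-8 string.
   Assumes the input is valid UTF-8; it performs no validation. */
size_t utf8_strlen(const char *s)
{
    size_t count = 0;

    for (; *s != '\0'; s++) {
        /* Continuation bytes have the form 10xxxxxx;
           every other byte starts a new code point. */
        if (((unsigned char)*s & 0xC0) != 0x80)
            count++;
    }
    return count;
}
```

With this, utf8_strlen("€") gives 1 while strlen("€") still gives 3. Note that it counts code points, so a letter followed by a combining accent counts as two.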
 
