10 More Discussions You Might Find Interesting
1. UNIX for Dummies Questions & Answers
Hi
I have a big file which is in ansii . I want to convert it to UTF-16 .Please help me on this as I am stuck at this point in unix . (8 Replies)
Discussion started by: harry00514
8 Replies
2. Linux
Hi,
I have tried to convert a UTF-8 file to windows UTF-16 format file as below from unix machine
unix2dos < testing.txt | iconv -f UTF-8 -t UTF-16 > out.txt
and i am getting some chinese characters as below which l opened the converted file on windows machine.
LANG=en_US.UTF-8... (3 Replies)
Discussion started by: phanidhar6039
3 Replies
3. Shell Programming and Scripting
I Am trying to change the file encoding from ASCII to UTF-8 using below command
iconv -f ASCII -t UTF-8 <input_file> > <output_file>
But the output_file is not actually in UTF-8 format. If I use the file command to check the file encoding it still says ASCII.
While converting am not... (5 Replies)
Discussion started by: Sriranga
5 Replies
4. UNIX for Dummies Questions & Answers
I need to use sort, uniq, grep, wc,... and the like to work with lists of words in UTF-8 (the "words" being phonetic transcriptions using the IPA). I have been using Google a lot and I even found at least one previous post on this topic, but it didn't help.
I tried following the instructions... (2 Replies)
Discussion started by: mregine
2 Replies
5. Programming
My OS (Debian) and gcc use the UTF-8 locale. This code says that the char size is 1 byte but the size of 'a' is really 4 bytes.
int main(void)
{
setlocale(LC_ALL, "en_US.UTF-8");
printf("Char size: %i\nSize of char 'a': %i\nSize of Euro sign '€': %i\nLength of Euro sign: %i\n",... (8 Replies)
Discussion started by: cyler
8 Replies
6. AIX
please someone provide me the link for downloading en_us.utf-8 .....i have an issue with locale for which i need this :( (1 Reply)
Discussion started by: shubhendu.pyne
1 Replies
7. UNIX for Dummies Questions & Answers
Hi all, dummy here.... I have major errors on entering the shell. On login I get:
-bash: dircolors: command not found
-bash: tr: command not found
-bash: fgrep: command not found
-bash: grep: command not found
-bash: grep: command not found
-bash: id: command not found
-bash: [: =: unary... (12 Replies)
Discussion started by: wcmmlynn
12 Replies
8. Shell Programming and Scripting
Collegues
I tried to manipulate a UTF 8 data using the following script.
cat $1 | sed 's/ലായി$/ലായി LAYI/g' | sed 's/ുടെ/ുടെ UTE/g' | sed 's/യില്*/യില്* YIL/g'
But it says that cnot exicute binary file. Any solution.
Jaganadh.
Linguist (1 Reply)
Discussion started by: jaganadh
1 Replies
9. AIX
$ errpt | more
IDENTIFIER TIMESTAMP T C RESOURCE_NAME DESCRIPTION
3074FEB7 0802050205 T H fscsi1 ADAPTER ERROR
B8113DD1 0802050205 T H fcs1 LINK ERROR
B8113DD1 0802050205 T H fcs1 LINK ERROR
3074FEB7 0802050205 T H fscsi0 ADAPTER ERROR
B8113DD1 ... (2 Replies)
Discussion started by: mcastill66
2 Replies
10. UNIX for Advanced & Expert Users
$ errpt | more
IDENTIFIER TIMESTAMP T C RESOURCE_NAME DESCRIPTION
3074FEB7 0802050205 T H fscsi1 ADAPTER ERROR
B8113DD1 0802050205 T H fcs1 LINK ERROR
B8113DD1 0802050205 T H fcs1 LINK ERROR
3074FEB7 0802050205 T H fscsi0 ADAPTER ERROR
B8113DD1 ... (0 Replies)
Discussion started by: mcastill66
0 Replies
MBSINIT(3) Linux Programmer's Manual MBSINIT(3)
NAME
mbsinit - test for initial shift state
SYNOPSIS
#include <wchar.h>
int mbsinit(const mbstate_t *ps);
DESCRIPTION
Character conversion between the multibyte representation and the wide character representation uses conversion state, of type mbstate_t.
Conversion of a string uses a finite-state machine; when it is interrupted after the complete conversion of a number of characters, it may
need to save a state for processing the remaining characters. Such a conversion state is needed for the sake of encodings such as ISO-2022
and UTF-7.
The initial state is the state at the beginning of conversion of a string. There are two kinds of state: The one used by multibyte to wide
character conversion functions, such as mbsrtowcs(3), and the one used by wide character to multibyte conversion functions, such as wcsr-
tombs(3), but they both fit in a mbstate_t, and they both have the same representation for an initial state.
For 8-bit encodings, all states are equivalent to the initial state. For multibyte encodings like UTF-8, EUC-*, BIG5 or SJIS, the wide
character to multibyte conversion functions never produce non-initial states, but the multibyte to wide-character conversion functions like
mbrtowc(3) do produce non-initial states when interrupted in the middle of a character.
One possible way to create an mbstate_t in initial state is to set it to zero:
mbstate_t state;
memset(&state,0,sizeof(mbstate_t));
On Linux, the following works as well, but might generate compiler warnings:
mbstate_t state = { 0 };
The function mbsinit() tests whether *ps corresponds to an initial state.
RETURN VALUE
mbsinit() returns nonzero if *ps is an initial state, or if ps is a NULL pointer. Otherwise it returns 0.
CONFORMING TO
C99.
NOTES
The behavior of mbsinit() depends on the LC_CTYPE category of the current locale.
SEE ALSO
mbsrtowcs(3), wcsrtombs(3)
COLOPHON
This page is part of release 3.53 of the Linux man-pages project. A description of the project, and information about reporting bugs, can
be found at http://www.kernel.org/doc/man-pages/.
GNU
2000-11-20 MBSINIT(3)