extra character with iconv encoding Post: 302530936

10 More Discussions You Might Find Interesting

1. UNIX for Advanced & Expert Users

iconv -l and ANSEL character set

I am forced to use the ANSEL character set for some GEDCOM documents but must convert them to a more modern set for another app which doesn't recognize ANSEL. I am unable to locate an ISO code for ANSEL in a search of the web. Would someone please identify the ANSEL character set from the list given...

2. UNIX for Dummies Questions & Answers

character encoding in Fedora6

Hello, After upgrading the OS from Fedora4 to Fedora6, the firefox view>character encoding doesn't work anymore. None of the foreign characters can be displayed, no matter which character encoding is selected. Any suggestions? Thanks, bsky

3. AIX

Vacation message character encoding

I am trying to send a vacation message (.vacation.msg) from my AIX 5.3 server. Message is UTF-8 characters. Some email clients (like apple mail) have no problems displaying the correct text, however, some, like Windows Outlook, display garbage. Is there a way of forcing the client to use proper...

4. Shell Programming and Scripting

sort file adding extra character

HI all i have this script : #!/bin/bash sort /usr/tmp/"REPORT"$1 -o \ /usr/tmp/"SREPORT"$1 -k 1,7 -S 150 end of script now i'm doing this command : ls -lsgt *REPORT* 4 -rw-r--r-- 300 Sep 16 REPORT54784 4 -rw-r--r-- 301 Sep 16 SREPORT54784 as you can see the sorted file...

5. Shell Programming and Scripting

how to delete extra character in a line?

And I want to delete the characters longer than 20 for each line start with #. The other lines should remain the same. I think this can be done by sed. Could anyone help me with this? Thanks! my input file: #ZP_05494889.1_Clostridium_papyrosolvens...

6. Shell Programming and Scripting

Remove extra character

Hi I am using cat <filename> command in one of my datastage job(Command Activity). It is giving actual value but giving extra line. Eg: Displayed Output: 1 and showing extraline(Eg: 1 ) I had checked even wc -c it is giving one character extra. If the file contains 11. wc -c says 3. ...

7. HP-UX

how to find the character encoding of a file in hp_ux

8. Shell Programming and Scripting

Awk while-loop printing extra character

Hi, I'm using a while-loop in an awk script. If it matches a regular expression, it prints a line. Unfortunately, each line that is printed in this loop is followed by an extra character, "1". While-statement extracted from my script: getline temp; while (temp ~ /.* x .*/) print temp...

9. Solaris

connect to ILOM via ssh character encoding

Hello all, I am connecting to ILOM using ssh client (putty) but when RedHat start booting everything look chinese for me... Probably i have to configure the character set, i tried also utf-8 but the issue remain. Any idea? Thanks in advance

10. Shell Programming and Scripting

sed removing extra character from end

Hi, Searching through forum I found "sed 's/*$//'" can be used to remove trailing whitespaces and tabs from file. The command works fine but I see minor issue as below. Can you please suggest if I am doing something wrong here. $ cat a.txt upg_prod_test upg_prod_new $ cat a.txt |sed...

LEARN ABOUT SUNOS

wctype_ja

wctype_ja(3C)						   Standard C Library Functions 					     wctype_ja(3C)

NAME

       wctype_ja - Define a character class for the Japanese locale

SYNOPSIS

       #include <wchar.h>

       wctype_t wctype(const char *charclass);

DESCRIPTION

       wctype() constructs a value of type wctype_t that identifies the wide-character class named by the charclass argument.  The actual
       classification is performed by iswctype(); wctype() returns the value that iswctype() expects as its second argument.

       The following character class names are defined in every locale.

       alnum	      alpha	     blank	    cntrl
       digit	      graph	     lower	    print
       punct	      space	     upper	    xdigit

       In addition to the above, the Japanese locale (ja, ja_JP.eucJP, ja_JP.PCK and ja_JP.UTF-8) defines the following character classes specific
       to the Japanese locale.

       jkanji	      jkata	     jhira	    jdigit
       jparen	      line	     jisx0201r	    jisx0208
       jisx0212       udc	     vdc

       The following character classes are supported in ja and ja_JP.eucJP locales only.

       jalpha		   jspecial	  jgreek	 jrussian
       junit		   jsci 	  jgen		 jpunct

       The following character classes are supported in ja_JP.eucJP and ja_JP.UTF-8 locales only.

       ascii		   paren	       jisx0201
       gaiji		   jhankana	       jspace

       These names can also be used as charclass arguments to wctype().  However, these classes are available only to applications running in
       a Japanese locale.

       upper	       Character class that represents any uppercase letter

		       JIS X 0201 Roman character graphic set
				Alphabet uppercase letters (C/1-D/10)

		       JIS X 0208
				Roman character uppercase letters (3/33-3/58)

				Greek character uppercase letters (6/1-24)

				Russian character uppercase letters (7/1-33)

		       JIS X 0212
				Greek alphabet uppercase letters with diacritical marks (6/65-69, 71, 73, 74, 76)

				Cyrillic alphabet uppercase letters (7/34-46)

				Latin alphabet uppercase letters (9/1, 2, 4, 6, 8, 9, 11, 12, 13, 15, 16)

				Latin alphabet uppercase letters with diacritical marks (10/01-24, 26-87)

       lower	       Character class that represents any lowercase letter

		       JIS X 0201 Roman character graphic set
				Alphabet lowercase letters (E/1-F/10)

		       JIS X 0208
				Roman character lowercase letters (3/65-90)

				Greek character lowercase letters (6/33-56)

				Russian character lowercase letters (7/49-81)

		       JIS X 0212
				Greek alphabet lowercase letters with diacritical marks (6/81-92)

				Cyrillic alphabet lowercase letters (7/82-94)

				Latin alphabet lowercase letters (9/33-48)

				Latin alphabet lowercase letters with diacritical marks (11/1-27, 29-35, 37-87)

       digit	       Class that determines the numbers 0 to 9 for decimal representation.

		       JIS X 0201 Roman character graphic set
				Numbers (B/0-9)

       space	       Class that determines a space.

		       JIS X 0201 Control character set
				Space (A/9-13)

				Space characters

		       JIS X 0208
				Space (1/1)

       punct	       Class that determines symbols and special characters.

		       JIS X 0201 Roman character graphic set
				A/1-15, B/10-C/0, D/11-E/0, F/11-14

       cntrl	       Class that determines control characters.

		       JIS X 0201 Control character set
				All characters

		       Kill characters

		       C1 control characters
				All characters

       blank	       Class that determines field delimiters.

		       JIS X 0201 Control character set
				A/9

				Space characters

		       JIS X 0208
				Space (1/1)

       xdigit	       Class that determines alphanumerics used for hexadecimal representation.

		       JIS X 0201 Roman character graphic set
				Numbers (B/0-9)

				A-F, a-f  (C/1-6, E/1-6)

       alpha	       Class that determines alphabets.

		       upper class and lower class letters

       print	       Class that determines printable characters.

		       JIS X 0201 Roman character graphic set
					       Space characters

		       JIS X 0201 Katakana character graphic set
					       All the characters except in character undefined areas

		       JIS X 0208	       All the characters except in character undefined areas

		       JIS X 0212	       All the characters except in character undefined areas

		       Vendor-defined character area
					       All the characters except in character undefined areas in Class vdc.

		       User-defined character areas
					       All the characters including character undefined areas in Class udc.

       graph	       Class that determines graphic characters.

		       All the characters in Class print except those in Class space.

       jkanji	       Class that determines Kanji (symbol or ideographic characters used for Kanji representation).

		       JIS X 0208
				Character defined areas from Ku 16 to Ku 84.

		       JIS X 0212
				Character defined areas from Ku 16 to Ku 77.

       jkata	       Class that determines Katakana.

		       JIS X 0208
				5/1-86, 1/11, 12, 19, 20

       jhira	       Class that determines Hiragana.

		       JIS X 0208
				4/1-83, 1/11, 12, 21, 22, 26

       jdigit	       Class that determines numbers except in digit.

		       JIS X 0208
				3/16-25

       jparen	       Class that determines characters such as parentheses.

		       JIS X 0208
				1/38-59

       line	       Class that determines ruled line primitives.

		       JIS X 0208
				8/1-32

       jisx0201r       Class that determines characters included in JIS X 0201 Katakana character graphic set.

		       JIS X 0201 Katakana character graphic set
				All the characters from A/1 to D/15.

       jisx0208        Class that determines characters included in JIS X 0208.

		       All the characters including those in JIS X 0208 character undefined areas: From Ku 1 to Ku 84 (the Ku 13 Vendor-defined
		       character area is included).

       jisx0212        Class that determines characters included in JIS X 0212.

		       All the characters including those in JIS X 0212 character undefined areas: From Ku 1 to Ku 84 (Ku 83 and 84 Vendor-defined
		       character areas are also included).  No characters in ja_JP.PCK locale are included in this class.

       udc	       Class that determines user-defined characters.

		       All the characters including those in character undefined areas in the user-defined character area.

		       ja locale
				User-defined characters 0xf5a1-0xfefe

							0x8ff5a1-0x8ffefe

		       ja_JP.PCK locale
				User-defined characters 0xf040-0xf9fc

		       ja_JP.UTF-8 locale
				User-defined characters 0xe000-0xf8ff

       vdc	       Class that determines vendor-defined characters.

		       All the characters including those in character undefined areas in the vendor-defined character area.

		       ja and ja_JP.eucJP locale
				       JIS X 0208 Ku 13: Special symbols

				       JIS X 0212 Ku 83 - 84

				       IBM Extended characters not included in JIS X 0212.

		       ja_JP.PCK locale
				       JIS X 0208 Ku 13: Special symbols

				       NEC-selective IBM Extended characters 0xed40-0xeffc

				       IBM Extended characters: 0xfa40-0xfcfc

		       ja_JP.UTF-8 locale
				       Not defined

       jalpha	       Class that determines alphabet letters.

		       JIS X 0208
				3/33-58, 3/65-90

       jspecial        Class that determines special symbol characters.

		       JIS X 0208
				1/2-94, 2/1-14, 2/26-33, 2/42-48, 2/60-74, 2/82-89, 94

		       JIS X 0212
				2/15-25, 2/34-36, 2/75-81

		       JIS X 0208 Ku 13: Special symbols
				IBM Extended characters

				Special characters defined by NEC-selective IBM Extended characters

       jgreek	       Class that determines Greek characters.

		       JIS X 0208
				6/1-24, 6/33-56

       jrussian        Class that determines Russian characters.

		       JIS X 0208
				7/1-7/33, 7/49-81

       junit	       Class that determines unit symbols.

		       JIS X 0208
				1/75-83, 2/82, 83

		       JIS X 0212
				2/80

       jsci	       Class that determines scientific symbols.

		       JIS X 0208
				1/60-74, 2/26-33, 2/42-48, 2/60-74

       jgen	       Class that determines general symbols.

		       JIS X 0208
				1/84-94, 2/1-14, 2/84-89, 94

		       JIS X 0212
				2/35, 75, 2/79-81

       jpunct	       Class that determines punctuation symbols.

		       JIS X 0208
				1/2-37

		       JIS X 0212
				2/34, 36

       ascii	       Class that determines JIS X 0201 Functional character set, Space characters, Roman character graphic set, and Kill
		       characters.

       paren	       Class that determines characters such as parentheses.

       jisx0201        Class that determines characters included in JIS X 0201.

       gaiji	       Class that determines implementer defined characters. udc and vdc classes are included.

       jhankana        Class that determines characters used for Japanese representation included in JIS X 0201.

       jspace	       Class that determines space characters included in JIS X 0208 and JIS X 0212.

       XX/YY in JIS X 0201 Functional character set, Roman character graphic set, and Katakana character graphic set denotes Column XX and Row YY.
       XX/YY in JIS X 0208 and JIS X 0212 denotes Ku XX and Point YY.

       In case of JIS X 0212 characters, this rule only applies to ja or ja_JP.UTF-8 locale.

EXAMPLES

       The following example shows how to determine if the wide character wc is included in Class udc.

	      iswctype(wc, wctype("udc"))

SEE ALSO

       iswctype(3C), wctype(3C), wctrans_ja(3C), jctype(3x), eucJP(5), PCK(5)

SunOS 5.10							    10 Jan 2003 						     wctype_ja(3C)
