OpenSolaris 2009.06 - man page for charmap (opensolaris section 5)

charmap(5)		       Standards, Environments, and Macros		       charmap(5)

NAME
       charmap - character set description file

DESCRIPTION
       A  character set description file or charmap defines characteristics for a coded character
       set. Other information about the coded character set may also be in the file. Coded  char-
       acter  set character values are defined using symbolic character names followed by charac-
       ter encoding values.

       The character set description file provides:

	   o	  The capability to describe character set attributes (such as collation order or
		  character  classes)  independent  of character set encoding, and using only the
		  characters in the portable character set. This  makes  it  possible  to  create
		  generic  localedef(1)  source  files	for  all codesets that share the portable
		  character set.

	   o	  Standardized symbolic names for all characters in the portable  character  set,
		  making it possible to refer to any such character regardless of encoding.

   Symbolic Names
       Each  symbolic  name   is  included  in	the file and is mapped to a unique encoding value
       (except for those symbolic names that are shown with identical  glyphs).  If  the  control
       characters  commonly  associated  with  the symbolic names in the following table are sup-
       ported by the implementation, the symbolic names and their corresponding  encoding  values
       are included in the file. Some of the encodings associated with the symbolic names in this
       table may be the same as characters in the portable character set table.

       +-----------------------------------------------------------------------+
       |<ACK>	    <DC2>	<ENQ>	    <FS>	 <IS4>	     <SOH>     |
       |<BEL>	    <DC3>	<EOT>	    <GS>	 <LF>	     <STX>     |
       |<BS>	    <DC4>	<ESC>	    <HT>	 <NAK>	     <SUB>     |
       |<CAN>	    <DEL>	<ETB>	    <IS1>	 <RS>	     <SYN>     |
       |<CR>	    <DLE>	<ETX>	    <IS2>	 <SI>	     <US>      |
       |<DC1>	    <EM>	<FF>	    <IS3>	 <SO>	     <VT>      |
       +-----------------------------------------------------------------------+

   Declarations
       The following declarations can precede the character definitions. Each must consist of the
       symbol shown in the following list, starting in column 1, including the surrounding brack-
       ets, followed by one or more blank characters, followed by the value to be assigned to the
       symbol.

       <code_set_name>	  The  name  of  the  coded  character	set  for  which the character set
			  description file is defined.

       <mb_cur_max>	  The maximum number of bytes in a multi-byte character. This defaults to
			  1.

       <mb_cur_min>	  An  unsigned	positive integer value that defines the minimum number of
			  bytes in a character for the encoded character set.

       <escape_char>	  The escape character used to indicate  that  the  characters	following
			  will be interpreted in a special way, as defined later in this section.
			  This defaults to backslash ('\'), which is the character glyph used  in
			  all the following text and examples, unless otherwise noted.

       <comment_char>	  The character that, when placed in column 1 of a charmap line, indicates
			  that the line is to be ignored. The default character is the number sign
			  (#).

   Format
       The character set mapping definitions will be all the lines immediately following an iden-
       tifier line containing the string CHARMAP starting in column 1, and  preceding  a  trailer
       line  containing  the  string END CHARMAP starting in column 1. Empty lines and lines con-
       taining a <comment_char> in the first column will be ignored. Each non-comment line of the
       character set mapping definition (that is, each line between the CHARMAP and END CHARMAP
       lines of the file) must be in either of two forms:

	 "%s %s %s\n",<symbolic-name>,<encoding>,<comments>

       or

	 "%s...%s %s %s\n",<symbolic-name>,<symbolic-name>, <encoding>,\
		  <comments>

       In the first format, the line in the character set mapping  definition  defines	a  single
       symbolic  name  and a corresponding encoding. A character following an escape character is
       interpreted as itself; for example, the sequence "<\\\>>"  represents  the  symbolic  name
       "\>" enclosed between angle brackets.
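       The first format and the escape rule can be sketched in Python as follows. This is only
       an illustration, not part of localedef(1): the function names parse_symbolic_name and
       parse_mapping_line are hypothetical, and the escape character is assumed to be the
       default backslash.

```python
def parse_symbolic_name(text, escape_char="\\"):
    """Return (name, rest) for text that begins with a <symbolic-name>.

    A character following the escape character is taken as itself,
    so an escaped '>' does not end the name.
    """
    if not text.startswith("<"):
        raise ValueError("symbolic name must start with '<'")
    chars = []
    i = 1
    while i < len(text):
        c = text[i]
        if c == escape_char:
            if i + 1 >= len(text):
                raise ValueError("escape character at end of text")
            chars.append(text[i + 1])  # escaped character stands for itself
            i += 2
        elif c == ">":
            return "".join(chars), text[i + 1:]
        else:
            chars.append(c)
            i += 1
    raise ValueError("unterminated symbolic name")


def parse_mapping_line(line, escape_char="\\"):
    """Split a first-format line into (name, encoding, comment).

    The encoding is returned as raw text; the comment may be empty.
    """
    name, rest = parse_symbolic_name(line.rstrip("\n"), escape_char)
    fields = rest.split(None, 1)
    if not fields:
        raise ValueError("missing encoding")
    comment = fields[1] if len(fields) > 1 else ""
    return name, fields[0], comment
```

       For instance, parse_symbolic_name(r"<\\\>>") yields the name \> with nothing left over.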

       In  the second format, the line in the character set mapping definition defines a range of
       one or more symbolic names. In this form, the symbolic names must consist of zero or  more
       non-numeric  characters,  followed by an integer formed by one or more decimal digits. The
       characters preceding the integer must be identical in the  two  symbolic  names,  and  the
       integer	formed by the digits in the second symbolic name must be equal to or greater than
       the integer formed by the digits in the first name. This is interpreted	as  a  series  of
       symbolic  names formed from the common part and each of the integers between the first and
       the second integer, inclusive. As an example, <j0101>...<j0104> is interpreted as the sym-
       bolic names <j0101>, <j0102>, <j0103>, and <j0104>, in that order.
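       The expansion rule above can be sketched in Python. The function name
       expand_symbolic_range is hypothetical, escape characters inside names are not handled,
       and the sketch assumes that generated names keep the digit count of the first name
       (as in the <j0101>...<j0104> example).

```python
import re

# Zero or more non-digit characters followed by one or more decimal digits.
_RANGE_NAME = re.compile(r"<(\D*)(\d+)>")


def expand_symbolic_range(first, last):
    """Expand a pair such as '<j0101>', '<j0104>' into the full list of names."""
    m1 = _RANGE_NAME.fullmatch(first)
    m2 = _RANGE_NAME.fullmatch(last)
    if m1 is None or m2 is None:
        raise ValueError("names must be non-numeric characters followed by digits")
    if m1.group(1) != m2.group(1):
        raise ValueError("characters preceding the integer must be identical")
    start, end = int(m1.group(2)), int(m2.group(2))
    if end < start:
        raise ValueError("second integer must be equal to or greater than the first")
    width = len(m1.group(2))  # assumption: keep the first name's zero padding
    prefix = m1.group(1)
    return [f"<{prefix}{n:0{width}d}>" for n in range(start, end + 1)]
```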

       A  character set mapping definition line must exist for all symbolic names and must define
       the coded character value that corresponds to the character glyph indicated in the  table,
       or the coded character value that corresponds with the control character symbolic name. If
       the control characters commonly associated with the symbolic names  are supported  by  the
       implementation, the symbolic name and the corresponding encoding value must be included in
       the file. Additional unique symbolic names may be included. A coded character value can be
       represented by more than one symbolic name.

       The  encoding part is expressed as one (for single-byte character values) or more concate-
       nated decimal, octal or hexadecimal constants in the following formats:

	 "%cd%d",<escape_char>,<decimal byte value>

	 "%cx%x",<escape_char>,<hexadecimal byte value>

	 "%c%o",<escape_char>,<octal byte value>

   Decimal Constants
       Decimal constants must be represented by two or three  decimal  digits,	preceded  by  the
       escape character and the lower-case letter d; for example, \d05, \d97, or \d143. Hexadeci-
       mal constants must be represented by two hexadecimal digits, preceded by the escape  char-
       acter  and the lower-case letter x; for example, \x05, \x61, or \x8f. Octal constants must
       be represented by two or three octal digits, preceded by the escape character;  for  exam-
       ple, \05, \141, or \217. In a portable charmap file, each constant must represent an 8-bit
       byte. Implementations supporting other byte sizes may allow constants to represent  values
       larger than those that can be represented in 8-bit bytes, and may allow additional digits
       in constants. When constants are concatenated for multi-byte character values,  they  must
       be  of the same type, and interpreted in byte order from first to last with the least sig-
       nificant byte of the multi-byte character specified by the last constant.
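       As an illustration, the constant formats above could be decoded with a Python sketch
       like the following. The name parse_encoding is hypothetical, the escape character is
       assumed to be the default backslash, and only the portable case (each constant is one
       8-bit byte) is handled.

```python
import re

# One constant: \dNN[N] (decimal), \xHH (hexadecimal), or \OO[O] (octal).
_CONSTANT = re.compile(r"\\(?:d(\d{2,3})|x([0-9A-Fa-f]{2})|([0-7]{2,3}))")


def parse_encoding(text):
    """Convert concatenated constants such as '\\d129\\d254' to bytes."""
    values = []
    kinds = set()
    pos = 0
    while pos < len(text):
        m = _CONSTANT.match(text, pos)
        if m is None:
            raise ValueError(f"invalid constant at offset {pos}")
        dec, hexa, octal = m.groups()
        if dec is not None:
            kinds.add("decimal")
            value = int(dec, 10)
        elif hexa is not None:
            kinds.add("hexadecimal")
            value = int(hexa, 16)
        else:
            kinds.add("octal")
            value = int(octal, 8)
        if value > 255:
            raise ValueError("constant does not fit in an 8-bit byte")
        values.append(value)
        pos = m.end()
    if not values:
        raise ValueError("empty encoding")
    if len(kinds) > 1:
        raise ValueError("concatenated constants must be of the same type")
    # The first constant is the most significant byte, the last the least.
    return bytes(values)
```

       With this sketch, \d97, \x61, and \141 all decode to the same single byte.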

   Ranges of Symbolic Names
       In lines defining ranges of symbolic names, the encoded value is the value for  the  first
       symbolic name in the range (the symbolic name preceding the ellipsis). Subsequent symbolic
       names defined by the range will have  encoding  values  in  increasing  order.  Bytes  are
       treated as unsigned octets and carry is propagated between the bytes as necessary to
       represent the range. However, carry can produce a null byte in the second or subsequent
       byte of a character, so a range that requires it should not be specified. For example,
       the line

	 <j0101>...<j0104>     \d129\d254

       is interpreted as:

	 <j0101>		\d129\d254
	 <j0102>		\d129\d255
	 <j0103>		\d130\d00
	 <j0104>		\d130\d01

       The expanded declaration of the symbol <j0103> in the above example is an invalid specifi-
       cation, because it contains a null byte in the second byte of a character.
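       The carry behavior in this example can be reproduced with a short Python sketch. The
       function names range_encodings and has_invalid_null are hypothetical; the sketch treats
       the encoding as a big-endian unsigned integer, which matches the rule that the last
       constant is the least significant byte.

```python
def range_encodings(first_encoding, count):
    """Return the encodings for `count` consecutive names in a range.

    Each successive name gets the next value, with carry propagated
    from the last byte toward the first.
    """
    width = len(first_encoding)
    start = int.from_bytes(first_encoding, "big")
    # to_bytes raises OverflowError if carry runs past the first byte.
    return [(start + offset).to_bytes(width, "big") for offset in range(count)]


def has_invalid_null(encoded):
    """True if the second or a subsequent byte of the character is null."""
    return 0 in encoded[1:]


encodings = range_encodings(bytes([129, 254]), 4)
invalid = [e for e in encodings if has_invalid_null(e)]
```

       Here invalid contains only the encoding for <j0103>, bytes 130 and 0.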

       The comment field in a character set mapping definition line is optional.

   Width Specification
       The following declarations can follow the character set	mapping  definitions  (after  the
       "END CHARMAP" statement). Each consists of the keyword shown in the following list,
       starting in column 1, followed by the value(s) to be associated with the keyword, as
       defined below.

       WIDTH		A  non-negative integer value defining the column width for the printable
			character in the coded character set mapping definitions. Coded character
			set  character values are defined using symbolic character names followed
			by column width values. Defining a character with  more  than  one  WIDTH
			produces  undefined  results.  The END WIDTH keyword is used to terminate
			the WIDTH definitions. Specifying the width of a non-printable	character
			in a WIDTH declaration produces undefined results.

       WIDTH_DEFAULT	A  non-negative  integer  value defining the default column width for any
			printable character not listed by  one	of  the  WIDTH	keywords.  If  no
			WIDTH_DEFAULT  keyword	is included in the charmap, the default character
			width is 1.

       Example:

       After the "END CHARMAP" statement, a syntax for a width definition would be:

	 WIDTH
	 <A>		 1
	 <B>		 1
	 <C>...<Z>	 1
	 ...
	 <foo1>...<foon> 2
	 ...
	 END WIDTH

       In this example, the numerical code point values represented by the symbols  <A>  and  <B>
       are assigned a width of 1. The code point values <C> to <Z> inclusive, that is, <C>, <D>,
       <E>, and so on, are also assigned a width of 1. Using <A>...<Z> would have required
       fewer  lines,  but  the	alternative  was  shown  to  demonstrate flexibility. The keyword
       WIDTH_DEFAULT could have been added as appropriate.
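       How WIDTH entries and WIDTH_DEFAULT combine can be sketched in Python. The name
       build_width_lookup is hypothetical, and the input is assumed to be already parsed into
       (symbolic name, width) pairs with ranges expanded. Because defining a character with
       more than one width produces undefined results, this sketch chooses to reject
       duplicates.

```python
def build_width_lookup(entries, width_default=1):
    """Return a function mapping a symbolic name to its column width.

    `entries` is an iterable of (name, width) pairs from a WIDTH section.
    Names not listed fall back to `width_default` (1 if no WIDTH_DEFAULT).
    """
    widths = {}
    for name, width in entries:
        if width < 0:
            raise ValueError(f"width for {name} must be non-negative")
        if name in widths:
            # Undefined by the specification; rejected here as a design choice.
            raise ValueError(f"{name} has more than one WIDTH")
        widths[name] = width

    def lookup(name):
        return widths.get(name, width_default)

    return lookup


lookup = build_width_lookup([("<A>", 1), ("<B>", 1), ("<foo1>", 2)])
```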

SEE ALSO
       locale(1), localedef(1), nl_langinfo(3C), extensions(5), locale(5)

SunOS 5.11				    1 Dec 2003				       charmap(5)

