localedef(4) [hpux man page]

localedef(4)						     Kernel Interfaces Manual						      localedef(4)

NAME

       localedef - format and semantics of locale definition file

DESCRIPTION

       This is a description of the syntax and meaning of the locale definition that is provided as input to the localedef command to create a
       locale (see localedef(1M)).

       The following is a list of category tags, keywords, and subsequent expressions that are recognized by localedef.  The order of keywords
       within a category is irrelevant with the exception of the keyword and other exceptions noted under the description.  Note that, as a
       convention, the category tags are composed of uppercase characters, while the keywords are composed of lowercase characters.

   Category Tags and Keywords
       The following keywords do not belong to any category and should appear in the beginning of the locale definition file:

	      comment_char
		     A single character indicating the character to be interpreted as starting a comment line within the locale definition
		     file.  This character should be in the first column of a comment line.  The default comment_char is the pound sign
		     (#).  All lines with a comment_char in the first column are ignored.

	      escape_char
		     A single character indicating the character to be interpreted as an escape character within the script.  The default
		     escape_char is the backslash (\).  The escape_char is used to escape localedef metacharacters to remove their special
		     meaning, and in the decimal, octal, and hexadecimal character constant formats.  It is also used to continue a line
		     onto the next, if escape_char is the last character on the line (before the new-line character).

       The following keywords can be used in any category:

	      A string naming another valid locale available on the system.
		     This causes the category in the locale being created to be a copy of the same category in the named locale.  Since  the  key-
		     word defines the entire category, if used, it must be the only keyword in the category.

       The following six categories are recognized:

       This category defines character classification, case conversion and other
	      character attributes.  The following predefined character classifications are recognized:

		 Character codes classified as uppercase letters. Characters specified
				in the or classifications cannot be specified in this category.

		 Character codes classified as lowercase letters. Same restrictions
				applicable to the category apply to this classification.

		 Character codes classified as numeric. Only ten characters in contiguous
				ascending sequence by numerical value can be specified. Alternative digits cannot be specified here.

		 Character codes classified as white-space. No character specified for
				the or categories can be included in this classification.

		 Character codes classified as punctuation characters.	No character
				included in the or categories can be specified.

		 Character codes classified as control characters. No character included in
				the or can be included here.

		 Character codes classified as blank characters. The <space> and
				<tab> characters are automatically included.

		 Character codes classified as hexadecimal digits. Only the characters
				defined for the class can be specified, followed by one or more sets of six characters, with each set in ascending
				order.

		 Character codes classified as letters. Characters classified as
				or cannot be specified. Characters specified as and classes are automatically included in this class.

		 Character codes classified as printable characters.
				Characters specified for and classes and the <space> character are automatically included. No character  from  the
				category can be specified.

		 Character codes classified as printable characters,
				except the <space> character.  In all other respects this classification is similar to the category.

	      The following two are special classifications, used to designate valid first-of-two and second-of-two bytes of two-byte
	      characters.  Note that these are byte classifications and not character classifications; hence, they cannot be used with the
	      iswctype interface (see wctype(3C)) in the same manner as the other classifications.

		 Valid first bytes of two-byte characters.

		 Valid second bytes of two-byte characters.

	      Character case conversion definitions:

		 Lowercase to uppercase character relationships.

		 Uppercase to lowercase character relationships.

	      Miscellaneous character attribute and classifications:

		 String mapped into the ASCII
				equivalent string ``b!"#$%&'()*+,-./:;<=>?@[]^_`{}~'', where b is a blank (a langinfo(5) item).

		 Defines one or more locale-specific character class names as
				strings  separated  by semicolons.  Each named character class can then be defined subsequently in the definition.
				The first character of a character class name must be a letter and the class name cannot match any of  the  prede-
				fined classifications (for example,

		 String operand indicates text direction (a
				langinfo(5) item). String operand "1" indicates right-to-left text direction.

		 String operand indicates character context analysis. String "1"
				indicates Arabic context analysis is required.

       The    category provides collation sequence definition for relative ordering between collating elements (single and multi-character collat-
	      ing elements) in the locale.  The following keywords belong to this category and should come between the category tag and The  first
	      two keywords can be in any order, but must come before the keyword.  Any number of the first two keywords can be specified.

		 Defines a multi-character collating element,
				symbol, composed of the characters in string.  String is limited to two characters.

		 Makes		symbol	a  collating symbol which can be used to define a place in the collating sequence.  Symbol does not repre-
				sent any actual character.

		 Denotes the start of the collation sequence.
				The directives have an effect on string collation.

				The lines following the keyword and before the keyword contain collating element entries, one per line.

				Operands can optionally appear after the keyword to define rules for string comparison using a multiple-weight
				scheme (if no operands are specified, a single operand is assumed).  The possible operands are:

				Specifies that comparison operations proceed from start of string towards
					  the end of it.

				Specifies that comparison operations proceed from end of string towards
					  the beginning of it.

		 Marks the end of the list of collating element entries.
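
       The effect of the two direction operands on a multiple-weight comparison can be sketched as follows.  This is an illustration only,
       not the HP-UX implementation: the operand spellings "forward" and "backward", the weight table WEIGHTS, and the function name
       sort_key are assumptions introduced here.

```python
# Hypothetical two-level weight table: (first-pass weight, second-pass weight).
WEIGHTS = {
    "c": (1, 0), "o": (2, 0), "ô": (2, 1),
    "t": (3, 0), "e": (4, 0), "é": (4, 1),
}


def sort_key(text, weights, directions):
    """Build one tuple of weights per collation pass.

    directions holds one entry per pass, either "forward" (compare from the
    start of the string) or "backward" (compare from the end of the string).
    """
    key = []
    for level, direction in enumerate(directions):
        level_weights = [weights[ch][level] for ch in text]
        if direction == "backward":
            level_weights.reverse()
        key.append(tuple(level_weights))
    return tuple(key)


words = ["côté", "coté", "côte", "cote"]
forward = sorted(words, key=lambda w: sort_key(w, WEIGHTS, ["forward", "forward"]))
backward = sorted(words, key=lambda w: sort_key(w, WEIGHTS, ["forward", "backward"]))
```

       With the second pass scanned backward, a difference near the end of the string outranks one near the start, which changes the
       relative order of the two middle words in this sample.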

       The    category	defines  the rules and symbols used to format monetary numeric information. The following keywords belong to this category
	      and should come between the category tag and

		 The operand is a four-character string used to designate the international
				currency symbol.  The first three characters should contain the alphabetic international currency symbol in accor-
				dance  with  those specified in the ISO 4217 standard.	The fourth character is the character used to separate the
				international currency symbol from the monetary quantity.

		 The operand is a string used as the local currency symbol.

		 The operand is a string containing the symbol used as the decimal
				delimiter (radix character).

		 The operand is a string containing the symbol used as a separator for
				groups of digits to the left of decimal delimiter.

		 The operand is a semicolon-separated list of integers.
				The initial integer defines the size of the group immediately preceding the decimal delimiter, and  the  following
				integers define the preceding groups.  If the last integer is not -1, then the size of the previous group (if any)
				will be repeatedly used for the remainder of the digits.  If the last integer is -1, then no further grouping will
				be performed.

		 The operand is a string to indicate a non-negative monetary quantity.

		 The operand is a string to indicate a negative monetary quantity.

		 The operand is an integer representing the number of fractional digits
				used in formatted monetary values using

		 The operand is an integer representing the number of fractional digits
				used in formatted monetary values using

		 The operand is an integer which if set to 1 indicates the
				precedes a monetary quantity, and if set to 0 the symbol succeeds the value.

		 The operand is an integer which indicates the separation of the
				the sign string, and the value for a non-negative formatted monetary quantity.

				The value of and are interpreted according to the following:

				  No space separates the currency symbol and value.

				  If the currency symbol and sign string are adjacent, a space separates
					 them from the value; otherwise, a space separates the currency symbol from the value.

				  If the currency symbol and sign string are adjacent, a space separates them;
					 otherwise, a space separates the sign string from the value.

		 The operand is an integer which if set to 1 indicates the
				precedes a negative monetary quantity, and if set to 0 the symbol succeeds the negative value.

		 The operand is an integer which indicates the separation of the
				the sign string, and the value for a negative formatted monetary quantity.

		 The operand is an integer which indicates the positioning of the
				for a positive monetary quantity.  The possible values are:

				  Parentheses surround the quantity and the
					 or

				  The sign string precedes the quantity and the
					 or

				  The sign string succeeds the quantity and the
					 or

				  The sign string precedes the
					 or

				  The sign string succeeds the
					 or

		 The operand is an integer set to a value indicating the positioning of
				the negative_sign for a negative formatted monetary quantity.

		 The operand is an integer which if set to 1 indicates the
				precedes a monetary quantity, and if set to 0 the symbol succeeds the value.

		 The operand is an integer which indicates the separation of the
				the sign string, and the value for a non-negative internationally formatted monetary quantity.

		 The operand is an integer which if set to 1 indicates the
				precedes a negative monetary quantity, and if set to 0 the symbol succeeds the negative value.

		 The operand is an integer which indicates the separation of the
				the sign string, and the value for a negative internationally formatted monetary quantity.

		 The operand is an integer which indicates the positioning of the
				for a positive monetary quantity formatted with the international format.

		 The operand is an integer which indicates the positioning of the
				for a negative monetary quantity formatted with the international format.

       The    category	defines rules and symbols used to format non-monetary numeric information.  The following keywords belong to this category
	      and should come between the category tag and

		 The operand is a string containing the symbol used as the decimal
				delimiter (radix character) in numeric, non-monetary formatted quantities.  This keyword  cannot  be  omitted  and
				cannot be set to the empty string.

		 The operand is a string containing the symbol used as a separator
				for groups of digits to the left of the decimal delimiter.

		 The operand is a semicolon-separated list of integers.
				The  initial  integer defines the size of the group immediately preceding the decimal delimiter, and the following
				integers define the preceding groups.  If the last integer is not -1, then the size of the previous group (if any)
				will  be repeatedly used for the remainder of the digits. If the last integer is -1, then no further grouping will
				be performed.

		 String mapped into the ASCII
				equivalent string "", where b is a blank (a langinfo(5) item).	The keyword is an HP extension to the POSIX  stan-
				dards and it has a different meaning than the defined in POSIX standards.
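
       The grouping rule described for the semicolon-separated integer lists above (used for both monetary and non-monetary quantities) can
       be sketched as follows.  This is a minimal illustration, not the library's formatting code; the function name group_digits and its
       parameters are hypothetical.

```python
def group_digits(digits, grouping, separator):
    """Insert separator into a string of integer digits.

    grouping[0] is the size of the group next to the decimal delimiter and
    later entries size the groups to its left.  The last size repeats unless
    the list ends with -1, in which case grouping stops.
    """
    sizes = list(grouping)
    repeat_last = True
    if sizes and sizes[-1] == -1:
        sizes.pop()
        repeat_last = False

    groups = []
    rest = digits
    index = 0
    while sizes:
        if index < len(sizes):
            size = sizes[index]
        elif repeat_last:
            size = sizes[-1]
        else:
            break
        # Stop when the remaining digits fit in one group (or the size is unusable).
        if size <= 0 or len(rest) <= size:
            break
        groups.insert(0, rest[-size:])
        rest = rest[:-size]
        index += 1
    groups.insert(0, rest)
    return separator.join(groups)
```

       For example, a list of 3 groups by thousands, 3;-1 splits off only the last three digits, and 3;2 groups the last three digits and
       then pairs.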

       The    category	defines  the rules for generating locale-specific formatted date strings.  The following mandatory keywords belong to this
	      category and should come between the category tag and

		 Seven semicolon-separated strings
				giving abbreviated names for the days of the week beginning with Sunday.

		 Seven semicolon-separated strings
				giving full names for the days of the week beginning with Sunday.

		 Twelve semicolon-separated strings giving abbreviated names for the months,
				beginning with January.

		 Twelve semicolon-separated strings giving full names for the months,
				beginning with January.

		 The operand is a string defining the appropriate date and time
				representation.

		 The operand is a string defining the appropriate date
				representation.

		 The operand is a string defining the appropriate time
				representation.

		 The operand is two semicolon-separated strings giving
				the representations for and

		 The operand is a string defining the appropriate time representation
				in the 12-hour clock format with

		 The operand is a semicolon-separated list of strings. Each string
				defines the name and date of an era or emperor for a locale. Each string should conform to the following format:

				direction:offset:start_date:end_date:name:format

				where:

				     direction	 Either a or character.  The character indicates the time axis should be such that the years count
						 in  the positive direction when moving from the starting date towards the ending date.  The char-
						 acter indicates the time axis should be such that the years count in the negative direction  when
						 moving from the starting date towards the ending date.

				     offset	 A number in the range indicating the number of the first year of the era.

				     start_date  A  date  in the form where yyyy, mm, and dd are the year, month and day numbers, respectively, of
						 the start of the era.	Years prior to the year 0 A.D.	are represented as negative numbers.   For
						 example,  an  era  beginning March 5th in the year 100 B.C.  would be represented as Years in the
						 range are supported.

				     end_date	 The ending date of the era in the same form as the start_date above or one  of  the  two  special
						 values  or A value of indicates the ending date of the era extends to the beginning of time while
						 indicates it extends to the end of time.  The ending date can be chronologically either before or
						 after	the starting date of an era.  For example, the expressions for the Christian eras A.D. and
						 B.C. would be:

				     name	 A string representing the name of the era which is substituted for  the  directive  of  and  (see
						 date(1) and strftime(3C)).

				     format	 A  string for formatting the directive of and This string is usually a function of the and direc-
						 tives.  If format is not specified, the string specified for the category keyword (see below)	is
						 used as a default.

		 The operand is a string defining the format of date in era notation.

		 The operand is a string defining the format of time in era notation.

		 The operand is a string defining the format of date and
				time in era notation.

		 The operand is a semicolon-separated list of strings. The first
				string	is the alternative symbol corresponding to zero, the second string is the alternative symbol corresponding
				to one, and so on.  Note that if the HP-UX-proprietary keyword has been specified in the same  locale,	the  first
				ten symbols should be identical for these two keywords.
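
       The six colon-separated fields of an era string (direction:offset:start_date:end_date:name:format, described above) can be split
       apart as sketched below.  The sketch leaves the dates uninterpreted.  The names Era and parse_era are hypothetical, and the sample
       entry (its date notation and the "+*" end value) follows the POSIX convention as an assumption rather than anything stated here.

```python
from typing import NamedTuple


class Era(NamedTuple):
    direction: str
    offset: int
    start_date: str
    end_date: str
    name: str
    fmt: str  # empty when the optional format field is omitted


def parse_era(entry):
    """Split one era string into its fields without validating the dates."""
    # Split at most five times so that colons inside the format field survive.
    parts = entry.split(":", 5)
    if len(parts) == 5:
        parts.append("")  # format not specified; the category default applies
    if len(parts) != 6:
        raise ValueError(f"expected 5 or 6 colon-separated fields: {entry!r}")
    direction, offset, start_date, end_date, name, fmt = parts
    return Era(direction, int(offset), start_date, end_date, name, fmt)


# Hypothetical sample entry, for illustration only.
sample = parse_era("+:1:1989/01/08:+*:Heisei:%EC%Ey")
```

       Because the split is purely positional, an era name that itself contains a colon is not handled by this sketch.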

	      In addition to the above, the following HP-UX-proprietary keywords are recognized (these are provided for backward compatibility and
	      their use is otherwise not recommended):

       The    category defines the format and values for affirmative and negative responses.  The following keywords belong to this  category  and
	      should come between the category tag and

		 The string operand is
				an Extended Regular Expression matching acceptable affirmative responses to yes/no queries.

		 The string operand is
				an Extended Regular Expression matching acceptable negative responses to yes/no queries.

		 The string operand identifies the affirmative response for yes/no questions.
				This keyword is now obsolete and should be used instead.

		 The string operand identifies the negative response for yes/no questions.
				This keyword is now obsolete and should be used instead.

   Keyword Operands
       Keyword operands consist of character-code constants and symbols, strings, and metacharacters.  The types of legal expressions are: and

	      operands consist of single character-code constants or symbolic names
			separated  by semicolons, or a character-code range consisting of a constant or symbolic name followed by an ellipsis fol-
			lowed by another constant or symbolic name.  The constant preceding the ellipsis must have a smaller code value  than  the
			constant  following  the ellipsis.  A range represents a set of consecutive character codes.  If the list is longer than a
			single line, the escape character must be used at the end of each line as a continuation character.  It is an error to use
			any symbolic name that is not defined in an accompanying charmap file (see charmap(4)).

	      operands	consist of strings separated by semicolons.  If longer than one line, the escape character must be used for continuation.

	      operands consist of a sequence of zero or more characters
			surrounded  by	double	quotes	(").  Within a string, the double-quote character must be preceded by an escape character.
			The following escape sequences also can be used:

			newline

			horizontal tab

			backspace

			carriage return

			form feed

			backslash

			single quote

			bit pattern

				The escape consists of the escape character followed by 1, 2, or 3 octal digits specifying the value of the
				desired character (for other possible bit pattern specifications, see below).  Also, an escape character (\) and
				an immediately-following newline are ignored.

			Although the backslash (\) has been used for illustration, another escape character can be substituted by the escape_char
			keyword.
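
			The string escapes listed above can be decoded roughly as sketched below.  This is an illustration, not localedef's
			parser: the letters n, t, b, r, and f are assumed to follow the usual C convention, and the name decode_string is
			hypothetical.

```python
SIMPLE_ESCAPES = {"n": "\n", "t": "\t", "b": "\b", "r": "\r", "f": "\f"}
OCTAL_DIGITS = "01234567"


def decode_string(body, escape_char="\\"):
    """Decode the text found between the double quotes of a string operand."""
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch != escape_char:
            out.append(ch)
            i += 1
            continue
        i += 1
        if i >= len(body):
            raise ValueError("dangling escape character")
        nxt = body[i]
        if nxt == "\n":
            i += 1  # escape character plus newline: both are ignored
        elif nxt in OCTAL_DIGITS:
            j = i
            while j < len(body) and j < i + 3 and body[j] in OCTAL_DIGITS:
                j += 1
            out.append(chr(int(body[i:j], 8)))  # 1 to 3 octal digits
            i = j
        elif nxt in SIMPLE_ESCAPES:
            out.append(SIMPLE_ESCAPES[nxt])
            i += 1
        else:
            out.append(nxt)  # escaped quote, backslash, and similar
            i += 1
    return "".join(out)
```

			Passing a different escape_char argument mirrors a locale definition that redefines the escape character.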

	      Constants represent character codes in the operands.
			They can be used in the following forms:

			decimal constants      An escape character followed by a followed by up to three decimal digits.

			octal constants        An escape character followed by up to three octal digits.

			hexadecimal constants  An escape character followed by a followed by two hexadecimal digits.

			Unicode constants      An escape character followed by a followed by four to eight hexadecimal digits  which  specifies  a
					       Unicode scalar value in a charmap file to be used with the option of the command.

			character constants    A  single  character  (for example, A) having the numerical value of the character in the machine's
					       character set.

			symbolic names	       A string enclosed between < and > is a symbolic name.  localedef input files are recommended to be
					       written entirely in symbolic names, utilizing a user-defined or system-supplied charmap file.  This
					       aids portability of input files between different encoded character sets (see charmap(4)).

					       Symbolic names can be defined within a locale definition file by the and keywords.  These  are  not
					       character  constants.  It is an error if such an internally defined symbolic name collides with one
					       defined in a charmap file.
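
			The decimal, octal, hexadecimal, and plain character constant forms can be converted to a numeric code as sketched
			below.  The introducing letters "d" and "x" are assumed from the POSIX locale source convention, the Unicode form is
			left out, and the name decode_constant is hypothetical.

```python
def decode_constant(token, escape_char="\\"):
    """Return the character code named by one constant token."""
    if not token.startswith(escape_char):
        if len(token) != 1:
            raise ValueError(f"not a character constant: {token!r}")
        return ord(token)  # plain character constant, e.g. A

    body = token[len(escape_char):]
    if body.startswith("d"):
        # decimal: up to three decimal digits
        digits, base, alphabet, limits = body[1:], 10, "0123456789", (1, 3)
    elif body.startswith("x"):
        # hexadecimal: exactly two hexadecimal digits
        digits, base, alphabet, limits = body[1:], 16, "0123456789abcdefABCDEF", (2, 2)
    else:
        # octal: up to three octal digits
        digits, base, alphabet, limits = body, 8, "01234567", (1, 3)

    if not limits[0] <= len(digits) <= limits[1]:
        raise ValueError(f"wrong number of digits: {token!r}")
    if any(ch not in alphabet for ch in digits):
        raise ValueError(f"invalid digit: {token!r}")
    return int(digits, base)
```

			All four forms of the same character yield the same code, which is why symbolic names remain the more portable choice.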

	      operands	consist of one or more decimal digits separated by semicolons.

	      operands follow keywords
			and and must consist of two character-code constants enclosed by left and right parentheses and separated by a comma.
			Each such character pair is separated from the next by a semicolon.  For the first constant represents an uppercase
			character and the second the corresponding lowercase character.  For the first constant represents a lowercase character and
			the second the corresponding uppercase character.
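
			The pair syntax just described (two constants in parentheses, separated by a comma, with pairs separated by
			semicolons) can be parsed as sketched below.  The name parse_shift_pairs is hypothetical, and the sketch does not
			handle constants that are themselves an escaped comma, semicolon, or parenthesis.

```python
import re

# One pair: "(" constant "," constant ")", with optional blanks around each part.
PAIR = re.compile(r"\(\s*([^,()\s]+)\s*,\s*([^,()\s]+)\s*\)")


def parse_shift_pairs(operand):
    """Return the list of (first, second) constants in a shift operand."""
    pairs = []
    for item in operand.split(";"):
        match = PAIR.fullmatch(item.strip())
        if match is None:
            raise ValueError(f"malformed character pair: {item!r}")
        pairs.append((match.group(1), match.group(2)))
    return pairs
```

			Which member of each pair is the uppercase character depends on the keyword the operand follows, so the sketch
			returns the constants in the order written.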

	      The	keyword  is  followed by collating element entries, one per line, in ascending order by collating position.  The collating
			element entries have the form:

			collation_element can be a character, a collating symbol enclosed in angle brackets representing a character or  collating
			element, the special symbol or an ellipsis

			A  character  stands  for  itself;  a  collating  symbol can be a symbolic name for a character that is interpreted by the
			charmap file, a multi-character collating element defined by a keyword, or a collating symbol defined by the

			The special symbol specifies the collating position of any characters not explicitly defined by collating element entries.
			For  example,  if some group of characters is to be omitted from the collation sequence and just collate after all defined
			characters, a collating symbol might be defined before the keyword:

			Then somewhere in the list of collating element entries:

			Notice that there is no second weight.	This means that on a second pass all characters collate by their encoded value.

			An ellipsis is interpreted as a list of characters with an encoded value higher than that of the character on the  preced-
			ing  line  and	lower than that on the following line.	Because it is tied to encoded value of characters, the ellipsis is
			inherently non-portable.  If it is used, a warning is issued and no output generated unless the option was given.

			The weight operands provide information about how the collating element is to be collated on first and subsequent  passes.
			Weight	can be a two-character string, the special symbol or a collating element of any of the forms specified for collat-
			ing_element except If there are no weights, the character is collating strictly by its position in the list.  If there	is
			only one weight given, the character sorts by its relative position in the list on the second collation pass.

			An  equivalence  class	is defined by a series of collating element entries all having the same character or symbol in the
			first weight position.	For example, in many locales all forms of the character 'A' collate equal on the first pass.  This
			is represented in the collating element entries as:

			Two-to-one collating elements are specified by collating-elements defined before the keyword.  For example, the two-to-one
			collating element in Spanish, would be defined before the keyword as

			It would then be used in a collating element entry as

			A one-to-two collating element is defined by having a two-character string in one of the weight positions.   For  example,
			if the character collates equal to the pair "AE", the collating element entry would be:

			A  don't-care character is defined by the special symbol For example, the dash character, may be a don't care on the first
			collation pass.  The collating element entry is:

			Symbols defined by the keyword can be used to indicate that a given character collates higher or lower than some position
			in the sequence.  For example, if all characters with an encoded value less than that of are to collate lower than all
			other characters on the first pass, and in relative order on the second pass, define a collating symbol before the
			keyword:

			The first two collating element entries are then:

			This  also  illustrates the use of the ellipsis to indicate a range.  The first ellipsis is interpreted as "all characters
			in the encoded character set with a value lower than '0'"; the second ellipsis means that  all	characters  in	the  range
			defined by the first collate in relative order.

	      operands conform to
			the Extended Regular Expressions specifications as described in regexp(5).

   Metacharacters
       Metacharacters are characters having a special meaning to localedef in operands.  To escape the special meaning of these characters,
       surround them with single quotes or precede them by an escape character.  localedef metacharacters include:

	      Indicates the beginning of a symbolic name.

	      Indicates the end of a symbolic name.

	      Indicates the beginning of a character shift pair following the
		      and keywords.

	      Indicates the end of a character shift pair.

	      Used to separate the characters of a character shift pair.

	      Used to quote strings.

	      Used as a separator in list operands.

	      escape character
		      Used to escape special meaning from other metacharacters and itself.  It is the backslash (\) by default, but can be
		      redefined by the escape_char keyword.

   Comments
       Comments are lines beginning with a comment character.  The comment character is the pound sign (#) by default, but can be redefined
       by the comment_char keyword.  Comments and blank lines are ignored.
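
       The handling of comment lines, blank lines, and continuation lines can be sketched as a preprocessing step.  This is a simplified
       illustration, not localedef's reader: the name logical_lines is hypothetical, and a line that ends in an escaped escape character
       is not treated specially.

```python
def logical_lines(text, comment_char="#", escape_char="\\"):
    """Drop comment and blank lines and join lines continued with escape_char."""
    result = []
    pending = ""
    for raw in text.splitlines():
        # A comment line has the comment character in the first column.
        if not pending and (raw.startswith(comment_char) or not raw.strip()):
            continue
        if raw.endswith(escape_char):
            pending += raw[:-len(escape_char)]  # continued on the next line
            continue
        result.append(pending + raw)
        pending = ""
    if pending:
        result.append(pending)
    return result
```

       The default arguments correspond to the default comment and escape characters; other values model a file that redefines them.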

   Separators
       Separator characters include blanks and tabs.  Any number of separators can be used to delimit the keywords, metacharacters, constants,
       and strings that comprise a localedef script, except that all characters between < and > are considered to be part of the symbolic
       name even if they are <blank>s.

EXAMPLES

       Please see the files under for examples of locale description files.  These files were used to create the various locales which are  deliv-
       ered with HP-UX.

																      localedef(4)