i18n_intro(5) osf1 man page

i18n_intro(5)							File Formats Manual						     i18n_intro(5)

NAME
       i18n_intro,  i18n, LANG, LC_ALL, LC_COLLATE, LC_CTYPE, LC_MESSAGES, LC_MONETARY, LC_NUMERIC, LC_TIME - Introduction to internationalization
       (I18N)

DESCRIPTION
       Internationalization refers to the process of developing programs without prior knowledge of the language,  cultural  data,  or	character-
       encoding  schemes  that	the  programs  are  expected to handle. In other words, internationalization refers to the availability and use of
       interfaces that let programs modify their behavior at run time for operation in a specific language environment.  The abbreviation I18N	is
       often used to stand for internationalization as there are 18 characters between the beginning "I" and the ending "N" of that word.

       The I18N interfaces and utilities provided in Tru64 UNIX conform to Issue 4 of X/Open CAE specifications.

       A concept related to internationalization is localization (L10N), which refers to the process of establishing information within a computer
       system for each combination of native language, cultural data, and coded character set (codeset). A locale  is  a  database  that  provides
       information  for  a  unique combination of these three components. However, locales do not solve all of the problems that localization must
       address. Many native languages require additional support in the form of language-specific print filters, fonts, codeset converters,  char-
       acter input methods, and other kinds of specialized software.

       For additional introductory information on topics related to internationalization, refer to the following reference pages:

       l10n_intro(5)      For more information on localization and locales
       iconv_intro(5)     For an introduction to codeset conversion
       i18n_printing(5)   For a summary of printer support for native languages

   Characters, Character Sets, and Codesets
       A character is a member of a set of elements used for the organization, control, or representation of data.

       A character set is a set of alphabetic or other characters used to construct the words and other elementary units of a native  language	or
       computer language.  A character set only specifies the characters that are included in the set.	ASCII, CNS 11643 and DTSCS are examples of
       character sets.

       A coded character set (codeset) is a set of unambiguous rules that supports one or more character sets and establishes a one-to-one
       relationship between each character and its bit representation. In other words, a codeset consists of the code points for characters in
       one or more character sets. For example, DEC Hanyu (dechanyu) is a codeset for Chinese and contains code points for characters in the
       ASCII, CNS 11643-1986 (plane 1 and plane 2), and DTSCS character sets.
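
       As a rough illustration of the relationship between a character and its bit representation, the following sketch prints the byte
       values that encode a string. The function name code_points is hypothetical and not part of any Tru64 UNIX interface; the sketch
       assumes only the standard printf, od, and tr utilities.

```shell
# code_points (hypothetical helper): print the hexadecimal byte values
# that encode the given string, with all whitespace removed.
code_points() {
    printf '%s' "$1" | od -An -tx1 | tr -d ' \t\n'
}

# In ASCII and ASCII-compatible codesets, the letter A is the single byte 0x41.
code=$(code_points A)
echo "$code"
```

       A character from a multibyte codeset would produce more than one byte, and the values would depend on the codeset in use.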

   Language Announcement (Setting Locale)
       Language  announcement is the mechanism by which language, cultural data, and codeset requirements are set either for the system as a whole
       or by individual users. An application can also set these requirements, although it is more common for an internationalized application	to
       use  the  setting in effect for the user who runs the program. Refer to the System Administration manual for information about setting sys-
       temwide defaults for shells. Refer to setlocale(3) and Writing Software for the International Market for information  on  how  applications
       query or set locale requirements at run time.

       Language  announcement  is  performed by setting one or more reserved environment variables to the name of an installed locale. Each locale
       has associated with it collating sequences, character conversion tables, character classification tables, formats for  different  kinds	of
       data,  and  message catalogs. If the same locale meets user requirements in all these categories, set only the LANG environment variable to
       the locale name. A locale name usually has the following format:

       language_territory.codeset[@modifier]
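
       The parts of a name in this format can be separated with standard shell parameter expansion. The following is a minimal sketch,
       not a system utility: the function name split_locale_name and the variables it sets (language, territory, codeset, modifier) are
       hypothetical, and the sketch assumes the name follows the format shown above.

```shell
# split_locale_name (hypothetical helper): break a locale name of the form
# language_territory.codeset[@modifier] into its parts. Results are left in
# the shell variables language, territory, codeset, and modifier.
split_locale_name() {
    name=$1
    modifier=
    codeset=
    territory=
    case $name in
        *@*) modifier=${name#*@}; name=${name%%@*} ;;   # text after the first @
    esac
    case $name in
        *.*) codeset=${name#*.}; name=${name%%.*} ;;    # text after the first dot
    esac
    case $name in
        *_*) territory=${name#*_}; name=${name%%_*} ;;  # text after the first underscore
    esac
    language=$name
}

split_locale_name zh_TW.eucTW@stroke
echo "language=$language territory=$territory codeset=$codeset modifier=$modifier"
```

       If a name carries more than one @modifier suffix, this sketch leaves everything after the first @ in the modifier variable.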

       The following Korn shell example sets LANG to a locale supporting the English language, United States cultural data, and ISO8859-1 codeset:

       $ LANG=en_US.ISO8859-1

       The following C shell example sets LANG to a locale supporting the Traditional Chinese language, Hong Kong cultural data, and the DEC Hanyu
       codeset:

       % setenv LANG zh_HK.dechanyu

       Note that locale name formats can vary from vendor to vendor. Use the locale -a command to display the names of locales installed  on  your
       system.	Refer to the l10n_intro(5) reference page for a list of the locales provided with the Tru64 UNIX product.
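
       Because names vary, a script may want to confirm that a locale is installed before announcing it. The following sketch filters
       the output of locale -a for an exact match. The function name locale_installed is hypothetical, and the locale name is only an
       example that might be spelled differently, or be absent, on a given system.

```shell
# locale_installed (hypothetical helper): succeed only if the exact name
# appears in the list printed by "locale -a".
locale_installed() {
    locale -a 2>/dev/null | grep -Fqx -- "$1"
}

wanted=en_US.ISO8859-1
if locale_installed "$wanted"; then
    LANG=$wanted
    export LANG
else
    echo "locale $wanted is not installed; leaving LANG unchanged" >&2
fi
```

       The match is exact and case-sensitive, so the name must be spelled the way locale -a prints it.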

       An alternative way to set locale requirements for all locale categories is to set the LC_ALL environment variable. The difference between
       the LANG and LC_ALL variables is that LC_ALL is a high-precedence variable that overrides all other locale variables, including LANG. The
       LANG variable, on the other hand, is a low-precedence variable. When used by itself, the LANG variable implicitly sets all locale
       categories to the specified locale just as LC_ALL does. However, the LANG variable can be used together with variables for specific
       locale categories to create a multilocale environment. The category-specific locale variables and what they control follow:

       LC_COLLATE    String collation
       LC_CTYPE      Character classification
       LC_MESSAGES   Translations for messages and valid strings for "yes" and "no" responses
       LC_MONETARY   The currency symbol and the format of monetary values
       LC_NUMERIC    The format of numeric values
       LC_TIME       The format of date and time values

	      A locale can support only one set of date and time formats; however, there can be several sets of date and time formats in use for a
	      particular language and territory. See the l10n_intro(5) reference page for information about creating a site-specific version of  a
	      locale to support date and time formats different from those supported by an installed locale.
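
       The precedence described above (LC_ALL first, then the category-specific variable, then LANG) can be sketched as a shell
       function. This is an illustration of the rule, not how the system resolves locales internally. The function name
       effective_locale is hypothetical, and the final fallback to the C locale when nothing is set is an assumption; the actual
       default is implementation-defined.

```shell
# effective_locale (hypothetical helper): print the locale name that applies
# to one category, following the precedence LC_ALL > LC_<category> > LANG.
# A variable that is set to the empty string is treated as if it were unset.
effective_locale() {
    category=$1
    case $category in
        LC_COLLATE|LC_CTYPE|LC_MESSAGES|LC_MONETARY|LC_NUMERIC|LC_TIME) ;;
        *) echo "unknown category: $category" >&2; return 2 ;;
    esac
    if [ -n "${LC_ALL:-}" ]; then
        printf '%s\n' "$LC_ALL"
        return 0
    fi
    # Look up the variable whose name is stored in $category.
    eval "value=\${$category:-}"
    if [ -n "$value" ]; then
        printf '%s\n' "$value"
    elif [ -n "${LANG:-}" ]; then
        printf '%s\n' "$LANG"
    else
        printf 'C\n'    # assumed default when no locale variable is set
    fi
}
```

       For example, with LANG set to one locale and LC_COLLATE set to another, effective_locale LC_COLLATE prints the LC_COLLATE value
       and effective_locale LC_TIME prints the LANG value. Setting LC_ALL makes every category print the LC_ALL value.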

       Some  locale  names  have  one  or more @modifier suffixes. A locale with the suffix @ucs4 is for use by applications that require internal
       process code to be in UCS-4 format. See Unicode(5) for more information about UCS-4. Other @modifier suffixes indicate locale variants that
       support	alternative  rules  for  collation  in Asian languages. Use locales with these suffixes only when setting LC_COLLATE. For example,
       there are three different sets of collation rules (chuyin, radical, and stroke) that can be used with the  locale  supporting  the  Chinese
       language, Taiwanese cultural data, and the Taiwanese EUC codeset. If Korn shell users want to use this locale, they might make the
       following settings:

       $ LANG=zh_TW.eucTW
       $ LC_COLLATE=zh_TW.eucTW@stroke

       The preceding example implicitly sets all locale category variables to zh_TW.eucTW, except for the LC_COLLATE variable,	which  is  set	to
       zh_TW.eucTW@stroke. The following locale command displays the variable settings after these assignments:

       $ locale
       LANG=zh_TW.eucTW
       LC_COLLATE=zh_TW.eucTW@stroke
       LC_CTYPE="zh_TW.eucTW"
       LC_MONETARY="zh_TW.eucTW"
       LC_NUMERIC="zh_TW.eucTW"
       LC_TIME="zh_TW.eucTW"
       LC_MESSAGES="zh_TW.eucTW"
       LC_ALL=
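
       Locale variables can also be set for a single command by placing the assignment in front of it, which is a convenient way to
       observe the effect of a setting. The following sketch uses only the C locale, so it does not require the zh_TW locales from the
       preceding example. Because LC_ALL overrides the other locale variables, the sort command collates by byte value here regardless
       of the LANG or LC_COLLATE settings in effect.

```shell
# Sort four lines with LC_ALL set only for the sort command itself.
# In the C locale, strings collate by byte value, so the uppercase
# letters sort ahead of the lowercase letters.
sorted=$(printf 'b\nA\na\nB\n' | LC_ALL=C sort)
printf '%s\n' "$sorted"
```

       The assignment applies only to that one invocation of sort; the settings of the calling shell are unchanged.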

SEE ALSO
       Commands: locale(1)

       Functions: setlocale(3)

       Others: i18n_printing(5), iconv_intro(5), l10n_intro(5), Unicode(5)

       Writing Software for the International Market

       System Administration

																     i18n_intro(5)