u8_textprep_str(3c) opensolaris man page

u8_textprep_str(3C)					   Standard C Library Functions 				       u8_textprep_str(3C)

NAME
       u8_textprep_str - string-based UTF-8 text preparation function

SYNOPSIS
       #include <sys/u8_textprep.h>

       size_t u8_textprep_str(char *inarray, size_t *inlen,
	    char *outarray, size_t *outlen, int flag,
	    size_t unicode_version, int *errnum);

PARAMETERS
       inarray		   A pointer to a byte array containing a sequence of UTF-8 character bytes to be prepared.

       inlen		   As  input  argument,  the number of bytes to be prepared in inarray. As output argument, the number of bytes in inarray
			   still not consumed.

       outarray 	   A pointer to a byte array where prepared UTF-8 character bytes can be saved.

       outlen		   As input argument, the number of available bytes at outarray where prepared character bytes can be  saved.	As  output
			   argument, after the conversion, the number of bytes still available at outarray.

       flag		   The possible preparation options constructed by a bitwise-inclusive-OR of the following values:

			   U8_TEXTPREP_IGNORE_NULL

			       Normally  u8_textprep_str()  stops the preparation if it encounters null byte even if the current inlen is pointing
			       to a value bigger than zero.

			       With this option, null byte does not stop the preparation and  the  preparation	continues  until  inlen  specified
			       amount of inarray bytes are all consumed for preparation or an error happened.

			   U8_TEXTPREP_IGNORE_INVALID

			       Normally u8_textprep_str() stops the preparation if it encounters illegal or incomplete characters with correspond-
			       ing errnum values.

			       When this option is set, u8_textprep_str() does not stop the preparation and instead treats such characters  as	no
			       need to do any preparation.

			   U8_TEXTPREP_TOUPPER

			       Map lowercase characters to uppercase characters if applicable.

			   U8_TEXTPREP_TOLOWER

			       Map uppercase characters to lowercase characters if applicable.

			   U8_TEXTPREP_NFD

			       Apply Unicode Normalization Form D.

			   U8_TEXTPREP_NFC

			       Apply Unicode Normalization Form C.

			   U8_TEXTPREP_NFKD

			       Apply Unicode Normalization Form KD.

			   U8_TEXTPREP_NFKC

			       Apply Unicode Normalization Form KC.

			   Only one case folding option is allowed. Only one Unicode Normalization option is allowed.

			   When a case folding option and a Unicode Normalization option are specified together, UTF-8 text preparation is done by
			   doing case folding first and then Unicode Normalization.

			   If no option is specified, no processing occurs except the simple copying of bytes from input to output.

       unicode_version	   The version of Unicode data that should be used during UTF-8 text preparation. The following values are supported:

			   U8_UNICODE_320

			       Use Unicode 3.2.0 data during comparison.

			   U8_UNICODE_500

			       Use Unicode 5.0.0 data during comparison.

			   U8_UNICODE_LATEST

			       Use the latest Unicode version data available which is Unicode 5.0.0 currently.

       errnum		   The error value when preparation is not completed or fails. The following values are supported:

			   E2BIG     Text preparation stopped due to lack of space in the output array.

			   EBADF     Specified option values are conflicting and cannot be supported.

			   EILSEQ    Text preparation stopped due to an input byte that does not belong to UTF-8.

			   EINVAL    Text preparation stopped due to an incomplete UTF-8 character at the end of the input array.

			   ERANGE    The specified Unicode version value is not a supported version.

DESCRIPTION
       The u8_textprep_str() function prepares the sequence of UTF-8 characters in the array specified by inarray into a sequence of corresponding
       UTF-8  characters prepared in the array specified by outarray. The inarray argument points to a character byte array to the first character
       in the input array and inlen indicates the number of bytes to the end of the array to be converted. The outarray argument points to a char-
       acter  byte  array to the first available byte in the output array and outlen indicates the number of the available bytes to the end of the
       array. Unless flag is U8_TEXTPREP_IGNORE_NULL, u8_textprep_str() normally stops when it encounters a null byte from the input array regard-
       less of the current inlen value.

       If flag is U8_TEXTPREP_IGNORE_INVALID and a sequence of input bytes does not form a valid UTF-8 character, preparation stops after the pre-
       vious successfully prepared character. If flag is U8_TEXTPREP_IGNORE_INVALID and the input array ends with an incomplete  UTF-8	character,
       preparation stops after the previous successfully prepared bytes. If the output array is not large enough to hold the entire prepared text,
       preparation stops just prior to the input bytes that would cause the output array to overflow. The value pointed to by inlen is decremented
       to  reflect  the number of bytes still not prepared in the input array. The value pointed to by outlen is decremented to reflect the number
       of bytes still available in the output array.

RETURN VALUES
       The u8_textprep_str() function updates the values pointed to by inlen and outlen arguments to reflect the extent of the	preparation.  When
       U8_TEXTPREP_IGNORE_INVALID  is  specified,  u8_textprep_str()  returns the number of illegal or incomplete characters found during the text
       preparation. When U8_TEXTPREP_IGNORE_INVALID is not specified and the text preparation is entirely successful, the function returns  0.	If
       the  entire  string  in the input array is prepared, the value pointed to by inlen will be 0. If the text preparation is stopped due to any
       conditions mentioned above, the value pointed to by inlen will be non-zero and errnum is set to indicate the error. If such and	any  other
       error occurs, u8_textprep_str() returns (size_t)-1 and sets errnum to indicate the error.

EXAMPLES
       Example 1 Simple UTF-8 text preparation

	 #include <sys/u8_textprep.h>
	 .
	 .
	 .
	 size_t ret;
	 char ib[MAXPATHLEN];
	 char ob[MAXPATHLEN];
	 size_t il, ol;
	 int err;
	 .
	 .
	 .
	 /*
	  * We got a UTF-8 pathname from somewhere.
	  *
	  * Calculate the length of input string including the terminating
	  * NULL byte and prepare other arguments.
	  */
	 (void) strlcpy(ib, pathname, MAXPATHLEN);
	 il = strlen(ib) + 1;
	 ol = MAXPATHLEN;

	 /*
	  * Do toupper case folding, apply Unicode Normalization Form D,
	  * ignore NULL byte, and ignore any illegal/incomplete characters.
	  */
	 ret = u8_textprep_str(ib, &il, ob, &ol,
	     (U8_TEXTPREP_IGNORE_NULL|U8_TEXTPREP_IGNORE_INVALID|
	     U8_TEXTPREP_TOUPPER|U8_TEXTPREP_NFD), U8_UNICODE_LATEST, &err);
	 if (ret == (size_t)-1) {
	     if (err == E2BIG)
		 return (-1);
	     if (err == EBADF)
		 return (-2);
	     if (err == ERANGE)
		 return (-3);
	     return (-4);
	 }

ATTRIBUTES
       See attributes(5) for descriptions of the following attributes:

       +-----------------------------+-----------------------------+
       |      ATTRIBUTE TYPE	     |	    ATTRIBUTE VALUE	   |
       +-----------------------------+-----------------------------+
       |Interface Stability	     |Committed 		   |
       +-----------------------------+-----------------------------+
       |MT-Level		     |MT-Safe			   |
       +-----------------------------+-----------------------------+

SEE ALSO
       u8_strcmp(3C), u8_validate(3C), attributes(5), u8_strcmp(9F), u8_textprep_str(9F), u8_validate(9F)

       The Unicode Standard (http://www.unicode.org)

NOTES
       After  the  text  preparation, the number of prepared UTF-8 characters and the total number bytes may decrease or increase when you compare
       the numbers with the input buffer.

       Case conversions are performed using Unicode data of the corresponding version. There are no locale-specific case conversions that  can	be
       performed.

SunOS 5.11							    18 Sep 2007 					       u8_textprep_str(3C)
u8_textprep_str(3c) opensolaris man page | unix.com