iconv_open(3c) [hpux man page]

iconv(3C)																 iconv(3C)

NAME
iconv(), iconv_open(), iconv_close() - codeset conversion routines

SYNOPSIS
DESCRIPTION
The iconv_open() routine uses two configuration files, in descending order of precedence: a system file and a user-modifiable file. The system file is searched first. It cannot be modified and contains codeset names supported by the operating system. The second file, on the other hand, is user-modifiable and can be used by the system administrator or third-party applications to add custom iconv converters.

The configuration files are divided into two sections. The first section defines aliases for the canonical codeset names used in the second section, and is ended by a keyword. The second section lists the conversions (codeset names) that iconv_open() supports. The first two columns correspond to the fromcode and tocode names. These names, or their corresponding aliases, may be passed as parameters to iconv_open(). The remaining three columns correspond to the name of the translation table, the iconv method, and the corresponding function name and method library (for multi-byte codesets). A placeholder symbol is used where a column is not applicable.

Method library suffixes in these configuration files, if present, are now redundant. They are ignored by iconv_open(), and the right architecture-specific method library extension is appended automatically.

iconv_open() returns a conversion descriptor that describes a conversion from the codeset specified by the string pointed to by the fromcode argument to the codeset specified by the tocode argument. A conversion descriptor remains valid in a process until that process closes it. The fromcode and tocode arguments must have a corresponding entry in at least one of the iconv configuration files. iconv_open() searches for the codeset names first in the system file and then in the user file to check whether the requested conversion is supported. If it is, iconv_open() determines which table and/or method to use for the conversion.
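A minimal sketch of checking whether a conversion is available, assuming only the standard <iconv.h> interface. The helper name conversion_supported() is hypothetical, and the codeset names passed to it are placeholders: which names are accepted depends on the entries in the iconv configuration files of the system in question.

```c
#include <assert.h>
#include <iconv.h>
#include <stddef.h>

/* Hypothetical helper (not part of the iconv interface): returns 1 if
 * iconv_open() accepts the pair of codeset names, 0 otherwise. */
int conversion_supported(const char *tocode, const char *fromcode)
{
    iconv_t cd;

    assert(tocode != NULL && fromcode != NULL);

    /* Note the argument order: the target codeset comes first. */
    cd = iconv_open(tocode, fromcode);
    if (cd == (iconv_t)-1)
        return 0;               /* no usable entry was found */

    (void)iconv_close(cd);      /* release the descriptor again */
    return 1;
}
```

A caller might use conversion_supported(tocode, fromcode) to probe for a preferred codeset name and fall back to an alias before starting the actual conversion.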
iconv() converts a sequence of characters from one codeset, contained in the array specified by inbuf, into a sequence of corresponding characters in another codeset, contained in the array specified by outbuf. The codesets are those specified in the iconv_open() call that returned the conversion descriptor cd. The inbuf argument points to a variable that points to the first character in the input buffer, and inbytesleft indicates the number of remaining bytes in the buffer being converted. The outbuf argument points to a variable that points to the first available byte in the output buffer, and outbytesleft indicates the number of available remaining bytes in the buffer.

If a sequence of input bytes does not form a valid character in the specified codeset, conversion stops after the previous successfully converted character. If the input buffer ends with an incomplete character or shift sequence (see Special Usage), conversion stops after the previous successfully converted character. If the output buffer is not large enough to hold the entire converted output, conversion stops just prior to the character that would cause the output buffer to overflow.

The variable pointed to by inbuf is updated to point to the byte following the last byte successfully used in the conversion. The value pointed to by inbytesleft is reduced to reflect the number of bytes still not converted in the input buffer. The variable pointed to by outbuf is updated to point to the byte following the last byte of converted output data. The value pointed to by outbytesleft is reduced to reflect the number of bytes still available in the output buffer.

If iconv() encounters a character in the input buffer that is legal but for which an identical character does not exist in the target codeset, iconv() maps this character to a pre-defined character, called the "galley character", that is defined at the time of table generation (see genxlt(1)).
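The pointer and count updates described above can be sketched for a single in-memory buffer as follows. This is an illustration under stated assumptions, not part of the interface: the helper name convert_buffer() and its consumed/written output parameters are hypothetical, and the descriptor is assumed to have been opened by the caller with iconv_open().

```c
#include <assert.h>
#include <errno.h>
#include <iconv.h>
#include <stddef.h>

/* Hypothetical helper: converts srclen bytes from src into dst using an
 * already opened descriptor. Returns 0 on success, or -1 with errno set by
 * iconv(). In both cases *consumed and *written report how far the
 * conversion got, derived from the updated byte counts. */
int convert_buffer(iconv_t cd, const char *src, size_t srclen,
                   char *dst, size_t dstsize,
                   size_t *consumed, size_t *written)
{
    char *in = (char *)src;     /* iconv() takes char **; input is only read */
    char *out = dst;
    size_t inleft = srclen;
    size_t outleft = dstsize;
    size_t rc;

    assert(src != NULL && dst != NULL);
    assert(consumed != NULL && written != NULL);

    rc = iconv(cd, &in, &inleft, &out, &outleft);

    *consumed = srclen - inleft;    /* bytes successfully used from src */
    *written = dstsize - outleft;   /* bytes of converted output in dst */

    return rc == (size_t)-1 ? -1 : 0;
}
```

When the call fails with E2BIG, the caller can drain dst and call the helper again starting at src + *consumed.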
iconv_close() deallocates the conversion descriptor cd and all other associated resources allocated by iconv_open().

APPLICATION USAGE
Portable applications must assume that conversion descriptors are not valid after calls to any of the exec functions.

Special Usage

In state-dependent encodings, characters are interpreted depending on the "state" of the input. State shifts occur when a specific sequence of bytes is seen in the input. These sequences change the way subsequent characters are interpreted (for example, the characters may initially be single-byte characters and, after a state shift, subsequent characters may be interpreted as two-byte characters).

For state-dependent encodings, the conversion descriptor returned by iconv_open() is in a codeset-dependent initial shift state, ready for immediate use with iconv().

For state-dependent encodings, the conversion descriptor cd is placed into its initial shift state by a call to iconv() for which inbuf is a null pointer, or for which inbuf points to a null pointer. When iconv() is called in this way, and outbuf is not a null pointer or a pointer to a null pointer, and outbytesleft points to a positive value, iconv() places into the output buffer the byte sequence that changes it to its initial shift state. If the output buffer is not large enough to hold the entire reset sequence, iconv() fails and sets errno to E2BIG. Subsequent calls with inbuf set to other than a null pointer or a pointer to a null pointer cause the conversion to take place from the current state of the conversion descriptor.

For state-dependent encodings, the conversion descriptor is updated to reflect the shift state in effect at the end of the last successfully converted byte sequence.

RETURN VALUE
Upon successful completion, iconv_open() returns a conversion descriptor for use on subsequent calls to iconv(). Otherwise iconv_open() returns (iconv_t)-1 and sets errno to indicate the error.

iconv() updates the variables pointed to by the arguments to reflect the extent of conversion, and returns the number of non-identical conversions performed. If the entire string in the input buffer is converted, the value pointed to by inbytesleft is zero. If an error occurs, iconv() returns (size_t)-1 and sets errno to indicate the error.

Upon successful completion, iconv_close() returns a value of zero. Otherwise it returns -1 and sets errno to indicate the error.

ERRORS
iconv_open() fails if any of the following conditions are encountered:

[ENOMEM]  Insufficient storage space is available.
[EINVAL]  The conversion specified by fromcode and tocode is not supported, or the table or method specified in the configuration file could not be read or loaded correctly. This error also occurs if the configuration file itself is faulty.

iconv() fails if any of the following conditions are encountered:

[EILSEQ]  Input conversion stopped due to an input character that does not belong to the input codeset, or because the conversion table does not contain an entry corresponding to this input character and a galley character was not defined for that particular table.
[E2BIG]   Input conversion stopped due to lack of space in the output buffer.
[EINVAL]  Input conversion stopped due to an incomplete character or shift sequence at the end of the input buffer.
[EBADF]   The cd argument is not a valid open conversion descriptor.

iconv_close() fails if any of the following conditions are encountered:

[EBADF]   The conversion descriptor is invalid.

EXAMPLES
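A minimal sketch of the reset call described under Special Usage, which returns a conversion descriptor to its initial shift state and emits any reset sequence into an output buffer. The helper name reset_shift_state() is hypothetical. For codesets that are not state-dependent the call is expected to write nothing.

```c
#include <assert.h>
#include <iconv.h>
#include <stddef.h>

/* Hypothetical helper: resets cd to its initial shift state and stores the
 * reset byte sequence, if any, in dst. Returns the number of bytes written,
 * or (size_t)-1 with errno set by iconv() (E2BIG if dst is too small to
 * hold the entire reset sequence). */
size_t reset_shift_state(iconv_t cd, char *dst, size_t dstsize)
{
    char *out = dst;
    size_t inleft = 0;          /* no input accompanies a reset request */
    size_t outleft = dstsize;

    assert(dst != NULL);

    /* A null inbuf asks iconv() to reset instead of converting input. */
    if (iconv(cd, NULL, &inleft, &out, &outleft) == (size_t)-1)
        return (size_t)-1;

    return dstsize - outleft;
}
```

A typical place for such a call is after the last input buffer has been converted, so that the output ends in its initial shift state.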
The following example shows how the iconv() interfaces may be used for conversions.

    #include <errno.h>
    #include <iconv.h>
    #include <stdio.h>      /* BUFSIZ, perror() */
    #include <string.h>     /* memmove() */
    #include <unistd.h>     /* read(), write() */

    /* FATAL, BAD_OPEN, BAD_CONVERSION, BAD_CLOSE, BAD, GOOD and error()
     * are defined by the application. */

    int convert(const char *tocode, const char *fromcode, int Input);

    int main(void)
    {
        ...
        convert("roman8", "iso88591", fd);
        ...
    }

    int convert(const char *tocode,    /* tocode name */
                const char *fromcode,  /* fromcode name */
                int Input)             /* input file descriptor */
    {
        extern void error();        /* local error message */
        iconv_t cd;                 /* conversion descriptor */
        int bytesread;              /* num bytes read into input buffer */
        char inbuf[BUFSIZ];         /* input buffer */
        char *inchar;               /* ptr to input character */
        size_t inbytesleft;         /* num bytes left in input buffer */
        char outbuf[BUFSIZ];        /* output buffer */
        char *outchar;              /* ptr to output character */
        size_t outbytesleft;        /* num bytes left in output buffer */
        size_t ret_val;             /* number of conversions */
        int conv_errno;             /* errno as set by iconv() */

        /* Initiate conversion -- get conversion descriptor */
        if ((cd = iconv_open(tocode, fromcode)) == (iconv_t)-1) {
            error(FATAL, BAD_OPEN);
        }

        inbytesleft = 0;            /* no bytes carried over yet */

        /* translate the characters */
        for ( ;; ) {
            /*
             * if any bytes are leftover, they will be in the
             * beginning of the buffer on the next read().
             */
            inchar = inbuf;         /* points to input buffer */
            outchar = outbuf;       /* points to output buffer */
            outbytesleft = BUFSIZ;  /* space available in output buffer */

            if ((bytesread = read(Input, inbuf + inbytesleft,
                                  (size_t)BUFSIZ - inbytesleft)) < 0) {
                perror("prog");
                return BAD;
            }
            if (!(inbytesleft += bytesread)) {
                break;              /* end of conversions */
            }

            ret_val = iconv(cd, &inchar, &inbytesleft, &outchar, &outbytesleft);
            conv_errno = errno;     /* save before write() can change it */

            if (write(1, outbuf, (size_t)BUFSIZ - outbytesleft) < 0) {
                perror("prog");
                return BAD;
            }

            /* iconv() returns the number of non-identical conversions
             * performed. If the entire string in the input buffer is
             * converted, the value pointed to by inbytesleft will be
             * zero. If the conversion stopped due to any reason, the
             * value pointed to by inbytesleft will be non-zero and
             * errno is set to indicate the condition.
             */
            if (ret_val == (size_t)-1 && conv_errno == EINVAL) {
                /* Input conversion stopped due to an incomplete
                 * character or shift sequence at the end of the
                 * input buffer.
                 */
                if (bytesread == 0) {
                    /* End of file: the sequence can never be completed,
                     * so treat it as an illegal sequence (see WARNINGS).
                     */
                    error(FATAL, BAD_CONVERSION);
                }
                /* Copy data left, to the start of buffer. The regions
                 * may overlap, hence memmove().
                 */
                memmove(inbuf, inchar, inbytesleft);
            } else if (ret_val == (size_t)-1 && conv_errno == EILSEQ) {
                /* Input conversion stopped due to an input byte
                 * that does not belong to the input codeset.
                 */
                error(FATAL, BAD_CONVERSION);
            } else if (ret_val == (size_t)-1 && conv_errno == E2BIG) {
                /* Input conversion stopped due to lack of space
                 * in the output buffer. inbytesleft has the
                 * number of bytes to be converted.
                 */
                memmove(inbuf, inchar, inbytesleft);
            }
            /* Go back and read from the input file. */
        }

        /* end conversion & get rid of the conversion table */
        if (iconv_close(cd) == BAD) {
            error(FATAL, BAD_CLOSE);
        }
        return GOOD;
    }

WARNINGS
If you use iconv_open() and compile/link your application archive on PA-RISC systems, note that iconv_open() has a dependency on libdld.sl that requires a change to the compile/link command. The required option is positionally dependent and should occur at the beginning of the compile line. For optimum compatibility in future releases, you should avoid using archive libc with other shared libraries except for libdld.sl as needed above.

There is a corner-case situation for multi-byte characters that is not correctly handled by iconv(). If the last character in the file being converted is an invalid multi-byte character, iconv() returns EINVAL instead of EILSEQ. The application can get around this by checking whether EOF is reached or whether this is the last buffer being converted. In this case, EINVAL should be treated as EILSEQ.

AUTHOR
iconv() was developed by HP.

FILES
System configuration file containing codeset names supported by the operating system.
User customizable configuration file containing additional codeset names.
Directory containing tables used for conversion.
Directory containing methods used for conversion.

SEE ALSO
genxlt(1), iconv(1), thread_safety(5).

STANDARDS CONFORMANCE
iconv(3C)