Print byte position of extended ascii character Post: 303020173

10 More Discussions You Might Find Interesting

1. Programming

Extended ascii

Hi all, I would like to change the extended ASCII codes (128-255). I tried changing LC_ALL and LANG in the current session (values from locale -a), to no avail. Thanks.

2. Shell Programming and Scripting

extended ascii problem

Hi, I would like to check whether text files contain extended ASCII characters. I really don't have any idea how to start. Your kind help would be very much appreciated. Thanks.

3. UNIX for Advanced & Expert Users

Processing extended ascii character file names in UNIX (BASH scipts)

Hi, I have an accented letter (ö) in a script for an installer. It's a file name. This is not working and I'm told to try using the octal value for the extended ASCII character. Does anyone know how to do this? If I had the word "filförval", can I just put in the value between the letters, like...

4. Shell Programming and Scripting

Print lines with specific character at nth position in a file

I need to print lines with the character S at the nth position in a file. Can someone please help me with an appropriate awk command for this?

5. Shell Programming and Scripting

Print lines with specific character at nth position in a file

I need to print lines with the character S at the nth position in a file. Can someone please help me with an appropriate awk command for this?

6. AIX

Printing extended ASCII

Hi All, I'm trying to send extended ascii characters to my HP2055 as part of PCL printer control codes. What I want to do is select a bar code font, print the bar code and reset the printer to the default font. Selecting the bar code font works good. Printing the bar code goes almost ok too. ...

7. Shell Programming and Scripting

Print the next ASCII character

Hi, in my file I have to print the next ASCII character for every character in a few fields. In the file below, I have to do this for the 2nd, 3rd and 5th fields.

Input File
========
1|abc|def|5|ghi
2|jkl|mno|6|pqr

Expected Output file
=======
1|bcd|efg|5|hij
2|klm|nop|6|qrs

8. Shell Programming and Scripting

Search and Replace Extended Ascii Characters

We are getting extended Ascii characters in the input file and my requirement is to search and replace them with a space. I am using the following command LANG=C sed -e 's// /g' It is doing a good job, but in some cases it is replacing the extended characters with two spaces. So my input...

9. Programming

How to read extended ASCII characters from stdin?

Hi, I want to read extended ASCII characters from keyboard using c language on unix/linux. How to read extended characters from keyboard or by copy-paste in terminal irrespective of locale set in the system. I want to read the input characters from keyboard, store it in an array or some local...

10. Shell Programming and Scripting

Removal Extended ASCII using awk

Hi All, I am trying to remove selected extended ASCII characters (passed as an argument) using awk on an ad hoc basis. Can you please let me know how to do it? I have to implement this using awk only. Thanks & Regards

LEARN ABOUT OSF1

iconv_intro

iconv_intro(5)							File Formats Manual						    iconv_intro(5)

NAME

       iconv_intro, iconv - Introduction to codeset conversion

DESCRIPTION

       Conversion  of  character  encoding from one coded character set (codeset) to another is an operation that often has to be performed by the
       operating system and some applications. For example, the man command supports codeset conversion to allow one set of reference  page  files
       to meet the needs of locales that support the same language and territory but different codesets (see man(1)).

       The following commands and library interfaces give users and application developers direct access to codeset conversion operations:

              The iconv command converts characters in a data file from one codeset to another (see iconv(1)).

              The iconv(), iconv_open(), and iconv_close() functions convert a string of characters from one codeset to another (see
              iconv(3), iconv_open(3), and iconv_close(3)). The iconv command uses these interfaces to convert characters.
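The following is a minimal sketch of the same kind of operation, one string converted from one codeset to another. It does not call iconv: it uses Python's built-in codecs as a stand-in, and the function name convert and the sample data are assumptions made for illustration.

```python
def convert(data: bytes, from_codeset: str, to_codeset: str) -> bytes:
    """Convert encoded bytes from one codeset to another.

    Stand-in for the iconv_open()/iconv()/iconv_close() sequence: the
    input is decoded to Unicode text and re-encoded in the target codeset.
    """
    text = data.decode(from_codeset)   # raises on bytes invalid in the source
    return text.encode(to_codeset)     # raises on characters missing in the target


latin1 = b"na\xefve"                   # "naive" with i-diaeresis, in ISO 8859-1
utf8 = convert(latin1, "iso-8859-1", "utf-8")
print(utf8)                            # b'na\xc3\xafve'
```

The single byte 0xEF in ISO 8859-1 becomes the two-byte sequence 0xC3 0xAF in UTF-8, which is why byte positions in converted data generally differ from those in the source.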

       There are two types of codeset converters: algorithmic and table. Algorithmic converters, which reside in the /usr/lib/nls/loc/iconv
       directory, are shared libraries with a predefined entry point for invocation by functions in the libiconv.so library. Algorithmic
       converters are needed for the conversion of multibyte codesets, in part because table converters cannot handle the required number of
       character values and also because some of these codesets require complex handling (see NOTES). Algorithmic converters are supplied as
       part of the operating system product; the internal interfaces that they require are not published for external use.

       Table converters, which reside in the /usr/lib/nls/loc/iconvTable directory, can be created by using the genxlt	command  (see  genxlt(1)).
       These converters can support single-byte codesets and up to 256 encoded character values.
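The 256-value limit follows from the table approach itself: each possible source byte indexes one entry that holds one target byte. The sketch below illustrates that idea with Python's bytes.translate(); it is not the genxlt table format, and build_table, the substitute byte, and the choice of ISO 8859-1 and CP437 are assumptions made for illustration.

```python
def build_table(from_codeset: str, to_codeset: str, substitute: bytes = b"?") -> bytes:
    """Build a 256-entry table mapping each source byte to one target byte."""
    table = bytearray(256)
    for value in range(256):
        try:
            converted = bytes([value]).decode(from_codeset).encode(to_codeset)
        except (UnicodeDecodeError, UnicodeEncodeError):
            converted = substitute      # no counterpart in the target codeset
        # A table entry holds exactly one byte, so anything longer is replaced.
        table[value] = converted[0] if len(converted) == 1 else substitute[0]
    return bytes(table)


table = build_table("iso-8859-1", "cp437")
converted = b"caf\xe9".translate(table)   # e-acute is 0xE9 in ISO 8859-1
print(converted)                          # b'caf\x82' (e-acute is 0x82 in CP437)
```

Because one entry can hold only one byte, a table of this kind cannot express a multibyte codeset, which is the limitation the text attributes to table converters.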

       Names of codeset converters are in the following form:

       from-codeset_to-codeset

       For example, the following converter converts values from Super DEC Kanji to Japanese Extended UNIX Code:

       sdeckanji_eucJP
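As a small illustration of the naming form, the sketch below splits a converter name into its two codeset names. The helper parse_converter_name is hypothetical, and it assumes that the from-codeset part contains no underscore, which the text above does not guarantee.

```python
from typing import Tuple


def parse_converter_name(name: str) -> Tuple[str, str]:
    """Split 'from-codeset_to-codeset' at the first underscore (an assumption)."""
    from_codeset, separator, to_codeset = name.partition("_")
    if not separator or not from_codeset or not to_codeset:
        raise ValueError(f"not in from-codeset_to-codeset form: {name!r}")
    return from_codeset, to_codeset


print(parse_converter_name("sdeckanji_eucJP"))   # ('sdeckanji', 'eucJP')
```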

       The codeset converters produce an invalid character error in response to characters that cannot be converted from the source codeset to the
       destination codeset. This error is always produced for character codes that are invalid in  the	source	codeset.  However,  if	the  error
       results	from  characters  that	are valid in the source codeset but have no counterparts in the destination codeset, you can eliminate the
       error by defining the ICONV_DEFSTR environment variable to specify a substitute output string. See the ENVIRONMENT  VARIABLES  section  for
       more information about using the ICONV_DEFSTR variable.
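The two error cases can be sketched as follows, again using Python's codecs rather than iconv. The function convert_with_substitute and its substitute parameter are hypothetical; they only mimic the idea of a substitute output string and do not read ICONV_DEFSTR.

```python
def convert_with_substitute(data: bytes, from_codeset: str, to_codeset: str,
                            substitute: str = "") -> bytes:
    """Convert data, optionally replacing characters the target cannot hold."""
    # Bytes that are invalid in the source codeset always raise here;
    # no substitute string can suppress this error.
    text = data.decode(from_codeset)
    if not substitute:
        return text.encode(to_codeset)   # unconvertible characters raise
    out = bytearray()
    for ch in text:                      # per-character encoding suits stateless codesets only
        try:
            out += ch.encode(to_codeset)
        except UnicodeEncodeError:
            out += substitute.encode(to_codeset)
    return bytes(out)


euro = "\u20ac5".encode("utf-8")         # euro sign followed by "5"
print(convert_with_substitute(euro, "utf-8", "iso-8859-1", "?"))   # b'?5'
```

The euro sign is valid UTF-8 but has no counterpart in ISO 8859-1, so it is replaced. A byte such as 0xFF, which is never valid in UTF-8, still raises an error even when a substitute is given.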

       It is possible to convert data directly between two codesets or by way of an intermediate codeset, such as UCS-2, UCS-4, or UTF-8. For
       conversion of Chinese characters, be aware that the results of converting a Traditional Chinese codeset directly to a Simplified Chinese
       codeset may not be the same as the results of converting Traditional Chinese first to UCS-2, UCS-4, or UTF-8 and then to Simplified
       Chinese.

ENVIRONMENT VARIABLES

       Some  codeset  converters  require more complex algorithms than can be provided through tables. The following environment variables provide
       control over conversion behavior for different kinds of codeset converters:

       ICONV_ACTION
              Controls the behavior for the many-to-one value conversions for conversion of Traditional Chinese (except for Traditional
              Chinese encoded in Telecode) to Simplified Chinese. The valid settings for this environment variable are as follows:

              batch  Specifies that the preferred mapping value (the first one in the one-to-many mapping list) is always taken. The
                     batch setting is the ICONV_ACTION default.

              conv_all
                     Specifies that all the possible values are printed to the standard output, enclosed by braces ({ }), so that the
                     user can later manually edit the converted file and select the one to use.

              conv_all_nosym
                     Specifies that all the possible values are printed to the standard output except for punctuation symbols, for which
                     only the preferred mapping value is printed. As is true for conv_all, the conv_all_nosym setting prints value
                     choices enclosed by braces so that the converted file can later be edited.

       ICONV_BYTEORDER
              Sets byte ordering for UCS-2 or UCS-4 converters only. Valid values are little-endian (the default) or big-endian. Setting
              this environment variable may be necessary when producing UCS-2 or UCS-4 output that will be processed by codeset
              converters on platforms other than Tru64 UNIX.

       ICONV_DEFSTR
              Defines the default string to be substituted in output for valid input characters that cannot be converted from the source
              codeset to the destination codeset. The variable value can be an arbitrary string or a code number. If the value is a code
              number (for example, 10, 07, 0x10, or, for Unicode converters, U+1234), the corresponding character in the output codeset
              (to-codeset) is printed.

              For a given type of codeset conversion, a matching ICONV_DEFSTR_from-codeset_to-codeset variable has precedence over the
              ICONV_DEFSTR variable without the from-codeset_to-codeset suffix. When defining the variable with the suffix, replace
              from-codeset_to-codeset with the name of the codeset converter to which the variable applies. The ICONV_DEFSTR variable
              (defined without the suffix) is used by a converter when no ICONV_DEFSTR_from-codeset_to-codeset variable has been defined
              specifically for the type of conversion being done.

	      If  these  variables  are not defined or are set to the null string, the characters that cannot be converted are skipped and have no
	      representation in converted output.
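The precedence rule can be sketched as a lookup, shown here in Python for illustration only. The function resolve_defstr is hypothetical, and treating a suffixed variable that is set to the null string as "not defined" (so that the lookup falls through to ICONV_DEFSTR) is an assumption; the text above does not spell out that case.

```python
import os
from typing import Mapping, Optional


def resolve_defstr(converter: str,
                   environ: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the substitute string for a converter, or None to skip characters."""
    env = os.environ if environ is None else environ
    # The converter-specific variable is consulted before the generic one.
    for name in (f"ICONV_DEFSTR_{converter}", "ICONV_DEFSTR"):
        value = env.get(name)
        if value:                        # unset or null string: keep looking
            return value
    return None                          # unconvertible characters are skipped


settings = {"ICONV_DEFSTR": "?", "ICONV_DEFSTR_sdeckanji_eucJP": "#"}
print(resolve_defstr("sdeckanji_eucJP", settings))   # '#'
print(resolve_defstr("eucJP_sdeckanji", settings))   # '?'
```

Passing a plain dictionary instead of the real environment keeps the sketch easy to try without exporting variables.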

              The following converter-specific restrictions apply to ICONV_DEFSTR* variables:

              ICONV_DEFSTR* environment variables do not work for converters that convert between Japanese codesets or between Korean
              codesets.

              For converters that handle UCS-2, UCS-4 or UTF-8 format, the only valid variable value is a code number (such as U+1234 or
              0x10) or a string whose value is a single ASCII character (such as ?). For these converters, any string value other than a
              single ASCII character is ignored and any characters that cannot be converted have no representation in output.

              For converters that handle output in UCS-2, UCS-4 or UTF-8 format, characters that cannot be converted and for which no
              valid ICONV_DEFSTR* value has been defined produce an error condition that aborts the conversion process.

       ICONV_NOBOM
              Disables generation of the byte-order mark at the beginning of UCS-2 or UCS-4 output. A valid setting is any value other
              than a null string. By default, or if this variable is set to a null string, the byte-order mark is generated at the
              beginning of UCS-2 or UCS-4 output.

              Codeset converters that process UCS-2 or UCS-4 data on platforms other than Tru64 UNIX usually require the byte-order mark.
              Therefore, the current default behavior of Tru64 UNIX codeset converters produces output that is more likely to be supported
              as input to codeset converters on other platforms. Use the ICONV_NOBOM variable only if you need backward compatibility
              with output produced by codeset converters that were included in versions of Tru64 UNIX prior to Tru64 UNIX Version 4.0D.

       ICONV_PHRCONV
              Activates phrase conversion for converters that convert from a Traditional Chinese codeset (except for Traditional Chinese
              encoded in Telecode) to a Simplified Chinese codeset or the reverse. When phrase conversion is activated, a whole phrase in
              Traditional Chinese is converted to a different phrase in Simplified Chinese or the reverse.

	      If ICONV_PHRCONV is set to mark, the converted phrases are bracketed by [ and ] to highlight the conversion result for visual
	      checking.

	      The  phrase  conversion  databases  in the /usr/share/phrdb directory are normal text files with the same file names as those of the
	      algorithmic converters in /usr/lib/nls/loc/iconv/*.  These phrase conversion databases contain entries for phrase conversion pairs.
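The effect of byte ordering and of the byte-order mark on UCS-2 output, as described in this section, can be seen in a short sketch. It uses Python's UTF-16 codecs, which produce the same bytes as UCS-2 for characters in the Basic Multilingual Plane; the function encode_ucs2 and its parameters are hypothetical and do not read any ICONV_* variable.

```python
import codecs

# Codec and byte-order mark for each byte ordering.
_ORDERS = {
    "little-endian": ("utf-16-le", codecs.BOM_UTF16_LE),   # mark is FF FE
    "big-endian": ("utf-16-be", codecs.BOM_UTF16_BE),      # mark is FE FF
}


def encode_ucs2(text: str, byteorder: str = "little-endian", bom: bool = True) -> bytes:
    """Encode text as UCS-2 in the given byte order, with or without a mark."""
    if any(ord(ch) > 0xFFFF for ch in text):
        raise ValueError("UCS-2 cannot represent characters above U+FFFF")
    codec, mark = _ORDERS[byteorder]
    body = text.encode(codec)
    return mark + body if bom else body


print(encode_ucs2("A"))                        # b'\xff\xfeA\x00'
print(encode_ucs2("A", "big-endian"))          # b'\xfe\xff\x00A'
print(encode_ucs2("A", bom=False))             # b'A\x00'
```

Without the mark, a reader of the last output has no way to tell from the data alone whether it is little-endian "A" or a big-endian character U+4100, which is why converters on other platforms usually expect the mark.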

FILES

       /usr/lib/nls/loc/iconv/*
              Algorithmic converters

       /usr/lib/nls/loc/iconvTable/*
              Table converters

       /usr/share/phrdb/*
              Phrase conversion databases

SEE ALSO

       Commands: genxlt(1), iconv(1), phrase(1)

       Functions: iconv(3), iconv_close(3), iconv_open(3)

       Others: i18n_intro(5), l10n_intro(5)

																    iconv_intro(5)

