Convert UTF-8 encoded hex value to a character
Post 302252019 by fpmurphy on Tuesday 28th of October 2008 03:41:01 PM
If your locale is set up correctly, any number of utilities can display this character. For example, if your shell is ksh93 version s or later, printf "\xC5B4" outputs the expected character.
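
The \x escape is not part of POSIX printf, so the command above depends on the shell. As a minimal sketch for shells without it, the following converts each hex pair to an octal escape, which POSIX printf does accept in the format string. The function name hex_to_bytes is hypothetical; the example bytes C5 B4 are the UTF-8 encoding of U+0174 (Ŵ), and the terminal needs a UTF-8 locale to render them.

```shell
#!/bin/sh
# hex_to_bytes is a hypothetical helper, not part of any standard tool.
# It writes the bytes named by a string of hex digit pairs to stdout.
hex_to_bytes() {
    hex=$1
    # Reject empty input, non-hex characters, and odd-length strings.
    case $hex in
        ''|*[!0-9A-Fa-f]*) return 1 ;;
    esac
    [ $(( ${#hex} % 2 )) -eq 0 ] || return 1

    while [ -n "$hex" ]; do
        rest=${hex#??}          # everything after the first two digits
        pair=${hex%"$rest"}     # the first two digits
        hex=$rest
        # POSIX printf accepts \ddd octal escapes in the format string,
        # so convert the hex pair to three octal digits first.
        printf "\\$(printf '%03o' "0x$pair")"
    done
}

hex_to_bytes C5B4    # writes bytes 0xC5 0xB4
echo
```

Piping the result through od -An -tx1 is a quick way to confirm which bytes were written, independent of how the terminal renders them.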
 

printf(1)							   User Commands							 printf(1)

NAME
     printf - write formatted output

SYNOPSIS
   /usr/bin/printf
     printf format [argument]...

   ksh93
     printf format [string...]

DESCRIPTION
   /usr/bin/printf
     The printf command writes formatted operands to the standard output.
     The argument operands are formatted under control of the format operand.

   ksh93
     printf writes each string operand to standard output using format to
     control the output format.

OPERANDS
   /usr/bin/printf
     The following operands are supported by /usr/bin/printf:

     format
         A string describing the format to use to write the remaining
         operands. The format operand is used as the format string described
         on the formats(5) manual page, with the following exceptions:

         o  A SPACE character in the format string, in any context other
            than a flag of a conversion specification, is treated as an
            ordinary character that is copied to the output.

         o  A Δ character in the format string is treated as a Δ character,
            not as a SPACE character.

         o  In addition to the escape sequences described on the formats(5)
            manual page (\\, \a, \b, \f, \n, \r, \t, \v), \ddd, where ddd is
            a one-, two- or three-digit octal number, is written as a byte
            with the numeric value specified by the octal number.

         o  The program does not precede or follow output from the d or u
            conversion specifications with blank characters not specified by
            the format operand.

         o  The program does not precede output from the o conversion
            specification with zeros not specified by the format operand.

         o  An additional conversion character, b, is supported as follows.
            The argument is taken to be a string that can contain
            backslash-escape sequences. The following backslash-escape
            sequences are supported:

            o  the escape sequences listed on the formats(5) manual page
               (\\, \a, \b, \f, \n, \r, \t, \v), which are converted to the
               characters they represent

            o  \0ddd, where ddd is a zero-, one-, two- or three-digit octal
               number that is converted to a byte with the numeric value
               specified by the octal number

            o  \c, which is not written and causes printf to ignore any
               remaining characters in the string operand containing it,
               any remaining string operands and any additional characters
               in the format operand.

         The interpretation of a backslash followed by any other sequence of
         characters is unspecified.

         Bytes from the converted string are written until the end of the
         string or the number of bytes indicated by the precision
         specification is reached. If the precision is omitted, it is taken
         to be infinite, so all bytes up to the end of the converted string
         are written. For each specification that consumes an argument, the
         next argument operand is evaluated and converted to the appropriate
         type for the conversion as specified below. The format operand is
         reused as often as necessary to satisfy the argument operands. Any
         extra c or s conversion specifications are evaluated as if a null
         string argument were supplied; other extra conversion
         specifications are evaluated as if a zero argument were supplied.
         If the format operand contains no conversion specifications and
         argument operands are present, the results are unspecified. If a
         character sequence in the format operand begins with a % character,
         but does not form a valid conversion specification, the behavior is
         unspecified.

     argument
         The strings to be written to standard output, under the control of
         format. The argument operands are treated as strings if the
         corresponding conversion character is b, c or s. Otherwise, it is
         evaluated as a C constant, as described by the ISO C standard, with
         the following extensions:

         o  A leading plus or minus sign is allowed.

         o  If the leading character is a single- or double-quote, the value
            is the numeric value in the underlying codeset of the character
            following the single- or double-quote.

         If an argument operand cannot be completely converted into an
         internal value appropriate to the corresponding conversion
         specification, a diagnostic message is written to standard error
         and the utility does not exit with a zero exit status, but
         continues processing any remaining operands and writes the value
         accumulated at the time the error was detected to standard output.

   ksh93
     The format operands support the full range of ANSI C formatting
     specifiers as well as additional specifiers. The following additional
     specifiers are supported.

     %b  Processes each character in the string operand specially, as
         follows:

         \a    Alert character.
         \b    Backspace character.
         \c    Terminate output without appending NEWLINE. The remaining
               string operands are ignored.
         \E    Escape character (ASCII octal 033).
         \f    FORM FEED character.
         \n    NEWLINE character.
         \t    TAB character.
         \v    Vertical tab character.
         \\    Backslash character.
         \0x   The 8-bit character whose ASCII code is the 1-, 2-, or
               3-digit octal number x.

     %B  Treat the argument as a variable name and output the value without
         converting it to a string. This is most useful for variables of
         type -b.

     %H  Output string with characters <, &, >, ", and non-printable
         characters, properly escaped for use in HTML and XML documents.

     %P  Treat string as an extended regular expression and convert it to a
         shell pattern.

     %q  Output string quoted in a manner that it can be read in by the
         shell to get back the same string. However, empty strings resulting
         from missing string operands are not quoted.

     %R  Treat string as a shell pattern expression and convert it to an
         extended regular expression.

     %T  Treat string as a date/time string and format it. The T can be
         preceded by (dformat), where dformat is a date format as defined by
         the date(1) command.

     %Z  Output a byte whose value is 0.

     When performing conversions of string to satisfy a numeric format
     specifier, if the first character of string is " or ', the value is the
     numeric value in the underlying code set of the character following the
     " or '. Otherwise, string is treated like a shell arithmetic expression
     and evaluated.

     If a string operand cannot be completely converted into a value
     appropriate for that format specifier, an error occurs, but remaining
     string operands continue to be processed.

     In addition to the format specifier extensions, the following
     extensions of ANSI-C are permitted in format specifiers:

     o  The escape sequences \E and \e expand to the escape character, which
        is octal 033 in ASCII.

     o  The escape sequence \cx expands to CTRL-x.

     o  The escape sequence \C[.name.] expands to the collating element
        name.

     o  The escape sequence \x{hex} expands to the character corresponding
        to the hexadecimal value hex.

     o  The format modifier flag = can be used to center a field to a
        specified width. When the output is a terminal, the character width
        is used rather than the number of bytes.

     o  Each of the integral format specifiers can have a third modifier
        after width and precision that specifies the base of the conversion
        from 2 to 64. In this case, the # modifier causes base# to be
        prepended to the value.

     o  The # modifier can be used with the d specifier when no base is
        specified to cause the output to be written in units of 1000 with a
        suffix of one of k M G T P E.

     o  The # modifier can be used with the i specifier to cause the output
        to be written in units of 1024 with a suffix of one of Ki Mi Gi Ti
        Pi Ei.

     If there are more string operands than format specifiers, the format
     string is reprocessed from the beginning. If there are fewer string
     operands than format specifiers, then string specifiers are treated as
     if empty strings were supplied, numeric conversions are treated as if 0
     was supplied, and time conversions are treated as if now was supplied.

     printf is equivalent to print -f, which allows additional options to be
     specified.

USAGE
   /usr/bin/printf
     The printf utility, like the printf(3C) function on which it is based,
     makes no special provision for dealing with multi-byte characters when
     using the %c conversion specification or when a precision is specified
     in a %b or %s conversion specification. Applications should be
     extremely cautious using either of these features when there are
     multi-byte characters in the character set.

     Field widths and precisions cannot be specified as *.

     For compatibility with previous versions of SunOS 5.x, the $ format
     specifier is supported for formats containing only %s specifiers.

     The %b conversion specification is not part of the ISO C standard; it
     has been added here as a portable way to process backslash escapes
     expanded in string operands as provided by the echo utility. See also
     the USAGE section of the echo(1) manual page for ways to use printf as
     a replacement for all of the traditional versions of the echo utility.

     If an argument cannot be parsed correctly for the corresponding
     conversion specification, the printf utility reports an error. Thus,
     overflow and extraneous characters at the end of an argument being used
     for a numeric conversion are to be reported as errors.

     It is not considered an error if an argument operand is not completely
     used for a c or s conversion or if a string operand's first or second
     character is used to get the numeric value of a character.

EXAMPLES
   /usr/bin/printf
     Example 1 Printing a Series of Prompts

     The following example alerts the user, then prints and reads a series
     of prompts:

       example% printf "\aPlease fill in the following: \nName: "
       read name
       printf "Phone number: "
       read phone

     Example 2 Printing a Table of Calculations

     The following example prints a table of calculations. It reads out a
     list of right and wrong answers from a file, calculates the percentage
     correctly, and prints them out. The numbers are right-justified and
     separated by a single tab character. The percentage is written to one
     decimal place of accuracy:

       example% while read right wrong ; do
          percent=$(echo "scale=1;($right*100)/($right+$wrong)" | bc)
          printf "%2d right\t%2d wrong\t(%s%%)\n" \
               $right $wrong $percent
       done < database_file

     Example 3 Printing number strings

     The command:

       example% printf "%5d%4d\n" 1 21 321 4321 54321

     produces:

           1  21
         3214321
       54321   0

     The format operand is used three times to print all of the given
     strings, and a 0 was supplied by printf to satisfy the last %4d
     conversion specification.

     Example 4 Tabulating Conversion Errors

     The following example tabulates conversion errors.

     The printf utility tells the user when conversion errors are detected
     while producing numeric output. These results would be expected on an
     implementation with 32-bit twos-complement integers when %d is
     specified as the format operand:

     +-----------------------------------------------------------------------+
     |Arguments     Standard Output   Diagnostic Output                      |
     |5a            5                 printf: 5a not completely converted    |
     |9999999999    2147483647        printf: 9999999999: Results too large  |
     |-9999999999   -2147483648       printf: -9999999999: Results too large |
     |ABC           0                 printf: ABC expected numeric value     |
     +-----------------------------------------------------------------------+

     The value shown on standard output is what would be expected as the
     return value from the function strtol(3C). A similar correspondence
     exists between %u and strtoul(3C), and %e, %f and %g and strtod(3C).

     Example 5 Printing Output for a Specific Locale

     The following example prints output for a specific locale. In a locale
     using the ISO/IEC 646:1991 standard as the underlying codeset, the
     command:

       example% printf "%d\n" 3 +3 -3 \'3 \"+3 "'-3"

     produces:

     +-----------------------------------------------------------+
     |3     Numeric value of constant 3                          |
     |3     Numeric value of constant 3                          |
     |-3    Numeric value of constant -3                         |
     |51    Numeric value of the character `3' in the ISO/IEC    |
     |      646:1991 standard codeset                            |
     |43    Numeric value of the character `+' in the ISO/IEC    |
     |      646:1991 standard codeset                            |
     |45    Numeric value of the character `-' in the ISO/IEC    |
     |      646:1991 standard codeset                            |
     +-----------------------------------------------------------+

     In a locale with multi-byte characters, the value of a character is
     intended to be the value of the equivalent of the wchar_t
     representation of the character.

     If an argument operand cannot be completely converted into an internal
     value appropriate to the corresponding conversion specification, a
     diagnostic message is written to standard error and the utility does
     not exit with a zero exit status, but continues processing any
     remaining operands and writes the value accumulated at the time the
     error was detected to standard output.

ENVIRONMENT VARIABLES
     See environ(5) for descriptions of the following environment variables
     that affect the execution of printf: LANG, LC_ALL, LC_CTYPE,
     LC_MESSAGES, LC_NUMERIC, and NLSPATH.

EXIT STATUS
     The following exit values are returned:

     0     Successful completion.

     >0    An error occurred.

ATTRIBUTES
     See attributes(5) for descriptions of the following attributes:

   /usr/bin/printf
     +-----------------------------+-----------------------------+
     |      ATTRIBUTE TYPE         |      ATTRIBUTE VALUE        |
     +-----------------------------+-----------------------------+
     |Availability                 |SUNWloc                      |
     +-----------------------------+-----------------------------+
     |CSI                          |Enabled                      |
     +-----------------------------+-----------------------------+
     |Interface Stability          |Committed                    |
     +-----------------------------+-----------------------------+
     |Standard                     |See standards(5).            |
     +-----------------------------+-----------------------------+

   ksh93
     +-----------------------------+-----------------------------+
     |      ATTRIBUTE TYPE         |      ATTRIBUTE VALUE        |
     +-----------------------------+-----------------------------+
     |Availability                 |SUNWcsu                      |
     +-----------------------------+-----------------------------+
     |Interface Stability          |Uncommitted                  |
     +-----------------------------+-----------------------------+

SEE ALSO
     awk(1), bc(1), date(1), echo(1), ksh93(1), printf(3C), strtod(3C),
     strtol(3C), strtoul(3C), attributes(5), environ(5), formats(5),
     standards(5)

SunOS 5.11                        1 Nov 2007                        printf(1)
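
Two behaviours described in the page above, the leading-quote rule for numeric arguments and the reuse of the format operand, can be tried with a short sketch. The helper name char_code is hypothetical, and the values in the comments assume an ASCII-compatible codeset.

```shell
#!/bin/sh
# char_code is a hypothetical helper: it prints the numeric value of the
# single character passed as $1, using the leading-quote rule for
# numeric conversions.
char_code() {
    printf '%d\n' "'$1"
}

char_code A    # 65 in an ASCII-compatible codeset
char_code 3    # 51, matching Example 5

# The format is reused until every argument is consumed; the missing
# final numeric argument is treated as 0 (compare Example 3).
printf '%5d%4d\n' 1 21 321 4321 54321
```

In a multi-byte locale the same call is expected to report the wide-character value rather than a single byte, so results for non-ASCII characters depend on the locale settings.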