grep and UNICODE (utf-16) file Post: 302101553

LEARN ABOUT PLAN9

ascii

ASCII(1)						      General Commands Manual							  ASCII(1)

NAME

       ascii, unicode - interpret ASCII, Unicode characters

SYNOPSIS

       ascii [ -8 ] [ -oxdbn ] [ -nct ] [ text ]

       unicode [ -nt ] hexmin-hexmax

       unicode [ -t ] hex [ ...  ]

       unicode [ -n ] characters

       look hex /lib/unicode

DESCRIPTION

       Ascii prints the ASCII values corresponding to characters and vice versa; under the -8 option, the ISO Latin-1 extensions (codes 0200-0377)
       are included.  The values are interpreted in a settable numeric base; -o specifies octal, -d decimal, -x hexadecimal (the default), and -bn
       base n.

       With no arguments, ascii prints a table of the character set in the specified base.  Characters of text are converted to their ASCII
       values, one per line.  If, however, the first text argument is a valid number in the specified base, conversion goes the opposite way.
       Control characters are printed as two- or three-character mnemonics.  Other options are:

       -n     Force numeric output.

       -c     Force character output.

       -t     Convert from numbers to running text; do not interpret control characters or insert newlines.
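
       The two conversion directions described above can be sketched in Python.  This is only an illustration, not the code in
       /sys/src/cmd/ascii.c: the helper names to_base, chars_to_values and values_to_text are hypothetical, and the exact column layout and
       control-character mnemonics of the real command are not reproduced.

```python
def to_base(value: int, base: int) -> str:
    """Render a non-negative integer in the given base (2 to 36)."""
    digits = "0123456789abcdefghijklmnopqrstuvwxyz"
    if not 2 <= base <= 36:
        raise ValueError("base must be between 2 and 36")
    if value == 0:
        return "0"
    out = []
    while value > 0:
        value, rem = divmod(value, base)
        out.append(digits[rem])
    return "".join(reversed(out))


def chars_to_values(text: str, base: int = 16) -> list[str]:
    """One numeric string per character (hexadecimal by default, as in ascii)."""
    return [to_base(ord(ch), base) for ch in text]


def values_to_text(numbers: list[str], base: int = 16) -> str:
    """Numbers back to running text, in the spirit of the -t option."""
    return "".join(chr(int(n, base)) for n in numbers)


print(chars_to_values("Az"))          # ['41', '7a']
print(chars_to_values("Az", 8))       # ['101', '172']
print(values_to_text(["48", "69"]))   # Hi
```

       The base is a parameter here for the same reason ascii has -o, -d, -x and -bn: the mapping between character and value is fixed,
       and only the way the number is written changes.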

       Unicode is similar; it converts between UTF and character values from the Unicode Standard (see utf(6)).  If given a range of hexadecimal
       numbers, unicode prints a table of the specified Unicode characters -- their values and UTF representations.  Otherwise it translates from
       UTF to numeric value or vice versa, depending on the appearance of the supplied text; the -n option forces numeric output to avoid
       ambiguity with numeric characters.  If converting to UTF, the characters are printed one per line unless the -t flag is set, in which case
       the output is a single string containing only the specified characters.  Unlike ascii, unicode treats no characters specially.
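
       As a rough illustration of these conversions, the Python sketch below maps between hexadecimal values and characters and builds a
       small range table.  The function names are hypothetical, the four-digit zero-padded hex format is an assumption, and UTF-8 is used
       as the UTF encoding; none of this is taken from /sys/src/cmd/unicode.c.

```python
def char_to_hex(ch: str) -> str:
    """Hex value of a single character (zero-padded to four digits by assumption)."""
    return f"{ord(ch):04x}"


def hex_to_char(hexvalue: str) -> str:
    """Character for a hexadecimal value."""
    return chr(int(hexvalue, 16))


def utf8_table(hexmin: str, hexmax: str) -> list[tuple[str, str, str]]:
    """Rows of (hex value, character, UTF-8 bytes as hex) for an inclusive range.

    Surrogate values (d800-dfff) cannot be encoded as UTF-8 and would raise
    UnicodeEncodeError here; the sketch does not handle them.
    """
    rows = []
    for cp in range(int(hexmin, 16), int(hexmax, 16) + 1):
        ch = chr(cp)
        rows.append((f"{cp:04x}", ch, ch.encode("utf-8").hex()))
    return rows


print(char_to_hex("p"))                 # 0070
print(hex_to_char("2200"))              # ∀
for row in utf8_table("2200", "2203"):
    print(*row)                         # first row: 2200 ∀ e28880
```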

       The output of ascii and unicode may be unhelpful if the characters printed are not available in the current font.

       The file /lib/unicode contains a table of characters and descriptions, sorted in hexadecimal order, suitable for look(1) on the lower case
       hex values of characters.
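
       The idea of a prefix search over a table sorted by hex value can be sketched as follows.  This is a stand-in, not look(1) itself:
       the table is built from Python's unicodedata module rather than read from /lib/unicode, and the line format (four lower case hex
       digits, a tab, a lower case description) is an assumption about what such a table might look like.

```python
import bisect
import unicodedata


def build_table(lo: int, hi: int) -> list[str]:
    """Build 'hex<TAB>description' lines for the named characters in [lo, hi].

    Fixed-width lower case hex keeps the lines in sorted order automatically.
    """
    rows = []
    for cp in range(lo, hi + 1):
        name = unicodedata.name(chr(cp), "")
        if name:
            rows.append(f"{cp:04x}\t{name.lower()}")
    return rows


def look(prefix: str, table: list[str]) -> list[str]:
    """Return every line that starts with prefix, using binary search to find the first."""
    start = bisect.bisect_left(table, prefix)
    matches = []
    for line in table[start:]:
        if not line.startswith(prefix):
            break
        matches.append(line)
    return matches


greek = build_table(0x0370, 0x03FF)
for line in look("039", greek):
    print(line)
```

       Because the table is sorted, the search only needs to locate the first matching line and then read forward until the prefix stops
       matching, which is why the hex values must be written in a consistent case.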

EXAMPLES

       ascii -d
	      Print the ASCII table base 10.

       unicode p
	      Print the hex value of `p'.

       unicode 2200-22f1
	      Print a table of miscellaneous mathematical symbols.

       look 039 /lib/unicode
	      See the start of the Greek alphabet's encoding in the Unicode Standard.

FILES

       /lib/unicode
	      table of characters and descriptions.

SOURCE

       /sys/src/cmd/ascii.c
       /sys/src/cmd/unicode.c

SEE ALSO

       look(1), tcs(1), utf(6), font(6)

																	  ASCII(1)