Don't want to change the codepage of a Unicode file

unicode(n)						       Unicode normalization							unicode(n)

__________________________________________________________________________________________________________________________________________________

NAME

       unicode - Implementation of Unicode normalization

SYNOPSIS

       package require Tcl 8.3

       package require unicode 1.0

       ::unicode::fromstring string

       ::unicode::tostring uclist

       ::unicode::normalize form uclist

       ::unicode::normalizeS form string

_________________________________________________________________

DESCRIPTION

       This is an implementation in Tcl of the Unicode normalization forms.

COMMANDS

       ::unicode::fromstring string
	      Converts string to a list of integer Unicode code points, which the unicode package uses as its internal string representation.

       ::unicode::tostring uclist
	      Converts the list of integers uclist back to a Tcl string.

       ::unicode::normalize form uclist
	      Normalizes the list of Unicode code points uclist according to form and returns the normalized list. The form argument takes one of
	      the following values: D (canonical decomposition), C (canonical decomposition followed by canonical composition), KD (compatibility
	      decomposition), or KC (compatibility decomposition followed by canonical composition).

       ::unicode::normalizeS form string
	      A shortcut for ::unicode::tostring [::unicode::normalize $form [::unicode::fromstring $string]]. Normalizes the Tcl string string
	      and returns the normalized string.
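
       The four commands above can be mirrored outside Tcl for comparison. The following is a minimal Python sketch, assuming only the
       standard unicodedata module; it is not the Tcl package. The function names fromstring, tostring, normalize and normalizeS are
       illustrative stand-ins chosen to match the commands, and the FORMS table mapping D, C, KD and KC to Python's NFD, NFC, NFKD and
       NFKC is likewise an assumption of this sketch.

```python
import unicodedata

# Assumed mapping from the form names used on this page to Python's names.
FORMS = {"D": "NFD", "C": "NFC", "KD": "NFKD", "KC": "NFKC"}


def fromstring(s):
    """Return the integer code points of s (stand-in for ::unicode::fromstring)."""
    return [ord(ch) for ch in s]


def tostring(uclist):
    """Build a string from a list of code points (stand-in for ::unicode::tostring)."""
    return "".join(chr(cp) for cp in uclist)


def normalize(form, uclist):
    """Normalize a code point list and return a code point list."""
    return fromstring(unicodedata.normalize(FORMS[form], tostring(uclist)))


def normalizeS(form, s):
    """String-in, string-out shortcut built from the three functions above."""
    return tostring(normalize(form, fromstring(s)))


if __name__ == "__main__":
    # U+FB01 (fi ligature) has only a compatibility decomposition;
    # U+00E9 (e with acute) has a canonical one.
    sample = fromstring("\ufb01\u00e9")
    for form in ("D", "C", "KD", "KC"):
        print(form, normalize(form, sample))
```

       The sample input shows how the forms differ: D and C leave the ligature alone because it has no canonical decomposition, while
       KD and KC split it into the two letters f and i.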

EXAMPLES

       % ::unicode::fromstring "\u0410\u0411\u0412\u0413"
       1040 1041 1042 1043
       % ::unicode::tostring {49 50 51 52 53}
       12345
       %

       % ::unicode::normalize D {7692 775}
       68 803 775
       % ::unicode::normalizeS KD "\u1d2c"
       A
       %
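
       The values in these examples can be cross-checked independently of Tcl. The sketch below uses Python's standard unicodedata
       module rather than the unicode package, so it only confirms what the Unicode data says, not how the package behaves; the variable
       names are illustrative. It also prints the canonical combining classes, which explain why the decomposition of {7692 775} lists
       803 (dot below) before 775 (dot above).

```python
import unicodedata

# Code points of the four Cyrillic letters U+0410..U+0413.
cyrillic = [ord(ch) for ch in "\u0410\u0411\u0412\u0413"]
print(cyrillic)

# Code points 49..53 are the ASCII digits 1..5.
digits = "".join(chr(cp) for cp in [49, 50, 51, 52, 53])
print(digits)

# 7692 is U+1E0C (D with dot below), 775 is U+0307 (combining dot above).
decomposed = unicodedata.normalize("NFD", chr(7692) + chr(775))
decomposed_codes = [ord(ch) for ch in decomposed]
print(decomposed_codes)

# Combining marks are sorted by canonical combining class during normalization.
combining_classes = [unicodedata.combining(ch) for ch in decomposed]
print(combining_classes)

# U+1D2C (modifier letter capital A) has a compatibility decomposition to "A".
compat = unicodedata.normalize("NFKD", "\u1d2c")
print(compat)
```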

REFERENCES

       [1]    "Unicode Standard Annex #15: Unicode Normalization Forms", (http://unicode.org/reports/tr15/)

AUTHORS

       Sergei Golovan

BUGS, IDEAS, FEEDBACK
       This document, and the package it describes, will undoubtedly contain bugs and other problems. Please report such in the category
       stringprep of the Tcllib SF Trackers [http://sourceforge.net/tracker/?group_id=12883]. Please also report any ideas for enhancements
       you may have for either package and/or documentation.

SEE ALSO

       stringprep(n)

KEYWORDS

       normalization, unicode

COPYRIGHT

       Copyright (c) 2007, Sergei Golovan <sgolovan@nes.ru>

stringprep							       1.0.0								unicode(n)
