Find Unicode Character in File
I have a very large file on Unix that I would like to search for all instances of the Unicode character 0x17. I need to remove these characters because they cause my SAX parser to throw an exception. Does anyone know how to find a Unicode character in a file?
Thank you for your assistance.
"0x17" is not a Unicode (UTF-16 or UTF-32) character per se; it is a single byte value, and how it appears in the file depends on the encoding.
For those not familiar with Unicode, UTF-16 basically means that most "characters" are stored as 2 bytes (characters outside the Basic Multilingual Plane take 4), whereas UTF-32 means every "character" is stored as 4 bytes. On a practical level, it means that standard ASCII characters are accompanied by a single NUL (0x00) in UTF-16 or by 3 NULs in UTF-32; the NULs precede the ASCII byte when the data is stored Big-Endian and follow it when it is stored Little-Endian. Which Unicode "format" is your file using?
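To make the byte layout concrete, here is a rough Python sketch (not something from the original thread). It prints how "A" followed by 0x17 is stored in each encoding, and shows one possible way to strip the character once the encoding is known. The names `show_layout`, `strip_char` and `strip_file` are made up for illustration, and the sketch assumes the file decodes cleanly in the encoding you pass in.

```python
# Hypothetical helpers for illustration only; names are not from the thread.
SAMPLE = "A\x17"  # the letter A followed by the control character 0x17


def show_layout(text):
    """Return a dict mapping each encoding name to the hex bytes of `text`."""
    encodings = ["utf-8", "utf-16-be", "utf-16-le", "utf-32-be", "utf-32-le"]
    return {enc: text.encode(enc).hex(" ") for enc in encodings}


def strip_char(data, encoding, bad="\x17"):
    """Decode `data`, drop every occurrence of `bad`, and re-encode."""
    return data.decode(encoding).replace(bad, "").encode(encoding)


def strip_file(src, dst, encoding, bad="\x17", chunk_chars=1 << 20):
    """Copy `src` to `dst` without `bad`, reading in chunks so that a
    very large file never has to fit in memory at once."""
    # newline="" keeps the original line endings untouched.
    with open(src, "r", encoding=encoding, newline="") as fin, \
         open(dst, "w", encoding=encoding, newline="") as fout:
        while True:
            chunk = fin.read(chunk_chars)  # counts characters, not bytes
            if not chunk:
                break
            fout.write(chunk.replace(bad, ""))


if __name__ == "__main__":
    for enc, hexbytes in show_layout(SAMPLE).items():
        print(f"{enc:10} {hexbytes}")
```

Run as a script, this lists the NUL padding described above: the 0x00 bytes come first in the big-endian variants and last in the little-endian ones. For `strip_file`, passing an explicit byte order such as "utf-16-le" rather than plain "utf-16" is probably the safer choice, since it leaves any byte order mark in the file as it was.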