The UNIX and Linux Forums  

  #1 (permalink)  
Old 04-11-2008
Registered User
 

Join Date: Jun 2007
Posts: 3
Find Unicode Character in File

I have a very large file in Unix that I would like to search for all instances of the Unicode character 0x17. I need to remove these characters because they cause my SAX parser to throw an exception. Does anyone know how to find a Unicode character in a file?

Thank you for your assistance.
  #2 (permalink)  
Old 04-11-2008
Moderator
 

Join Date: Dec 2003
Location: /ksh93
Posts: 841
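"0x17" on its own is a byte value, not a Unicode (UTF-16 or UTF-32) encoded character per se. As a code point it is U+0017, the ETB control character, and how it is stored in your file depends on the encoding.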

For those not familiar with Unicode, UTF-16 basically means that
most "characters" are stored as 2 bytes (those outside the basic
range take 4), whereas UTF-32 means every "character" is stored
as 4 bytes.

On a practical level, it means that standard ASCII characters are
accompanied by a single NUL (0x00) in UTF-16 or by 3 NULs in UTF-32.
The NULs precede the character if data storage is Big-Endian and
follow it if data storage is Little-Endian.
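
To make the byte layout concrete, here is a small sketch that prints the bytes U+0017 would occupy in each encoding. The helper name show_bytes is made up for this illustration, and the byte sequences are typed in by hand with printf rather than read from your file.

```shell
# show_bytes (hypothetical helper): print stdin as space-separated hex bytes.
show_bytes() { od -An -tx1 | xargs; }

printf '\027'             | show_bytes   # ASCII / UTF-8: 17
printf '\000\027'         | show_bytes   # UTF-16 Big-Endian: 00 17
printf '\027\000'         | show_bytes   # UTF-16 Little-Endian: 17 00
printf '\000\000\000\027' | show_bytes   # UTF-32 Big-Endian: 00 00 00 17
printf '\027\000\000\000' | show_bytes   # UTF-32 Little-Endian: 17 00 00 00
```

Running the first few bytes of the real file through the same od command should show which of these patterns it follows.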

Which Unicode "format" is your file using?
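
If the answer turns out to be UTF-8 or a plain single-byte encoding, the character is just the single byte 0x17 (octal 027), and something along these lines may be enough. This is only a sketch under that assumption: the function names count_etb and strip_etb and the file names in the usage line are invented for the example. Do not run it on UTF-16 or UTF-32 data, because deleting single bytes there would break the 2-byte or 4-byte alignment.

```shell
# count_etb FILE: print how many 0x17 bytes FILE contains.
# tr -cd deletes every byte except \027; wc -c counts what is left.
count_etb() { LC_ALL=C tr -cd '\027' < "$1" | wc -c | tr -d ' '; }

# strip_etb IN OUT: write a copy of IN to OUT with every 0x17 byte removed.
# Assumes UTF-8 or a single-byte encoding (see the note above).
strip_etb() { LC_ALL=C tr -d '\027' < "$1" > "$2"; }
```

Usage, with placeholder file names: count_etb big.xml, then strip_etb big.xml clean.xml. Writing to a second file keeps the original intact in case the encoding guess was wrong.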