The UNIX and Linux Forums  

UNIX for Advanced & Expert Users Advanced UNIX and Linux questions go here. Expert-to-Expert.

  #1  
Old 02-09-2008
Registered User
 

Join Date: Feb 2008
Posts: 6
grep high bit char

Hi -

I have a file which contains high-bit Unicode characters such as ©. How can I use grep to find the lines that contain the copyright symbol ©?

I tried using

grep \x{00A9}
grep \x\{00A9\}

Thanks-

Last edited by ayyo1234; 02-09-2008 at 09:22 AM.
  #2  
Old 02-09-2008
Registered User
 

Join Date: Feb 2008
Posts: 6
Any suggestions

Any suggestions?

I need to use grep only.
  #3  
Old 02-09-2008
Registered User
 

Join Date: May 2005
Posts: 64
Try this:

grep '©' filename
  #4  
Old 02-09-2008
Registered User
 

Join Date: Feb 2008
Posts: 6
How would you type '©' in Unix? I am not sure whether it can be typed in Unix.

On Windows I can type it using Alt+0169.
  #5  
Old 02-10-2008
Registered User
 

Join Date: May 2005
Posts: 64
Quote:
Originally Posted by ayyo1234
How would you type '©' in Unix? I am not sure whether it can be typed in Unix.

On Windows I can type it using Alt+0169.

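To type '©' in Unix:

Press Shift+Alt+0 simultaneously. (This relies on the terminal treating Alt as a Meta key that sets the eighth bit: Shift+0 gives ')' (0x29), and 0x29 with the high bit set is 0xA9.)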
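
If the keystroke is not available on a given terminal, the character does not need to be typed at all. A minimal sketch, assuming the file is UTF-8 encoded (where © is the two bytes 0xC2 0xA9, octal 302 251): printf generates the bytes and grep receives them as the pattern. The helper name grep_copyright is made up for illustration.

```shell
# Sketch: search for the copyright sign without typing it.
# Assumes UTF-8 input, where U+00A9 is encoded as the bytes C2 A9
# (octal 302 251). grep_copyright is an illustrative name.
grep_copyright() {
    # LC_ALL=C makes grep compare raw bytes regardless of the locale.
    LC_ALL=C grep -- "$(printf '\302\251')" "$@"
}

# Usage: grep_copyright filename
```

For a Latin-1 (ISO-8859-1) file the character is the single byte 0xA9, so the pattern would be "$(printf '\251')" instead.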
  #6  
Old 02-11-2008
Registered User
 

Join Date: Feb 2008
Posts: 6
Thanks for your reply.

However, I am not able to type © in Unix.

I tried Shift+Alt+0.
  #7  
Old 02-11-2008
...@...
 

Join Date: Feb 2004
Location: NM
Posts: 4,298
POSIX grep does not look past a NUL character. 00A9 is the Unicode code point for the character you want. Stored as a 2-byte (UTF-16) value, one of its bytes is 00, the NUL character (the first byte in big-endian order).

grep will not do what you need. Consider writing something in C that reads short integers (2-byte integers) from the file and compares each one with 169. When you find 169, its position is the character offset in the file where the symbol is.
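
The same scan can be approximated without writing C. A rough sketch that uses od and awk in place of a C program, assuming the file holds 2-byte (UTF-16) units in the same byte order as the machine running the command; find_u16_copyright is a made-up name.

```shell
# Sketch: print the zero-based index of every 2-byte unit equal to 169 (0x00A9).
# Assumes the file's byte order matches the host's, because od decodes
# 2-byte integers in host byte order.
find_u16_copyright() {
    od -An -v -tu2 "$1" |
    awk 'BEGIN { n = 0 }
         { for (i = 1; i <= NF; i++) { if ($i == 169) print n; n++ } }'
}

# Usage: find_u16_copyright filename
```

The byte offset is twice the printed index. A leading byte-order mark, if present, counts as unit 0.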

You are probably better off using a Windows editor.

Found a version of grep from mkssoftware that claims to support Unicode:
grep, egrep, fgrep -- match patterns in a file

GNU grep has a -U (--binary) switch that treats files as binary, but it does not decode UTF-16 itself.
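
If the file is UTF-16, one way to keep using grep is to convert it to UTF-8 first, so that no NUL bytes reach grep. A hedged sketch, assuming little-endian UTF-16 input and an iconv that knows the UTF-16LE and UTF-8 encoding names; grep_copyright_utf16le is a hypothetical helper name.

```shell
# Sketch: convert UTF-16 to UTF-8 on the fly, then grep for the
# UTF-8 bytes of U+00A9 (C2 A9, octal 302 251).
# Assumes little-endian UTF-16; use UTF-16BE for big-endian files,
# or plain UTF-16 for files that start with a byte-order mark.
grep_copyright_utf16le() {
    iconv -f UTF-16LE -t UTF-8 "$1" |
    LC_ALL=C grep -- "$(printf '\302\251')"
}

# Usage: grep_copyright_utf16le filename
```

The matching lines are printed in UTF-8, not in the file's original encoding.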

Last edited by jim mcnamara; 02-11-2008 at 01:07 PM.
All times are GMT -7.


The UNIX and Linux Forums Content Copyright ©1993-2008. All Rights Reserved.