Grep a file that may contain strange characters


 
# 1  
Old 01-07-2012

Hello Unix users :)

I am trying to grep for a string in a file, where both the file and the string may contain unusual characters, such as the ü in würzburger.

Well, bash reads this as
Code:
W%C3%BCrzburger

For example, if I run
Code:
wget W%C3%BCrzburger

the output is:
Code:
--2012-01-08 02:54:29--  http://w%C3%BCrzburger/
Resolving würzburger... failed: Name or service not known.
wget: unable to resolve host address `würzburger'

On the other hand, if a file has this word inside it, and I do
Code:
cat file | grep W%C3%BCrzburger

I get no match :/

Why is this? How can this be solved?
Many thanks :)
# 2  
Old 01-07-2012
Code:
grep w$'\xC3\xBC'rzburger /tmp/2

würzburger

cat /tmp/2:

wcrzburger
würzburger
wcrzburger
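
Some background on why the command above matches: `$'...'` is bash's ANSI-C quoting, and `\xHH` inside it expands to the raw byte `HH`. The bytes C3 BC are the UTF-8 encoding of ü, so grep receives the literal word. A minimal sketch, assuming bash and a UTF-8 file; the variable names are illustrative only, and the three test lines are fed on stdin rather than read from /tmp/2:

```shell
#!/usr/bin/env bash
# $'...' is ANSI-C quoting: \xHH inside it becomes the raw byte HH.
u_umlaut=$'\xC3\xBC'        # the two UTF-8 bytes that encode "ü"

# Adjacent quoted and unquoted pieces join into one word, so this is
# the same argument that grep receives in the command above.
pattern=w${u_umlaut}rzburger

# Same three lines as /tmp/2 in the post, piped in instead of read from disk.
printf 'wcrzburger\nwürzburger\nwcrzburger\n' | grep -- "$pattern"
```

Only the middle line is printed, because only that line contains the bytes C3 BC between "w" and "rzburger".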
# 3  
Old 01-08-2012

Quote:
Originally Posted by dude2cool
Code:
grep w$'\xC3\xBC'rzburger /tmp/2

würzburger

cat /tmp/2:

wcrzburger
würzburger
wcrzburger
Thanks, nice solution, it works as expected. The problem is that this grep is part of a large script. Is there any way to detect whether the term I want to search for contains unusual letters like the one above?

And if so, is there a way to replace the %C3-style escapes with \xC3-style ones before searching, without altering the rest of the string?

That seems a bit difficult :O
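
One possible approach, sketched under some assumptions: the `%XX` form is URL percent-encoding, which grep does not decode, so the term can be decoded to raw bytes once and then searched as a fixed string. The helpers `urldecode` and `has_escapes` below are hypothetical names, not existing commands, and the sketch assumes bash (for `[[ =~ ]]` and `printf -v`). Only a `%` followed by two hex digits is converted; the rest of the string is copied through unchanged.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the thread): decode %XX escapes into raw bytes.
# A '%' that is not followed by two hex digits is left alone.
# %00 is not supported, since bash strings cannot hold a NUL byte.
urldecode() {
    local rest=$1 decoded= byte
    local re='^(.*)%([0-9A-Fa-f]{2})(.*)$'
    # The greedy (.*) makes each pass match the LAST %XX still in $rest,
    # so the result is assembled from right to left.
    while [[ $rest =~ $re ]]; do
        printf -v byte '%b' "\\x${BASH_REMATCH[2]}"
        decoded=$byte${BASH_REMATCH[3]}$decoded
        rest=${BASH_REMATCH[1]}
    done
    printf '%s' "$rest$decoded"
}

# Hypothetical detector: true if the term contains at least one %XX escape.
has_escapes() {
    local re='%[0-9A-Fa-f]{2}'
    [[ $1 =~ $re ]]
}

term='w%C3%BCrzburger'
if has_escapes "$term"; then
    term=$(urldecode "$term")
fi

# -F treats the term as a fixed string, so any regex metacharacters in it are inert.
printf 'wcrzburger\nwürzburger\nwcrzburger\n' | grep -F -- "$term"
```

In a real script the `printf ... |` part would be replaced by the file to search. Adding `-i` to grep may help if the case of the term and the file differ, as with the capital W in the first post.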