Need help with search & replace


 
# 1  
Old 08-30-2006

I have a file that shows accented characters when viewed in some text editors, but in vi they come up as ~R and ~U. I need to write a script to remove these characters from the file, but so far I have been unsuccessful. I am not sure how sed, awk, or similar tools see them, so I do not know what to tell them to replace. I have tried a few different options with no luck.

The file I am working with is attached. Some lines that contain invalid characters are: Line 99: “Schuckís Auto Parts JOB FAIR ...”, Line 882: “ìPoint of Viewî heart of Silverdale...”.

Please let me know if you have any suggestions. Thanks.
# 2  
Old 08-30-2006
Hi,
I'm on HP-UX 11.11 and I cannot see anything unusual using vi.
A "cat -vetn testfile.txt | more" shows the characters as M-^E and M-^R.
Do a "man cat" and read the description of "-v", which talks about single-byte control characters.
All we need to do is work out their ASCII code numbers and use a command like
# cat testfile.txt | tr -d "[ASCII CODES??]" > testfile2.txt
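
The M-^R style output from cat -v can be decoded by hand into the octal codes that tr expects. The following is only a sketch, assuming a POSIX shell and a cat that supports -v; sample.txt is a made-up file for illustration, not the attachment from this thread.

```shell
# Build a tiny sample containing the bytes octal 222 and octal 205.
# (sample.txt is a hypothetical name used only for this illustration.)
printf 'Schuck\222s Auto Parts JOB FAIR \205\n' > sample.txt

# cat -v renders byte 0222 as M-^R and byte 0205 as M-^E.
# "M-" means the high bit (octal 200) is set; ^R is control-R (octal 022),
# so M-^R is 0200 + 022 = octal 222. Likewise ^E is octal 005,
# so M-^E is 0200 + 005 = octal 205.
LC_ALL=C cat -v sample.txt

# od -c prints non-printable bytes as three-digit octal numbers,
# which confirms the codes worked out above.
LC_ALL=C od -c sample.txt
```

The octal values found this way are what goes after the backslashes in a tr set such as '\205\222'.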
# 3  
Old 08-31-2006
You need to do the following......
#Remove all the "GOOD" characters first
cat testfile.txt | tr -d '[:print:]' > badfile1

#Then find the ASCII codes of the bad characters
vis badfile1 | more

(Write down the ASCII codes you see as you scroll through....)
(I found these codes in your file.....)
(\205 \222 \223 \224 \225)

#Remove the "bad" characters from testfile.txt
#(No square brackets around the codes, otherwise tr would also delete literal [ and ] characters)
cat testfile.txt | tr -d '\205\222\223\224\225' > good_testfile.txt

Done!
NOTE 1: You had better look into what is causing these "bad" characters to be embedded in your file.
NOTE 2: Try to find out what the above ASCII codes are!

The above ASCII codes were found in this particular file. Your next file may contain the same ones, different ones, or both, so check again before removing.
Good luck!
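
The three steps above can be tried end to end on a small sample. This is a sketch only: it assumes a POSIX shell and substitutes od for vis (vis is not installed everywhere), and the testfile.txt it creates is a made-up stand-in for the real attachment.

```shell
# Hypothetical sample standing in for testfile.txt.
printf 'Schuck\222s Auto Parts \205 \223Point of View\224\n' > testfile.txt

# Step 1: delete every printable byte. Control characters such as the
# newline and the suspect high-bit bytes remain. LC_ALL=C keeps tr
# working on single bytes.
LC_ALL=C tr -d '[:print:]' < testfile.txt > badfile1

# Step 2: dump what is left as octal, one three-digit code per byte.
# The newline shows up as 012 alongside the suspect bytes.
od -An -b badfile1

# Step 3: delete the suspect bytes found in step 2 from the original file.
LC_ALL=C tr -d '\205\222\223\224' < testfile.txt > good_testfile.txt
cat good_testfile.txt
```

Reading the file with input redirection instead of cat is a matter of taste; the result is the same.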

Last edited by vino; 08-31-2006 at 02:29 AM.. Reason: disabled smilies
# 4  
Old 08-31-2006
One line command to find the funny characters....
# cat testfile.txt | tr -d "[:print:]" | sort | vis

That way you will end up with a complete list of ascii codes that you need to remove

If I have time I'll try and work out how to just get a unique list of just the codes
# 5  
Old 08-31-2006
This is awesome! I really appreciate the help.

I have a question, though. I ran through the process, and running the command with the ASCII codes you found removed the characters I was having trouble with. However, when I ran this command:

cat testfile.txt | tr -d '[:print:]' > badfile1

to get ascii codes myself, all I got when I ran this:

vis badfile1 | more

was these characters: (no ascii codes)

\M^R & \M^E

Am I doing something wrong? I was expecting it to show me the ascii codes. Maybe I missed a step.

Last edited by tcovert; 08-31-2006 at 01:01 PM..
# 6  
Old 08-31-2006
Never mind. I added the -o option to vis and it gave me the ASCII codes. Thanks.

But, if you think of a way that we can only display unique ascii codes, that would be great.

Thanks again.
# 7  
Old 08-31-2006
OK, this is it....
To get a unique list of just the bad characters in your file, try this....

cat testfile.txt | tr -d "[:print:][:space:]" | vis | fold -w 4 | sort -u

Deleting "[:space:]" removes the newline characters, so we end up with a single line holding all the codes. fold -w 4 then breaks that line into four-character pieces, one code per line, and sort -u lists each code only once.

Good luck.... :-)
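
The unique list can also be fed straight back into tr so the codes do not have to be copied by hand. This is a sketch under assumptions, not the method used in the thread: it replaces vis with od, assumes a POSIX shell, and the variable names codes and set_to_delete as well as the sample file contents are invented for illustration.

```shell
# Hypothetical sample standing in for testfile.txt; \222 appears twice on purpose.
printf 'Schuck\222s \223Point of View\224 it\222s here\205\n' > testfile.txt

# Keep only the non-printable, non-whitespace bytes, print each one as
# three-digit octal (od -An -b), put one code per line, and de-duplicate.
codes=$(LC_ALL=C tr -d '[:print:][:space:]' < testfile.txt |
        od -An -b | tr -s ' ' '\n' | grep . | sort -u)
echo "$codes"

# Turn the list into a tr set such as \205\222\223\224 and delete those bytes.
# $codes is deliberately unquoted so printf receives one argument per code.
if [ -n "$codes" ]; then
    set_to_delete=$(printf '\\%s' $codes)
    LC_ALL=C tr -d "$set_to_delete" < testfile.txt > good_testfile.txt
else
    # Nothing suspicious found: keep the file unchanged.
    cp testfile.txt good_testfile.txt
fi
cat good_testfile.txt
```

Check the printed list before trusting the automatic delete, since any non-printable byte that is not whitespace ends up in it.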