Shell script remove bad character


 
# 1  
Old 10-10-2012
Shell script remove bad character

How can I remove a character from within my shell script? The character I want to remove is Â, and it appears in a .html file.
# 2  
Old 10-10-2012
Code:
sed 's/Â//g' a.html

or
tr -d "Â" < a.html

Last edited by aashish.sharma8; 10-10-2012 at 11:56 AM.. Reason: added tr
# 3  
Old 10-10-2012
The tr version is what I need, but it will not work in the script I wrote:

Code:
## Replace the copyright sign in .html files with its HTML entity
for htmlfile in $(find "$DIRECTORY" -type f -name '*.html'); do
	TFILE="/tmp/$directoryname.$$"

	FROM='©'
	TO='\&\#169\;'
	sed "s/$FROM/$TO/g" "$htmlfile" > "$TFILE" && mv "$TFILE" "$htmlfile"
done

I've tried:
Code:
TO='\&\#169\;'
tr -d 'Â'

Code:
TO='\&\#169\;'
tr -d "Â"

Code:
TO='\&\#169\;'
tr 'Â'

For some reason, after the code runs, the © is replaced but an Â appears before the HTML entity. Any ideas?
# 4  
Old 10-10-2012
Character set disagreements between your terminal and the file, probably.

Try this:

Code:
tr -d '[\200-\377]' < inputfile > outputfile

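This deletes every byte with the high bit set (octal 200 through 377), so all non-ASCII characters disappear, not just the Â. Note that POSIX tr treats the square brackets as literal characters and would also delete [ and ] from the file, so write '\200-\377' without them unless your tr is an old System V one that requires brackets.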
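To see what the deletion does at the byte level, here is a minimal sketch. It assumes the file is UTF-8, where © is stored as the two bytes \302\251; the function name strip_high_bytes is made up for illustration and is not part of the original script.

```shell
# Hypothetical helper: delete every byte with the high bit set (octal 200-377).
# LC_ALL=C makes tr work on single bytes regardless of the session locale.
strip_high_bytes() {
    LC_ALL=C tr -d '\200-\377'
}

# A UTF-8 copyright sign is the two bytes \302\251; both of them are removed,
# so the © itself is lost along with the stray Â.
printf 'Copyright \302\251 2012\n' | strip_high_bytes
```

Because the © is deleted as well, this only suits files where no non-ASCII character needs to survive.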
# 5  
Old 10-10-2012
Quote:
Originally Posted by Corona688
Character set disagreements between your terminal and the file, probably.

Try this:

Code:
tr -d '[\200-\377]' < inputfile > outputfile

This should get rid of all UTF-8 extended characters.
Can you explain what this is doing?

tr = Translate, squeeze, and/or delete characters
-d = delete the listed characters instead of translating them

but I'm lost here:
'[\200-\377]'

and here:
< inputfile > outputfile
# 6  
Old 10-10-2012
Code:
# decimal 169 (the © byte in ISO-8859-1) = octal 251
tr -d '\251'

# a range: every byte from \252 to \254
tr -d '\252-\254'
# or list individual bytes
tr -d '\251\253\221'

# \NNN = the byte with octal value NNN

If you need a conversion tool, ksh93 for example has one built in:
Code:
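dec=169
# -i8 declares an integer that is displayed in base 8
typeset -i8 oct
oct=$dec
echo $oct    # ksh93 prints this as 8#251
# or binary
typeset -i2 bin
bin=$dec
echo $bin    # ksh93 prints this as 2#10101001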
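If ksh93 is not available, printf can do the same conversion in any POSIX shell. This is a small sketch of that alternative, not something posted in the thread; the variable names are only for illustration.

```shell
dec=169
# %o formats the number in octal, %x in hexadecimal
oct=$(printf '%o' "$dec")
hex=$(printf '%x' "$dec")
echo "$oct"   # 251
echo "$hex"   # a9
```

The octal form is the one tr expects after the backslash, as in tr -d '\251'.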


Last edited by kshji; 10-10-2012 at 12:38 PM..
# 7  
Old 10-10-2012
Sorry, I'm still pretty new to bash. I've tried running tr -d 'Â', but the script stops in the middle with no indication of what is going on.
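Two things are probably going on here. First, tr is a filter: run with no input redirected to it, it waits for keyboard input, which looks like the script has hung. Second, if the .html files are UTF-8 (an assumption, the thread never confirms it), the © is stored as two bytes, \302\251, and a sed that only matches the second byte leaves the first one behind, which a Latin-1 viewer shows as Â. The sketch below replaces both bytes at once; the function name fix_copyright is hypothetical.

```shell
# Hypothetical helper: replace the UTF-8 copyright sign with its HTML entity.
fix_copyright() {
    # © in UTF-8 is the two bytes \302\251; building it with printf keeps the
    # script pure ASCII, so the editor's encoding no longer matters.
    copy=$(printf '\302\251')
    # LC_ALL=C makes sed match raw bytes; \& is a literal ampersand.
    LC_ALL=C sed "s/$copy/\\&#169;/g"
}

# The filter reads standard input, so always give it a pipe or a redirect.
printf 'Copyright \302\251 2012\n' | fix_copyright
```

Inside the original loop this would be used as fix_copyright < "$htmlfile" > "$TFILE" && mv "$TFILE" "$htmlfile", with no separate tr step.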