The UNIX and Linux Forums  

  #1 (permalink)  
Old 07-09-2008
Registered User
 

Join Date: Jul 2008
Posts: 5
Replacing French special characters

Hi,

I have tonnes of .txt files that are written in French. However, I need to replace the French accented characters with unaccented English equivalents (e.g. é -> e and ç -> c).

I have tried this

---

#!/bin/bash
# Convert French characters to normal characters

# Treat each of the files

exec 3<&0
exec 0<frenchCharacters.txt

while read currentFrenchCharacter
do
read currentReplacementCharacter
sed -e "s/$currentFrenchCharacter/$currentReplacementCharacter/g" $1 > $1.frenchCharactersReplaced
mv $1.frenchCharactersReplaced $1
done

# Close the file
exec 3<&0

---

where "frenchCharacters.txt" contains a list of characters, one per line, in alternating pairs: first the character to find, then the character to replace it with.

The problem is that it doesn't make any changes to the file that I send in (stored in $1). Anyone know why? Also, anyone know of a better way to do this?
  #2 (permalink)  
Old 07-23-2008
joeyg's Avatar
premier etoile de match
 

Join Date: Dec 2007
Location: Home of world champion Boston Celtics
Posts: 625
perhaps this might help..

I don't think you can do it with sed. sed replaces one pattern with another; it does not lend itself to looking characters up in a list, as you were starting to program.

I used tr below. You will need to know the octal values of the characters in your character set; the values below are the ISO-8859-1 (Latin-1) codes.


Code:
> cat file1 
numero
telefono
vehiculo

> cat file1 | tr "u" "\372" | tr "e" "\351" | tr "i" "\354" >file2
> cat file2
núméro
téléfono
véhìcúlo

> cat file2 | tr "\372" "u" | tr "\351" "e" | tr "\354" "i" >file3
> cat file3
numero
telefono
vehiculo
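
The three chained tr calls can also be folded into one, since tr maps the Nth character of its first argument to the Nth character of its second. A minimal sketch, assuming the input really is ISO-8859-1 (one byte per character); the function name latin1_unaccent is made up for illustration. GNU tr, for one, works byte by byte, so this approach does not carry over to UTF-8 files, where each accented letter takes two bytes.

```shell
#!/bin/bash
# Hypothetical helper: strip three accented letters from Latin-1 text on stdin.
# Octal codes in ISO-8859-1: 351 = e-acute, 372 = u-acute, 354 = i-grave.
latin1_unaccent() {
    # LC_ALL=C keeps tr working on raw bytes whatever the user's locale is.
    LC_ALL=C tr '\351\372\354' 'eui'
}
```

Usage would look like: latin1_unaccent < file2 > file3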
  #3 (permalink)  
Old 07-23-2008
Registered User
 

Join Date: Oct 2006
Location: Belgium
Posts: 171
No magic tool. Just sed:
Code:
sed 's/[àâä]/a/g; s/[ÀÂÄ]/A/g; s/[éèêë]/e/g; s/[ÉÈÊË]/E/g; s/[îï]/i/g;
s/[ÎÏ]/I/g; s/[ôö]/o/g; s/[ÖÔ]/O/g; s/[ûüù]/u/g; s/[ÛÜÙ]/U/g; s/ç/c/g; s/Ç/C/g' your_file
If you want sed to change your file in place, just add the -i switch (a GNU sed option).

Code:
sed -i 'sed command' your_file
But be careful: without a backup suffix there is no way back.
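
Since -i overwrites the file, it can be worth previewing the result first. A minimal sketch, assuming this script and the text files are both saved as UTF-8; the function name unaccent_preview and the .new suffix are made up for illustration, and only a handful of characters are listed. It is a variation on the command above rather than the same thing: each character gets its own literal substitution under LC_ALL=C, so the outcome does not depend on sed's locale matching the file's encoding.

```shell
#!/bin/bash
# Hypothetical helper: write an unaccented copy to "$1.new" and leave "$1" alone.
unaccent_preview() {
    # LC_ALL=C makes sed compare raw bytes, so each UTF-8 letter is matched
    # as a literal byte sequence regardless of the user's locale.
    LC_ALL=C sed -e 's/é/e/g' -e 's/è/e/g' -e 's/ê/e/g' -e 's/ë/e/g' \
                 -e 's/à/a/g' -e 's/â/a/g' -e 's/ç/c/g' \
                 -- "$1" > "$1.new"
}
```

After running unaccent_preview your_file, compare the two with diff your_file your_file.new and only then move the new file over the original.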
  #4 (permalink)  
Old 07-23-2008
Registered User
 

Join Date: Jul 2008
Location: BlackMesh Managed Hosting
Posts: 61
My copy of sed supports y//; is that a GNU extension? Seems tailor-made for the problem.
  #5 (permalink)  
Old 07-23-2008
Registered User
 

Join Date: Oct 2006
Location: Belgium
Posts: 171
Well, yes, it fits (and y is part of standard POSIX sed, not a GNU extension). And it's even more elegant:

Code:
SPEC_CHAR="ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖØÙÚÛÜÝÞßàáâãäåæçèéêëìíîïðñòóôõöøùúûüýþÿ"
NORM_CHAR="AAAAAAACEEEEIIIIDNOOOOOOUUUUYPSaaaaaaaceeeeiiiionoooooouuuuyby"

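# y maps characters by position, so both strings must have the same length.
sed -i.bk "y/$SPEC_CHAR/$NORM_CHAR/" file-to-process
Given a suffix such as .bk, the in-place switch -i keeps a backup of the original file under that extension (here file-to-process.bk).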
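
Coming back to the script in the first post, the file-driven idea can be kept as well. A rough sketch, assuming frenchCharacters.txt alternates lines as described there (character to find, then its replacement), ends with a newline, and contains no /, &, backslash or regex metacharacters; the function name replace_from_map is made up for illustration. It builds one sed script and runs sed once instead of rewriting the file per character, and it reads the map through a redirection on the loop, so no exec juggling is needed (the original's closing exec 3<&0 repeats the save step; restoring stdin would be exec 0<&3).

```shell
#!/bin/bash
# Hypothetical helper: replace_from_map MAPFILE TARGET
# MAPFILE alternates lines: character to find, then its replacement.
replace_from_map() {
    local map=$1 target=$2 script='' from to
    # Read two lines per iteration; IFS= and -r keep each line byte-for-byte.
    while IFS= read -r from && IFS= read -r to; do
        script+="s/$from/$to/g"$'\n'
    done < "$map"
    # Byte-wise matching (LC_ALL=C) so the result does not depend on the locale.
    # The original is only replaced if sed succeeded.
    LC_ALL=C sed -e "$script" -- "$target" > "$target.tmp" &&
        mv -- "$target.tmp" "$target"
}
```

It would be called as replace_from_map frenchCharacters.txt "$1". Why the original made no changes is not settled in the thread; a mismatch between the encoding of the map file and that of the text files is one thing to check.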