Replacing French special characters
Posted by BlueberryPickle in Shell Programming and Scripting on Wednesday 9th of July 2008, 04:34:45 AM (post 302213041)

Hi,

I have tonnes of .txt files that are written in French, and I need to replace the French special characters with unaccented English equivalents (e.g. é -> e and ç -> c).

I have tried this:

---

#!/bin/bash
# Convert French characters to normal characters

# Save the original stdin on fd 3, then read the character list on stdin
exec 3<&0
exec 0<frenchCharacters.txt

# Each pair of lines is: character to find, then its replacement
while read -r currentFrenchCharacter
do
read -r currentReplacementCharacter
sed -e "s/$currentFrenchCharacter/$currentReplacementCharacter/g" "$1" > "$1.frenchCharactersReplaced"
mv "$1.frenchCharactersReplaced" "$1"
done

# Restore the original stdin and close fd 3
exec 0<&3 3<&-

---

where "frenchCharacters.txt" contains a list of characters on alternating lines: each character to find is followed, on the next line, by the character to replace it with.

The problem is that it doesn't make any changes to the file that I send in (stored in $1). Anyone know why? Also, anyone know of a better way to do this?
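
One possible alternative, offered as a sketch and not as a diagnosis of the script above: do all the replacements in a single sed pass instead of rewriting the file once per character. It assumes the script and the .txt files are both UTF-8 encoded (if the two encodings differ, the patterns will not match anything). The function names strip_french_accents and convert_files are made up for this example, and the character list is only a sample that can be extended.

```bash
#!/bin/bash
# Sketch only: assumes this script and the input files are both UTF-8.
# strip_french_accents and convert_files are illustrative names.

# Filter: read text on stdin, write it to stdout with accents replaced.
# One s command per character keeps each pattern a plain byte sequence,
# so the match does not depend on the current locale.
strip_french_accents() {
    sed -e 's/à/a/g' -e 's/â/a/g' -e 's/ä/a/g' \
        -e 's/ç/c/g' \
        -e 's/é/e/g' -e 's/è/e/g' -e 's/ê/e/g' -e 's/ë/e/g' \
        -e 's/î/i/g' -e 's/ï/i/g' \
        -e 's/ô/o/g' -e 's/ö/o/g' \
        -e 's/ù/u/g' -e 's/û/u/g' -e 's/ü/u/g' \
        -e 's/À/A/g' -e 's/Ç/C/g' \
        -e 's/É/E/g' -e 's/È/E/g' -e 's/Ê/E/g'
}

# Apply the filter to each named file, overwriting it only on success.
convert_files() {
    local file tmp
    for file in "$@"; do
        tmp=$(mktemp) || return 1
        if strip_french_accents < "$file" > "$tmp"; then
            # Copy the contents back so the file keeps its permissions.
            cat "$tmp" > "$file"
            rm -f -- "$tmp"
        else
            rm -f -- "$tmp"
            return 1
        fi
    done
}
```

With those functions defined, calling convert_files *.txt would process every .txt file in the current directory, so it is worth trying on copies first. Where GNU iconv is available, iconv -f UTF-8 -t ASCII//TRANSLIT is another option, though its output varies with the locale and the iconv implementation.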
 
