07-09-2008
Replacing French special characters
Hi,
I have tonnes of .txt files written in French, and I need to replace the French special characters with unaccented English equivalents (e.g. é -> e and ç -> c).
I have tried this:
---
#!/bin/bash
# Convert French characters to normal characters
# Treat each of the files
# Save the original stdin on descriptor 3, then read the mapping on stdin
exec 3<&0
exec 0<frenchCharacters.txt
while read -r currentFrenchCharacter
do
read -r currentReplacementCharacter
sed -e "s/$currentFrenchCharacter/$currentReplacementCharacter/g" "$1" > "$1.frenchCharactersReplaced"
mv "$1.frenchCharactersReplaced" "$1"
done
# Restore the original stdin and close the saved descriptor
exec 0<&3 3<&-
---
where "frenchCharacters.txt" contains a list of characters, one per line, in alternating order: each character to find is followed on the next line by the character to replace it with.
The problem is that it doesn't make any changes to the file that I pass in (stored in $1). Does anyone know why? Also, does anyone know of a better way to do this?
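One possible approach, sketched here under some assumptions: build a single sed script from the mapping file and run sed once, rather than rewriting the file for every character. The function name replace_chars and the .tmp suffix are made up for this example. It assumes the mapping file alternates lines as described above, that it uses the same encoding as the text files (a mismatch, say UTF-8 against ISO-8859-1, would be one plausible reason for nothing being replaced), and that it contains no characters that are special to sed, such as / or &.

```shell
# Hypothetical helper: replace_chars MAPPING_FILE TARGET_FILE
# MAPPING_FILE alternates lines: character to find, then its replacement.
replace_chars() {
    map_file=$1
    target=$2

    # Turn each pair of mapping lines into one "s/find/replace/g" command.
    sed_script=$(
        while IFS= read -r from && IFS= read -r to; do
            printf 's/%s/%s/g\n' "$from" "$to"
        done < "$map_file"
    )

    # Apply every substitution in one pass; overwrite the original
    # only if sed succeeded.
    sed -e "$sed_script" "$target" > "$target.tmp" && mv "$target.tmp" "$target"
}
```

It could be called as replace_chars frenchCharacters.txt somefile.txt, or from a loop over all the .txt files. Reading the mapping with a plain redirection on the loop also avoids having to save and restore stdin.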
paste(1) General Commands Manual paste(1)
Name
paste - merge file data
Syntax
paste file1 file2...
paste -dlist file1 file2...
paste -s [-dlist] file1 file2...
Description
In the first two forms, paste concatenates corresponding lines of the given input files file1, file2, etc. It treats each file as a column or
columns of a table and pastes them together horizontally (parallel merging).
In the last form, the command combines subsequent lines of the input file (serial merging).
In all cases, lines are glued together with the tab character, or with characters from an optionally specified list. Output is to the
standard output, so it can be used as the start of a pipe, or as a filter, if - is used in place of a file name.
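As a rough illustration of the two modes, the sketch below builds two small throwaway files (their names and contents are made up for the example) and merges them both ways.

```shell
# Hypothetical sample data, created only for this demonstration.
dir=$(mktemp -d)
printf 'a\nb\nc\n' > "$dir/letters"
printf '1\n2\n3\n' > "$dir/numbers"

# Parallel merging: line N of each file becomes one tab-separated output line.
parallel=$(paste "$dir/letters" "$dir/numbers")

# Serial merging: all lines of a file are joined into one line, file by file.
serial=$(paste -s "$dir/letters" "$dir/numbers")

printf '%s\n' "$parallel" "$serial"
rm -rf "$dir"
```

The parallel result has three lines of two fields each, while the serial result has two lines of three fields each.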
Options
- Used in place of any file name, to read a line from the standard input. (There is no prompting).
-dlist Replaces characters of all but last file with non-tab characters (default tab). One or more characters immediately following -d
replace the default tab as the line concatenation character. The list is used circularly, i.e. when exhausted, it is reused. In
parallel merging (i.e. no -s option), the lines from the last file are always terminated with a new-line character, not from the
list. The list may contain the special escape sequences:
\n (new-line), \t (tab), \\ (backslash), and \0 (empty string, not a null
character). Quoting may be necessary, if characters have special meaning to the shell (for example, to get one backslash, use
-d"\\\\" ).
Without this option, the new-line characters of each but the last file (or last line in case of the -s option) are replaced by a
tab character. This option allows replacing the tab character by one or more alternate characters (see below).
-s Merges subsequent lines rather than one from each input file. Use tab for concatenation, unless a list is specified with -d
option. Regardless of the list, the very last character of the file is forced to be a new-line.
Examples
ls | paste -d" " -
list directory in one column
ls | paste - - - -
list directory in four columns
paste -s -d"\t\n" file
combine pairs of lines into lines
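To make the pairing behaviour concrete, here is a small sketch that feeds six made-up lines through serial merging with a two-character delimiter list. Because the list is reused circularly, the first line break becomes a tab, the second stays a new-line, and so on. The input data is invented for the example.

```shell
# Six made-up input lines; each odd line is joined to the even line after it.
pairs=$(printf 'é\ne\nç\nc\nà\na\n' | paste -s -d '\t\n' -)
printf '%s\n' "$pairs"
```

Each output line then holds one tab-separated pair, which is a convenient shape for a file that alternates a value with its counterpart.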
Diagnostics
line too long
Output lines are restricted to 511 characters.
too many files
Except for -s option, no more than 12 input files may be specified.
See Also
cut(1), grep(1), pr(1)