replace UTF-8 characters with tr (Shell Programming and Scripting)
Post 302209212 by ripat on Thursday 26th of June 2008 07:43:43 AM
Thanks, but I couldn't get awk to do the job:

Code:
$ echo "Je vais à l'école" | awk '{gsub(/[[=e=]]/, "*")}1'
J* vais à l'écol*

Despite what the man page says:
Quote:
Equivalence Classes
An equivalence class is a locale-specific name for a list of characters that are equivalent. The name is
enclosed in [= and =]. For example, the name e might be used to represent all of “e,” “é,” and “è.” In
this case, [[=e=]] is a regular expression that matches any of e, é, or è.

These features are very valuable in non-English speaking locales. The library functions that gawk uses for regular
expression matching currently only recognize POSIX character classes; they do not recognize collating symbols
or equivalence classes.
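
Since the quoted text says gawk does not recognize equivalence classes, one possible workaround is to spell out the members of the class by hand. The sketch below is only an illustration, not something the man page proposes: the function name star_e and the particular list of accented letters are assumptions. It uses alternation of literal strings rather than a bracket expression, so it should not depend on whether the awk in use handles multibyte characters.

```shell
#!/bin/sh
# star_e (hypothetical name): replace "e" and a hand-picked list of its
# accented forms with "*", without relying on [[=e=]].
star_e() {
    # Each alternative is matched as a literal string, so awk does not
    # need to know that "é" is more than one byte in UTF-8.
    awk '{ gsub(/e|é|è|ê|ë/, "*") } 1'
}

printf '%s\n' "Je vais à l'école" | star_e
```

With the sample sentence this should print J* vais à l'*col*, which is what [[=e=]] was expected to give. The obvious drawback is that the list has to be maintained by hand for every base letter.
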
I have a little hope for tr, but I will have to wait:
Quote:
Currently `tr' fully supports only single-byte characters.
Eventually it will support multibyte characters; when it does, the `-C'
option will cause it to complement the set of characters, whereas `-c'
will cause it to complement the set of values.
Eventually...

All that noise about i18n in Linux is only buzz, and I am not good enough at C to change the source code of tr.
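
Until tr handles multibyte characters, a byte-oriented substitution can stand in for it. The following is a minimal sketch under stated assumptions: the function name strip_accents and the short mapping table are made up for illustration, and the input is assumed to be UTF-8. Forcing LC_ALL=C makes sed treat each accented letter as a fixed sequence of bytes, so the result should not depend on the locale support of the sed in use.

```shell
#!/bin/sh
# strip_accents (hypothetical name): map a few accented letters to their
# ASCII base letter. The table is deliberately short; extend it as needed.
strip_accents() {
    # One s command per character: a bracket expression such as [àâ]
    # would be split into single bytes in the C locale and match wrongly.
    LC_ALL=C sed \
        -e 's/à/a/g' -e 's/â/a/g' \
        -e 's/é/e/g' -e 's/è/e/g' -e 's/ê/e/g' -e 's/ë/e/g' \
        -e 's/ç/c/g'
}

printf '%s\n' "Je vais à l'école" | strip_accents
```

For the sample sentence this should print Je vais a l'ecole. Unlike tr, every mapping has to be written out as its own substitution, so this only scales to a modest set of characters.
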
 
