How to remove special characters?
Post 302830825 by Don Cragun on Tuesday 9th of July 2013 02:32:00 PM
Quote:
Originally Posted by ken6503
Hi Gurus,

I have a file that contains some Unicode characters like "ü". I want to replace them with other characters. I searched the internet and found this command:
Code:
 sed "s/ü/-/g"

but I don't know how to type ü on the UNIX command line.
Please help me with this one.

Thanks in advance
I guess I don't understand the problem. You have created the command line you want to use above. Why can't you just copy it and paste it into your shell? Or copy it and paste it into a shell script using your editor?

The way you type Unicode characters using a keyboard will vary depending on your operating system, your keyboard, and your current locale settings, but as long as your current locale and the character you're copying are both using the same underlying codeset, copy and paste should work.
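
If copy and paste is not an option, one possible workaround is to build the character from its encoded bytes instead of typing it. The sketch below assumes the file is encoded in UTF-8, where ü (U+00FC) is the two bytes 0xC3 0xBC; the variable names and the sample text are made up for illustration.

```shell
# Build the UTF-8 encoding of ü (U+00FC) from its two bytes, 0xC3 0xBC,
# written as octal escapes (which POSIX printf understands).
u_umlaut=$(printf '\303\274')

# Hypothetical sample input, also built from bytes so that nothing
# here has to be typed on a keyboard that lacks the character.
sample=$(printf 'M\303\274nchen and Z\303\274rich')

# Same substitution as in the question, with the variable in place of
# the literal character.
result=$(printf '%s\n' "$sample" | sed "s/$u_umlaut/-/g")
printf '%s\n' "$result"   # prints: M-nchen and Z-rich
```

To edit a real file, the same command would read the file instead of the sample string, for example `sed "s/$u_umlaut/-/g" yourfile`, where `yourfile` is a placeholder name.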

Note that on most UNIX and Linux systems there won't be a locale that uses Unicode as the underlying codeset, but there are probably several that use UTF-8 (which is a multi-byte codeset that can encode any Unicode character).
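
To see which codeset the current locale uses and which UTF-8 locales are installed, something like the following can help. This is only a sketch: the helper name `filter_utf8` is made up, and the spelling of locale names (`UTF-8` versus `utf8`) differs between systems, so the pattern accepts both.

```shell
# Keep only locale names that mention UTF-8; installed names are spelled
# either "UTF-8" or "utf8" depending on the system.
filter_utf8() {
    grep -i 'utf-\{0,1\}8'
}

# The locale utility may be missing on minimal systems, so guard the calls.
if command -v locale >/dev/null 2>&1; then
    # Codeset of the current locale, as reported by the system.
    echo "current codeset: $(locale charmap)"
    echo "installed UTF-8 locales:"
    locale -a | filter_utf8 || echo "(none found)"
fi
```

The output depends entirely on what is installed, so none is shown here.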

Note that whether ü is a single-byte character (as it is in some EBCDIC code page variants and some ISO 8859-* codesets) or a multi-byte character (as it is in UTF-8) shouldn't matter to sed. The sed utility operates on characters, not bytes. You just need to be sure that the locale you're using when running sed is using a codeset with the same encoding for ü as the encoding used in the file you're editing.
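
When the file's encoding does not match the locale, one way to line them up is to convert the data before sed sees it. The sketch below assumes, purely for illustration, input encoded in ISO 8859-1 (where ü is the single byte 0xFC) and a sed pattern written in UTF-8; the sample text is made up.

```shell
# UTF-8 encoding of ü, used in the sed pattern.
u_umlaut=$(printf '\303\274')

# Inspect the raw bytes of the hypothetical ISO 8859-1 input;
# here ü is the single byte fc.
printf 'f\374r\n' | od -An -tx1

# Convert the data to UTF-8 first so that its encoding of ü matches
# the pattern, then substitute.
result=$(printf 'f\374r\n' | iconv -f ISO-8859-1 -t UTF-8 | sed "s/$u_umlaut/-/g")
printf '%s\n' "$result"   # prints: f-r
```

Without the iconv step the UTF-8 pattern would not match, because the input never contains the byte sequence 0xC3 0xBC.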

Russian(5)                    File Formats Manual                    Russian(5)

NAME
Russian, russian - Introduction to Russian language support

DESCRIPTION
This reference page describes the coded character set (codeset), locale, device, and other kinds of support for the Russian language.

Codesets

The operating system supports the following codesets for Russian by means of locales, codeset converters, or both:

ISO 8859-5 (ISO Latin/Cyrillic)
    ISO8859-5 is the string that represents this codeset in the names of locales and codeset converters. See iso8859-5(5) for more information.

UCS-2, UCS-4, and UTF-8
    UCS-2, UCS-4, ucs4, and UTF-8 are the strings that represent these encoding formats in the names of locales and codeset converters. See Unicode(5) for more information.

PC code pages
    cp855, cp866, and cp1251 are the strings that represent these encoding formats in the names of codeset converters. See code_page(5) for more information.

See the i18n_intro(5) and l10n_intro(5) reference pages for introductory information on codesets. The iconv_intro(5) reference page discusses codeset converters and how to use them.

Locales

The operating system provides the following Russian locale:

ru_RU.ISO8859-5, for Russia

The ru_RU.ISO8859-5@ucs4 variant of this locale exists for applications that need to convert file data in ISO8859-5 format to UCS-4 process code for certain character-classification operations.

You can use the locale command (see locale(1)) to find out which locales are installed on your system. See i18n_intro(5) for information on setting locale from the operating system command line. For the Common Desktop Environment (CDE), you set locale by setting the session language. To do this, use the Language menu accessed from the Options button of the Login window.

Keyboards

The operating system supports the following VT style and PC style keyboards with Russian characters printed on the keys:

----------------------------------------------
VT Style (105/108 keys)    PC Style (102 keys)
----------------------------------------------
LK401-BT                   LK471-BT
LK411-BT                   LK47W-BT
LK461-BT                   LK97W-BT
LK46W-BT                   PCXAL-HT
                           PCXAL-LT
                           PCXAL-PT
                           PCXAL-QT
                           PCXAL-ST
----------------------------------------------

For your keyboard to function correctly with your system, you must load a keyboard mapping table (keymap) that is appropriate for your keyboard's model and language. If you load a keymap that does not correspond to your keyboard's model and language, your keyboard behavior is unpredictable. The label located on the bottom surface of a keyboard usually specifies its model (five letter code) and language (two letter code). See the keyboard(5) reference page for general information on keymaps and instructions for loading them in different formats. The following tables supply Russian-specific information that you need when loading keymaps.

Selecting keymaps in xkb format:

---------------------------------------------------------
For VT Style               For PC Style
Keyboard:    Select:       Keyboard:    Select:
---------------------------------------------------------
LK401-BT     lk401         LK471-BT     lk471bt or lk471
LK411-BT     lk411         LK47W-BT     lk471bt or lk471
LK461-BT     lk461         LK97W-BT     lk97wbt or lk97w
LK46W-BT     lk461         PCXAL-HT     pcxalht
                           PCXAL-LT     pcxallt
                           PCXAL-PT     pcxalpt
                           PCXAL-QT     pcxalqt
                           PCXAL-ST     pcxalst
---------------------------------------------------------

Selecting keymaps in xmodmap format:

----------------------------------------------------------------
For VT Style                     For PC Style
Keyboard:    Select:             Keyboard:    Select:
----------------------------------------------------------------
LK401-BT     russian lk401bt     LK471-BT     russian pcxalht
LK411-BT     russian lk411bt     LK47W-BT     russian pcxalht
LK461-BT     russian lk411bt     PCXAL-HT     russian pcxalht
LK46W-BT     russian lk411bt     PCXAL-LT     russian pcxallt
                                 PCXAL-PT     russian pcxallt
                                 PCXAL-QT     russian pcxallt
                                 PCXAL-ST     russian pcxallt
----------------------------------------------------------------

Keyboards can have keys with characters printed on both the left and right half of the keycap. The way you set or use your keyboard to send different sets of characters varies from one keyboard model to another. Furthermore, your keyboard allows you to enter more characters than those printed on the keycaps. Refer to the keyboard(5) reference page for information on how to enter characters.

Printers

The PostScript fonts available for languages supported by the ISO 8859-5 codeset are listed on the iso8859-5(5) reference page. Refer to the i18n_printing(5) reference page for a discussion of printer support options.

SEE ALSO
Commands: locale(1)

Others: code_page(5), i18n_intro(5), i18n_printing(5), iconv_intro(5), iso8859-5(5), keyboard(5), l10n_intro(5), Unicode(5)

Writing Software for the International Market

Russian(5)