UNIX for Dummies Questions & Answers
Post 302262810 by shantanuo on Saturday 29th of November 2008 02:35:27 AM
remove special and unicode characters

Hi,
How do I remove the lines where special characters or Unicode characters appear?
The following command works, but I wonder if there is a better way.

grep -Ev '\)|#|,|&|-|\(|\\|/|\.' test.txt

The following warnings show that my pattern is incomplete: characters such as '*', ']', ';', '[' and '@' still get through.

Warning: The word "*Khan" is invalid. The character '*' (U+2A) may not appear at the beginning of a word. Skipping word.
Warning: The word "Khan]" is invalid. The character ']' (U+5D) may not appear at the end of a word. Skipping word.
Warning: The word "Khandewa;l" is invalid. The character ';' (U+3B) may not appear in the middle of a word. Skipping word.
Warning: The word "[khanna" is invalid. The character '[' (U+5B) may not appear at the beginning of a word. Skipping word.
Warning: The word "Khar**Closed" is invalid. The character '*' (U+2A) may not appear in the middle of a word. Skipping word.
Warning: The word "Khelani]" is invalid. The character ']' (U+5D) may not appear at the end of a word. Skipping word.
Warning: The word "Khwaja[physician]" is invalid. The character '[' (U+5B) may not appear in the middle of a word. Skipping word.
Warning: The word "Kids@play" is invalid. The character '@' (U+40) may not appear in the middle of a word. Skipping word.
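
One possible alternative to listing every unwanted character is to turn the test around and keep only the lines that consist entirely of allowed characters. The sketch below assumes that a valid word is made of ASCII letters only, which the post does not actually state, and it builds a small hypothetical test.txt because the original file is not shown. Widen the bracket expression (for example to '^[A-Za-z0-9 ]+$') if digits or spaces should be allowed.

```shell
#!/bin/sh
# Hypothetical sample data standing in for the poster's test.txt.
printf '%s\n' 'Khan' '*Khan' 'Khan]' 'Khandewa;l' '[khanna' \
    'Khar**Closed' 'Kids@play' 'Khanna' > test.txt
# One line with a non-ASCII character: "Zoë" written as UTF-8 bytes.
printf 'Zo\303\253\n' >> test.txt

# Whitelist instead of blacklist: keep a line only if every character
# on it is an ASCII letter. LC_ALL=C makes [A-Za-z] match plain bytes,
# so multi-byte (non-ASCII) characters never match and the line is dropped.
LC_ALL=C grep -E '^[A-Za-z]+$' test.txt > clean.txt

cat clean.txt
# prints:
# Khan
# Khanna
```

With this approach a new special character does not need a new entry in the pattern, since anything outside the allowed set rejects the whole line.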
 
