Grep to remove non-ASCII characters


 
# 1  
Old 11-09-2013

I have an encoding problem that I need to solve.
I have a 4-column tab-separated file, and I need to remove all of the lines that contain the string 'vis-à-vis':

Code:
achiever-n    vis-à-vis+ns-j+vp    oppose-v    1
achiever-n    vis-à-vis+ns-the+vg    assess-v    1
administrator-n    vis-à-vis+n-the+n    position-n    1
adobe-n    vis-à-vis+n-a-j+n-a-j    ad-n    1

That is, if my file has 4 lines that contain 'vis-à-vis', all 4 of them should be filtered out.
How can I do this with a one-line grep?

---------- Post updated at 01:18 PM ---------- Previous update was at 01:09 PM ----------

Alternatively, I need something that removes all non-ASCII characters, or that does the opposite of this grep:

Code:
grep --color='auto' -P -n '[^\x00-\x7F]' file

I have tried:
Code:
grep --color='auto' -P -n '![^\x00-\x7F]' file

but with no success.

Last edited by owwow14; 11-09-2013 at 02:23 PM..
# 2  
Old 11-09-2013
Try:

Code:
grep --color='auto' -P -v -n '[^\x00-\x7F]' file
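If only the 'vis-à-vis' lines have to go, a fixed-string match with `-v` may already be enough: `-v` is what inverts the match, whereas a leading `!` inside the pattern is just a literal character. A minimal sketch, assuming the pattern and the file use the same encoding; the sample lines and the `sample` variable are made up for illustration:

```shell
# Hypothetical sample: two lines with the non-ASCII string, one plain ASCII line.
sample=$(mktemp)
printf 'achiever-n\tvis-à-vis+ns-j+vp\toppose-v\t1\n'   > "$sample"
printf 'adobe-n\twith+n-a-j\tad-n\t1\n'                 >> "$sample"
printf 'adobe-n\tvis-à-vis+n-a-j+n-a-j\tad-n\t1\n'      >> "$sample"

# -F treats the pattern as a fixed string, -v keeps only non-matching lines.
# LC_ALL=C makes grep compare raw bytes instead of decoding characters.
kept=$(LC_ALL=C grep -vF 'vis-à-vis' "$sample")
printf '%s\n' "$kept"   # prints only the "with+n-a-j" line

rm -f "$sample"
```

Because the comparison is byte-wise, this only works when the à in the pattern is stored with the same bytes as in the file (for example, both UTF-8).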

# 3  
Old 11-09-2013
I have tried that option too, and it doesn't seem to remove anything; it just prints the entire input file again.
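One way to narrow this down (a sketch, not a diagnosis of this particular file) is to count the lines that contain a byte outside ASCII. Running grep under `LC_ALL=C` makes it work byte-wise, so the count does not depend on how the locale decodes the file. The sample lines below are hypothetical:

```shell
# Hypothetical sample: two lines with a non-ASCII character, one without.
sample=$(mktemp)
printf 'achiever-n\tvis-à-vis+ns-j+vp\toppose-v\t1\n'   > "$sample"
printf 'adobe-n\twith+n-a-j\tad-n\t1\n'                 >> "$sample"
printf 'adobe-n\tvis-à-vis+n-a-j+n-a-j\tad-n\t1\n'      >> "$sample"

# -c prints the number of matching lines instead of the lines themselves.
# The bracket expression matches any byte that is neither printable ASCII nor whitespace.
non_ascii_lines=$(LC_ALL=C grep -c '[^[:print:][:space:]]' "$sample")
total_lines=$(grep -c '' "$sample")   # the empty pattern matches every line
echo "$non_ascii_lines of $total_lines lines contain non-ASCII bytes"

rm -f "$sample"
```

If the count is 0, the file holds no bytes outside ASCII and `-v` has nothing to remove. If it is greater than 0 but the `-P` command still prints every line, a mismatch between the file's encoding and the current locale is one thing worth checking.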
# 4  
Old 11-09-2013
Code:
# Drop every line that has a byte outside printable ASCII and whitespace
LC_ALL=C grep -v '[^[:print:][:space:]]' file
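For the other variant asked about in the thread, deleting the non-ASCII characters themselves instead of whole lines, `tr` is one option. A sketch, assuming byte-wise deletion is acceptable: every byte of a multibyte character is dropped and nothing is transliterated. The sample line is hypothetical:

```shell
# Hypothetical one-line sample containing a non-ASCII character.
sample=$(mktemp)
printf 'achiever-n\tvis-à-vis+ns-j+vp\toppose-v\t1\n' > "$sample"

# Octal 200-377 covers every byte with the high bit set,
# i.e. everything outside 7-bit ASCII; tabs and newlines are untouched.
cleaned=$(LC_ALL=C tr -d '\200-\377' < "$sample")
printf '%s\n' "$cleaned"   # the second column now starts with vis--vis

rm -f "$sample"
```

Note that this keeps the line and only shortens it, so it is not a substitute for `grep -v` when the lines themselves must be removed.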

# 5  
Old 11-11-2013
Thank you for the responses. They both seem to work just fine.

Last edited by owwow14; 11-11-2013 at 05:07 AM..