How to grep � symbol?


 
# 1  
Old 03-28-2019

Hello,
I have multiple text files and I need to find out which of them have character-encoding issues.
The command below is not working. Maybe I should use its ASCII code instead of that weird character?

Code:
grep -A0 "�" file.txt

Thank you
Boris
# 2  
Old 03-28-2019
That character, or a similar one, is a placeholder for any non-printing character. Where and how did you find it? What are "character issues"? It could also stand for multi-byte characters. Please post a hexdump of your data file.
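A dump along these lines would show which bytes are really in the file. This is only a sketch: the file name sample.txt and its contents are made up for illustration, and od is used because it is part of POSIX and available almost everywhere.

```shell
# Hypothetical sample: "caf" followed by the UTF-8 bytes of U+FFFD (EF BF BD).
printf 'caf\357\277\275\n' > sample.txt

# Print every byte in hexadecimal, without the offset column.
od -An -tx1 sample.txt
```

If the bytes ef bf bd show up in the dump, the file really contains U+FFFD encoded as UTF-8; any other bytes above 7f point to a different character or encoding.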
# 3  
Old 03-28-2019
This will work on Linux systems:
-P uses PCRE (Perl-compatible regular expressions), -n shows the line number, and --color highlights the problem(s). The pattern finds characters with codes greater than 127, so it will not work on UTF-8 text, for example, where every legitimate multi-byte character matches as well.

Code:
grep --color='auto' -P -n "[^\x00-\x7F]"  myfile.txt

It always helps to state your OS and shell; this will not work on HP-UX, for example. I guessed Linux because you used -A.

Edit: Rudi beat me to it.
This User Gave Thanks to jim mcnamara For This Post:
# 4  
Old 03-28-2019
Hello Rudic and Jim,
It is a subrip file and Jim's answer is very helpful for my case.
Marked as solved.

Thank you!
Boris
# 5  
Old 03-28-2019
Be aware that the above will also match / identify / eliminate locale-specific characters, e.g. äöüÄÖÜß in German.
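If only the replacement character U+FFFD is of interest, rather than every non-ASCII character, one option is to search for its UTF-8 byte sequence (EF BF BD) as a fixed string. A minimal sketch, assuming the files are UTF-8 encoded; the file names are invented for the demonstration:

```shell
# Two made-up test files: one with German characters only, one with U+FFFD.
printf 'Gr\303\274\303\237e\n' > umlauts.txt          # "Grüße", valid UTF-8
printf 'Gr\357\277\275\357\277\275e\n' > broken.txt   # same word with U+FFFD twice

# UTF-8 encoding of U+FFFD, built with POSIX octal escapes.
fffd=$(printf '\357\277\275')

# LC_ALL=C makes grep compare raw bytes, -F treats the pattern as a fixed
# string, and -l prints only the names of the files that contain it.
LC_ALL=C grep -lF "$fffd" umlauts.txt broken.txt
```

This lists broken.txt only; the umlauts in umlauts.txt are left alone because their byte sequences differ from EF BF BD.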
# 6  
Old 04-06-2019
Hello,
I am back again with the same question.
I am able to detect whether U+FFFD occurs in a file, but I do not know which files have this issue.

I run:
Code:
printf '%b' "$(printf '\\U%x' {128..131})" | grep -oP "[^\x00-\x7F]"

output:
Code:
�
�
�
�

How can I find the affected files?
PS:
Code:
printf '%b' "$(printf '\\U%x' {128..131})" | grep -HoP "[^\x00-\x7F]"

gives the output below:
Code:
(standard input):�
(standard input):�
(standard input):�
(standard input):�


Code:
printf '%b' "$(printf '\\U%x' {128..131})" | grep -loP "[^\x00-\x7F]"

gives only one line output:
Code:
(standard input):�

thank you
Boris

Last edited by baris35; 04-06-2019 at 10:59 AM.. Reason: extra tests done
# 7  
Old 04-06-2019
I'm not quite sure I understand what failed. grep's -H option prefixes every match with its file name, while -l prints each matching file name just once, which would satisfy your request: identify all files containing non-ASCII characters.
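To make the difference concrete, here is a small sketch run against real files instead of standard input. The scratch directory and the three .srt files are made up for the example, and it assumes a GNU grep built with -P support, as used earlier in this thread.

```shell
# Scratch directory with three made-up files; only two contain non-ASCII text.
dir=$(mktemp -d)
printf 'plain ASCII\n'               > "$dir/a.srt"
printf 'caf\303\251\ncr\303\250me\n' > "$dir/b.srt"   # "café", "crème": two matching lines
printf 'na\303\257ve\n'              > "$dir/c.srt"   # "naïve": one matching line
cd "$dir"

# -H (with -n): one output line per matching line, prefixed with the file name.
grep -HnP '[^\x00-\x7F]' *.srt

# -l: the name of each matching file exactly once, nothing else.
grep -lP '[^\x00-\x7F]' *.srt
```

The first command reports b.srt twice and c.srt once; the second prints just b.srt and c.srt, which is the list of files to look at.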