ksh check for non printable characters in a string


 
# 8  
Old 02-12-2015
Hi Don,

I think there is some problem with this syntax.

test.ksh
Code:
if [ "${TEXT#*[![:alnum:] .,;:'"/\()_+=~@&*-]}" = "$TEXT" ];then echo 'no non-printables found';else echo 'non-printable found';fi

Code:
 
+ TEXT='This is a sample text with supposedly non-printable character^Y.'
./test.ksh: line 5: syntax error at line 47: `'' unmatched

Then I introduced a \ before '

Code:
if [ "${TEXT#*[![:alnum:] .,;:\'"/\()_+=~@&*-]}" = "$TEXT" ];then echo 'no non-printables found';else echo 'non-printable found';fi

then the error was -
Code:
+ TEXT='This is a sample text with supposedly non-printable character^Y.'
./test.ksh: line 5: syntax error at line 47: `{' unmatched

Then I introduced a \ before "
Code:
if [ "${TEXT#*[![:alnum:] .,;:\'\"/\()_+=~@&*-]}" = "$TEXT" ];then echo 'no non-printables found';else echo 'non-printable found';fi

error was -
Code:
 
+ TEXT='This is a sample text with supposedly non-printable character^Y.'
./test.ksh: line 5: syntax error at line 47: `)' unexpected

Then finally I introduced \ before all the shell special characters -
Code:
 
echo ${TEXT#*[![:alnum:] .,;:\'\"/\(\)\_+=~@&\*-]}

which resulted in
Code:
Y.

But I think that's wrong: shouldn't it have resulted in ^, as that's the only punctuation mark not included in the list?!

-dips
# 9  
Old 02-12-2015
I apologize for not trying this out before I posted it.

In a BRE or an ERE, special RE characters lose their special meaning inside a bracket expression, but that is not true in a shell pattern matching expression. I'm glad you were able to figure out what was needed to make it work for you. Even here, the underscore and the asterisk do not need to be escaped.

The output from:
Code:
echo ${TEXT#*[![:alnum:] .,;:\'\"/\(\)_+=~@&*-]}

is correct. The * matched (and discarded):
Code:
This is a sample text with supposedly non-printable character

and the:
Code:
[![:alnum:] .,;:\'\"/\(\)\_+=~@&\*-]

matched and discarded the ^, leaving just
Code:
Y.

in that expansion of $TEXT. The whole point of that expansion is to find the first character that is not in the set listed in the non-matching bracket expression (that is, the first character you are declaring to be "non-printable") and remove it along with everything before it. The if statement then compares the original string with the result of that expansion; the two compare equal if and only if there are no non-printable characters in the string.
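To make the two halves of that match visible, here is a minimal sketch using the sample string from this thread. The variable names rest and removed are only illustrative, and the expansion is done in an unquoted assignment because the double-quoted form is what triggered the syntax errors shown earlier.

```shell
# Sample string from the thread; the ^ is a literal circumflex.
TEXT='This is a sample text with supposedly non-printable character^Y.'

# Unquoted assignment: no word splitting happens on an assignment, and the
# quotes and parentheses inside the bracket expression are backslash-escaped.
rest=${TEXT#*[![:alnum:] .,;:\'\"/\(\)_+=~@&*-]}

# What the pattern consumed: the original string minus the surviving suffix.
removed=${TEXT%"$rest"}

printf 'removed: %s\n' "$removed"
printf 'left:    %s\n' "$rest"
```

With that sample string, removed ends in the ^ and rest holds Y., which matches the output reported above.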
# 10  
Old 02-13-2015
Thank you so much, Don, for explaining in detail! I still have one more doubt, though (please bear with me!)

But Y is a letter, so wouldn't [:alnum:] match it? And isn't a dot (.) already present in the list of allowed punctuation characters?

-dips
# 11  
Old 02-13-2015
The first character in the bracket expression ([![:alnum:] .,;:\'\"/\(\)_+=~@&*-]) is ! so this is a NON-matching bracket expression. This bracket expression matches any single character that is NOT alphanumeric, NOT a <space>, NOT a <period>, NOT a <comma>, NOT a <semicolon>, NOT a <colon>, NOT a <single-quote>, NOT a <double-quote>, NOT a <slash>, NOT an <open-parenthesis>, NOT a <closing-parenthesis>, NOT an <underscore>, NOT a <plus_sign>, NOT an <equal-sign>, NOT a <tilde>, NOT an <at-sign>, NOT an <ampersand>, NOT an <asterisk>, and NOT a <hyphen-dash> (in this case it matches the circumflex). So, if there is a string of characters starting with any zero or more characters followed by one character in that non-matching expression, the ${var#expression} will expand to the contents of the variable var with the string up to and including the first character that matches the non-matching expression removed.

If there aren't any characters in the variable that match the non-matching expression, there is no match for the entire expression; so the variable is expanded without removing anything. And, if ${var#expression} expands to the same thing as $var, we know that no character was found in $var that you consider to be non-printable.
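The comparison described above can be wrapped up as a small function. This is only a sketch: the name has_unlisted_char and the variable stripped are made up for illustration, and the allowed set is the one used in this thread, so adjust it to whatever you consider printable.

```shell
# Hypothetical helper: succeeds (status 0) when "$1" contains at least one
# character outside the allowed set from the thread.
has_unlisted_char() {
    stripped=${1#*[![:alnum:] .,;:\'\"/\(\)_+=~@&*-]}
    # No match means nothing was removed, so the two strings are identical.
    [ "$stripped" != "$1" ]
}

for sample in 'Plain text, nothing odd (really).' 'character^Y.'; do
    if has_unlisted_char "$sample"; then
        printf 'non-printable found: %s\n' "$sample"
    else
        printf 'no non-printables found: %s\n' "$sample"
    fi
done
```

The first sample contains only listed characters, so nothing is removed and the strings compare equal; the second contains a ^, so the expansion differs from the original.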