ksh pattern matching

I am trying to use the pattern substitution operators as documented in the O'Reilly book "Learning the Korn Shell", but they don't seem to work as advertised.

This works all right:
var='Regular expressions rules!'
$ echo ${var//e/#}
R#gular #xpr#ssions rul#s!

The docs say that !(expr) matches anything that doesn't match expr, but if I try to replace everything except the "e" character, it does not seem to work:
var='Regular expressions rules!'
$ echo ${var//!(e)/#}

Any idea?

The newer shells, ksh and bash, have a lot of syntactical elements that are easily confused with one another.

The "Pattern Substitution Operators" syntax:

can have a number of substitution operations with #, %, etc. They use the meta-characters, *, [], and ? -- page 123 ff, Learning the Korn Shell, 2nd Edition ("LTKS").

The "Patterns and Regular Expression" syntax uses:
*(exp), ?(exp), !(exp) ...

which correspond to the usual syntax we find in grep, etc:
grep "e*" ...

These patterns could be used within double brackets, for example:
if [[ $var == *!(e)* ]]

but not with string operator syntax (as far as I know) -- page 113 ff, 144 ff.
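As a minimal sketch of the double-bracket usage, assuming bash with extglob turned on (ksh93 should accept the same patterns without the shopt line). The function names has_no_e and is_known_name are made up for illustration:

```shell
#!/bin/bash
# Assumption: bash; extglob enables the ?(..) *(..) +(..) @(..) !(..) forms.
shopt -s extglob

# Hypothetical helper: succeeds when the argument contains no "e" at all.
# !(*e*) matches any string that does NOT match *e*.
has_no_e() {
    [[ $1 == !(*e*) ]]
}

# Hypothetical helper: succeeds when the argument is exactly jo or dave.
is_known_name() {
    [[ $1 == @(jo|dave) ]]
}

for word in sky tree jo mike; do
    has_no_e "$word"      && echo "$word: contains no e"
    is_known_name "$word" && echo "$word: known name"
done
```

Note that a test such as *!(e)* succeeds for almost any string, because !(e) is also allowed to match nothing, so negating the whole pattern as above is usually closer to the intent.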

The ksh I use (pdksh, even on Solaris) notes a bad substitution for what I think is the right thing, but bash does it correctly in my opinion. Here's an example:
#!/bin/bash -
#!/bin/ksh -

# @(#) s1       Demonstrate string operators.

echo "(Versions displayed with local utility \"version\")"
version >/dev/null 2>&1 && version =o $(_eat $0 $1)

var='String operators rule!'
echo " Replace e with _:"
echo ${var//e/_}

echo " Replace everything except e with _:"
echo ${var//[^e]/_}

exit 0

% ./s1
(Versions displayed with local utility "version")
Linux 2.6.11-x1
GNU bash 2.05b.0

 Replace e with _:
String op_rators rul_!

 Replace everything except e with _:

Perhaps someone will stop by with a better explanation or a better suggestion ... cheers, drl
Originally Posted by drl
The "Patterns and Regular Expression" syntax uses:
*(exp), ?(exp), !(exp) ...

(...) These patterns could be used within double brackets, for example (...) but not with string operator syntax (as far as I know) -- page 113 ff, 144 ff.
They do partially work in my ksh version (1993-12-28 r):

var='jo mike and dave are good friends'

$ echo ${var//a?(re)/_}
# returns > jo mike _nd d_ve _ good friends

$ echo ${var//g*(o)/_}
#returns > jo mike and dave are _d friends

$ echo ${var//+(o)/_}
#returns > j_ mike and dave are g_d friends

$ echo ${var//@(jo|dave)/_}
#returns > _ mike and _ are good friends

All of these return what I expected, but I am trying to use !(exp) like the PCRE look-behind assertion (?<=exp). Still trying...
It matters which version of ksh you are using. ksh93 has the // syntax while ksh88 does not. I am not sure about pdksh. On Solaris, dtksh is a souped-up version of ksh93. With dtksh...
$ /usr/dt/bin/dtksh
$ set -o emacs
$ x=hello
$ echo ${x//l/X} ${x//[!l]/X}
heXXo XXllX
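Since support depends on the shell, a script can probe for the // operator before relying on it. This is only a sketch: has_pattern_subst is a hypothetical helper name, and the probe assumes that an unsupported substitution makes the subshell fail rather than print something else.

```shell
# Probe whether the running shell understands ${var//pattern/replacement}.
# The eval runs inside a subshell, so a "bad substitution" error cannot
# abort the calling script; stderr is discarded to keep the probe quiet.
has_pattern_subst() {
    ( eval 'x=hello; [ "${x//l/X}" = heXXo ]' ) 2>/dev/null
}

if has_pattern_subst; then
    support=yes
else
    support=no
fi
echo "pattern substitution supported: $support"
```

When the probe reports no, falling back to an external tool such as sed is probably the safest route.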

I assume you are talking about section 4.3 (String Operators) of LTKS


echo ${.sh.version}
var='A regular expressions test'

echo "1>  //e/#"
echo ${var//e/#}
echo "2>  //[^e]/#"
echo ${var//[^e]/#}
echo "3>  //+(e)/#"
echo ${var//+(e)/#}
echo "4>  //-(e)/#"
echo ${var//-(e)/#}
echo "5>  //?(e)/#"
echo ${var//?(e)/#}
echo "6>  //*(e)/#"
echo ${var//*(e)/#}
echo "7>  //!(e)/#"
echo ${var//!(e)/#}

Gives the following output

Version M 1993-12-28 s+
1>  //e/#
A r#gular #xpr#ssions t#st
2>  //[^e]/#
3>  //+(e)/#
A r#gular #xpr#ssions t#st
4>  //-(e)/#
A regular expressions test
5>  //?(e)/#
6>  //*(e)/#
7>  //!(e)/#

Interesting! I am not sure what is going on.
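One possible reading of case 7, offered as a guess rather than a definitive answer: !(e) does not mean "any single character other than e". It means "any string that is not exactly e", and the substitution takes the longest match, so the whole value is swallowed in one go. The sketch below assumes bash with extglob; ksh93 may behave the same way, but I have only reasoned this through for bash.

```shell
#!/bin/bash
# Assumption: bash with extglob enabled.
shopt -s extglob

var='A regular expressions test'

# !(e) matches any string other than the one-character string "e".
# The whole value qualifies, so it is replaced once, as a unit.
whole=${var//!(e)/#}

# [!e] is a bracket expression: exactly one character that is not "e".
# This is the per-character replacement the original question wanted.
single=${var//[!e]/#}

echo "with !(e): $whole"
echo "with [!e]: $single"
```

So for "replace everything except e", the bracket form [!e] (or [^e] where the shell accepts it) looks like the right tool.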

That's quite an array of very different results. If I were aiming for portability (which I usually am), I'd probably use the old standby sed:
#!/usr/bin/ksh -
#!/usr/xpg4/bin/sh -
#!/bin/ksh -
#!/bin/bash -

# @(#) s2       Demonstrate pattern matching in dtksh and sed.

echo "(Versions displayed with local utility \"version\")"
version >/dev/null 2>&1 && version =o $(_eat $0 $1)

var='Regular expressions rules!'
echo " Replace e with _:"
echo ${var//e/_}

echo " Replace everything except e with _:"
echo re ${var//!(e)/_}
echo fe ${var//[!e]/_}

echo " Replace everything except e with _ using sed:"
echo "$var" | sed -e 's|[^e]|_|g'

exit 0

$ ./s2
(Versions displayed with local utility "version")
SunOS 5.10
dtksh M-12/28/93d

 Replace e with _:
R_gular _xpr_ssions rul_s!

 Replace everything except e with _:
re _
fe _e______e___e__________e__

 Replace everything except e with _ using sed:

The re and fe labels above stand for the regular-expression style pattern and the filename-expression style pattern. They are there to show the different syntax, and how one does what was asked while the other does not.
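For this particular job (single characters only), tr is another portable option alongside sed. A small sketch, assuming a POSIX tr; the variable name masked is only for illustration:

```shell
var='Regular expressions rules!'

# -c complements the set, so every character that is not "e" becomes "_".
# printf '%s' is used instead of echo so that no trailing newline is fed
# to tr, which would otherwise be turned into "_" as well.
masked=$(printf '%s' "$var" | tr -c 'e' '_')

echo "$masked"
```

Unlike the shell operators, this does not depend on which ksh or bash is installed, at the cost of starting an extra process.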

I got something out of this, namely dtksh, thanks to Perderabo. It's a bit tricky to find. I think it also carries a very large load of graphical baggage -- sort of like Tk (tcl/Tk). I dug through my very old pile of books, finding this:
Title: Desktop KornShell Graphical Programming
Author: Pendergrast, Jr., J Stephen
Edition: 1st
Date: 1995
Publisher: Addison-Wesley Pub (Sd)
ISBN: 0-201-63375-2
Pages: 880
Categories: ksh, Korn Shell, scripting, unix, shell, programming, graphics
Comments: 4 stars ( Amazon, 4 reviews )
and at almost 1,000 pages, you can tell there is a lot. The size on Solaris X86 is 620144, even larger than bash.

You pays your money and you takes your chances (quoting either the cartoon character Popeye or one of my previous bosses) ... cheers, drl
