

Match string against character class in bash


 
# 1  
Old 03-07-2014

Hello,
I want to check whether a string contains only numeric characters. The following code doesn't work for me:
Code:
#!/usr/local/bin/bash
if [[ $1 =~ [:digit:] ]]; then
echo "true"
else
echo "False"
fi

Code:
[root@freegtw /data/termit]# ./yyy '346'
False
[root@freegtw /data/termit]# ./yyy 'aaa'
False

I'm looking for a solution that uses character classes rather than an explicit range such as [0-9]. Thanks in advance.
# 2  
Old 03-07-2014
Code:
REGEX="^[[:digit:]]*$"
if [[ $1 =~ $REGEX ]]
then
    echo yes
else
    echo no
fi

OR for portability:
Code:
if echo $1 | grep -q "^[0-9]*$"
then
    echo yes
else
    echo no
fi
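
For context on why the script in the first post prints False for both inputs: outside an enclosing bracket expression, [:digit:] is not a character class. The regex engine reads it as an ordinary bracket expression that matches any one of the characters ':', 'd', 'i', 'g' and 't'. A minimal sketch of that behaviour, assuming bash and a function name (check) that is purely illustrative:

```shell
#!/usr/bin/env bash
# check is a hypothetical helper that wraps the test from the first post.
check() {
    if [[ $1 =~ [:digit:] ]]; then
        echo "true"
    else
        echo "False"
    fi
}

check '346'     # contains none of : d i g t
check 'aaa'     # contains none of : d i g t
check 'digit'   # contains letters from the set, so the test succeeds
```

Wrapping the class in a second pair of brackets, as in the REGEX variable above, is what turns it into a real character class.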

# 3  
Old 03-07-2014
Code:
grep -q "^[0-9][0-9]*$"

Code:
 awk -v var="${1}" 'BEGIN {if(var * 1 == var) {print "numeric"} else {print "non-numeric"}}'

---------- Post updated at 06:41 AM ---------- Previous update was at 06:14 AM ----------

Quote:
Originally Posted by balajesuri
Code:
REGEX="^[[:digit:]]*$"
if [[ $1 =~ $REGEX ]]
then
    echo yes
else
    echo no
fi

OR for portability:
Code:
if echo $1 | grep -q "^[0-9]*$"
then
    echo yes
else
    echo no
fi

Hi Balaji, I guess, this will fail if ${1}=''

Last edited by SriniShoo; 03-07-2014 at 06:39 AM.. Reason: alternative
# 4  
Old 03-07-2014
Quote:
Originally Posted by balajesuri
Code:
REGEX="^[[:digit:]]*$"
if [[ $1 =~ $REGEX ]]
then
    echo yes
else
    echo no
fi

Thanks a lot. It works for me
# 5  
Old 03-07-2014
Quote:
Originally Posted by SriniShoo
Hi Balaji, I guess, this will fail if ${1}=''
True, there are no validation checks. It was just meant for the OP to get started :-)
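
The empty-string case comes from the * quantifier, which also matches zero digits. One possible way to close that gap, sketched here under the assumption that an empty argument should count as non-numeric, is to require at least one digit with +. The function name is_all_digits is hypothetical:

```shell
#!/usr/bin/env bash
# is_all_digits is a hypothetical name; + requires at least one digit.
is_all_digits() {
    local re='^[[:digit:]]+$'
    [[ $1 =~ $re ]]
}

for s in '346' 'aaa' ''; do
    if is_all_digits "$s"; then
        echo "'$s': yes"
    else
        echo "'$s': no"
    fi
done
```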
# 6  
Old 03-07-2014
The simplest solution is to check for the presence of a character which does not match the class in question. In other words, negate the class: [^[:digit:]].
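
A rough sketch of the negated-class idea using a plain shell pattern instead of a regex. Note that in shell patterns the negation is written [!...] rather than [^...]. The function name has_only_digits is hypothetical, and rejecting the empty string is an assumption about what the caller wants:

```shell
# has_only_digits is a hypothetical name. It fails if the argument is
# empty or contains any character outside the digit class.
has_only_digits() {
    case $1 in
        '' | *[![:digit:]]*) return 1 ;;
        *) return 0 ;;
    esac
}

if has_only_digits "346"; then
    echo "true"
else
    echo "False"
fi
```

Because it uses only case, this form needs neither bash's [[ ]] nor an external command.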

Regards,
Alister

---------- Post updated at 05:24 PM ---------- Previous update was at 11:56 AM ----------

Quote:
Originally Posted by SriniShoo
Code:
grep -q "^[0-9][0-9]*$"

The range expression [0-9] is only defined in the C/POSIX locale. If the solution only needs to function in that locale, it's still a good idea to set it explicitly in the command's environment, e.g. LC_COLLATE=C grep .... Alternatively, you can leave the locale unspecified and explicitly enumerate each digit for a cross-locale portable solution: [0123456789]. If the digits do not need to be so rigidly defined, then it's simplest to use the character class, [[:digit:]].
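
The two locale-safe options might look like the sketch below. The function names are hypothetical, the argument is assumed to be a single line, and LC_ALL=C is used in place of LC_COLLATE=C as an assumption on my part, since LC_ALL takes precedence when it happens to be set in the caller's environment:

```shell
# Hypothetical helpers; both assume the argument is a single line of text.

# Option 1: pin the locale so that the range [0-9] is well defined.
is_digits_c_locale() {
    printf '%s\n' "$1" | LC_ALL=C grep -q '^[0-9][0-9]*$'
}

# Option 2: leave the locale alone and enumerate the digits explicitly.
is_digits_enumerated() {
    printf '%s\n' "$1" | grep -q '^[0123456789][0123456789]*$'
}

if is_digits_c_locale "346" && is_digits_enumerated "346"; then
    echo yes
else
    echo no
fi
```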

Quote:
Originally Posted by SriniShoo
Code:
 awk -v var="${1}" 'BEGIN {if(var * 1 == var) {print "numeric"} else {print "non-numeric"}}'

This, in my opinion, is a terrible solution because it depends on a great deal of subtle behavior and because it mistakenly assumes that -v can assign arbitrary text. Even an expert AWK hacker probably cannot say with certainty how that will behave across implementations.

There are always some ambiguities in the standards and there are always some disparities between implementations. Your awk one-liner, unfortunately, resides in those grey areas.

One thing that the standard is clear on is that the right side of a command line assignment, the value in name=value, is parsed as a string token.

POSIX states that a -v option argument, name=value in -v name=value, must take the form of an assignment operand, but says nothing about its behavior, aside from when it takes effect (before even a BEGIN section). It seems reasonable to assume that implementors will treat them as string tokens as well.

Parsing AWK string tokens involves escape sequence processing.

In short, there is no way to naively pass arbitrary text into awk using command line assignments (with or without the -v option).
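
A small illustration of the escape processing, assuming an awk that converts \t in a command line assignment to a tab character. The variable names are illustrative only. The same five-character string is measured once after passing through -v and once after being read from standard input:

```shell
s='123\t'   # five characters: 1, 2, 3, a backslash and the letter t

# Command line assignment: awk applies escape processing to the value.
len_via_v=$(awk -v var="$s" 'BEGIN { print length(var) }')

# Standard input: the record reaches awk without escape processing.
len_via_stdin=$(printf '%s\n' "$s" | awk '{ print length($0) }')

echo "-v: $len_via_v  stdin: $len_via_stdin"
```

On such an implementation the first length comes out one shorter than the second, because the backslash and the t have been merged into a single tab.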

For more details, refer to the OPTIONS and OPERANDS sections near the beginning of the POSIX AWK man page.

The following script feeds three strings to your awk code. None of those strings is numeric -- each one contains a backslash and a letter -- yet your code will return "numeric" in most cases.

In the following, original-awk is nawk.

isnumeric.sh:
Code:
for x in '123\f' '123\t' '123\n'; do
	printf '\nTesting %s ...\n' "$x"
	# Run the same test under each awk implementation
	for awk in gawk mawk original-awk; do
		printf '%s: ' "$awk"
		"$awk" -v var="$x" 'BEGIN {if (var * 1 == var) {print "numeric"} else {print "non-numeric"}}'
	done
done

Produces:
Code:
$ sh isnumeric.sh

Testing 123\f ...
gawk: numeric
mawk: numeric
original-awk: non-numeric

Testing 123\t ...
gawk: numeric
mawk: numeric
original-awk: numeric

Testing 123\n ...
gawk: numeric
mawk: non-numeric
original-awk: numeric

The above should make it clear that your awk suggestion cannot handle arbitrary text. Note that not only do the implementations disagree, but that they do so inconsistently.

The results are also locale dependent, because converting text to a number involves stripping leading/trailing blanks, and membership in the blank class is locale dependent.

In the C/POSIX locale, of \f, \t, and \n, only \t is a member of [[:blank:]]. The correct result should be: 123\f => non-numeric, 123\t => numeric, 123\n => non-numeric. In my testing, gawk was worst with 1 of 3 correct. mawk and nawk tied with 2 of 3 correct.

If you wanted to use AWK for this, I would recommend reading the text on stdin instead of from the command line. I would also recommend using a regular expression match operation instead of multiple implicit type conversions.
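
A minimal sketch of that recommendation, not a definitive implementation. The function name awk_is_digits is hypothetical, the digits are enumerated so the match does not depend on the locale, and treating empty or multi-line input as non-numeric is an assumption:

```shell
# awk_is_digits is a hypothetical name. The text reaches awk on stdin,
# so no escape processing is applied to it.
awk_is_digits() {
    printf '%s\n' "$1" | awk '
        # Flag any extra record, or a record that is not digits only.
        NR > 1 || $0 !~ /^[0123456789]+$/ { bad = 1 }
        END { exit (bad ? 1 : 0) }'
}

for s in '346' '123\t' 'aaa'; do
    if awk_is_digits "$s"; then
        echo "numeric"
    else
        echo "non-numeric"
    fi
done
```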

Unrelated tangent: For a reason that I cannot fathom, Ubuntu 12.04 LTS installs nawk as /usr/bin/original-awk while /usr/bin/nawk is left as a symlink to /usr/bin/gawk (via /etc/alternatives/nawk). Before installing gawk, nawk pointed to /usr/bin/mawk (again, via /etc/alternatives/nawk). If that's normal, I'm at a loss for words. I hope, for the sake of Ubuntu userland sanity, that this is just an aberration confined to this particular install.

Regards,
Alister