Using grep with hyphens


 
# 1  
Old 01-31-2013
This is on a RHEL 6 box with bash 4.1.2

I'm trying to use grep to find only those lines containing matches that form whole words.

The -w option works well unless the word contains a hyphen.

The problem is that I get a hit on "test-group", which is good, but I also get a hit on "test", which is bad because the group test doesn't exist. It appears that once grep reaches a hyphen, it treats the preceding text as a whole word.

Any ideas would be greatly appreciated.

Next time no groups with hyphens..

Thanks
# 2  
Old 01-31-2013
From grep manual page:
Code:
-w, --word-regexp
Select only those lines containing matches that form whole words.  The test is that the matching substring must either be at the beginning of
the line, or preceded by a non-word constituent character.  Similarly, it must be either at the end of the line or followed by a non-word
constituent character.  Word-constituent characters are letters, digits, and the underscore.
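
So the hyphen counts as a non-word character, which is why "test" still matches inside "test-group". A quick way to see this, using a made-up list of group names (the names and the `sample` helper are only for illustration, not from the original post):

```shell
# Made-up group names, one per line (illustrative only).
sample() { printf '%s\n' 'test-group' 'test_group' 'testing' 'test'; }

# -w accepts "test" when the neighbouring characters are not letters,
# digits, or underscores, so the hyphen in test-group does not block it.
sample | grep -w 'test'
# prints:
#   test-group
#   test
```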

You can use awk instead, comparing each whitespace-separated field against the exact word:
Code:
awk '{ for(i=1;i<=NF;i++) if($i=="test") print $i; }' filename
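
If you would rather stay with grep, one possible sketch is to spell out the word boundaries yourself and add the hyphen to the set of word characters. The helper name `whole_word` is hypothetical, and this assumes the word contains only letters, digits, underscores, and hyphens, since it is pasted into the regular expression as-is:

```shell
# whole_word WORD [FILE...]
# Print lines where WORD appears with no letter, digit, underscore,
# or hyphen directly before or after it.
whole_word() {
  w=$1; shift
  grep -E "(^|[^[:alnum:]_-])$w(\$|[^[:alnum:]_-])" "$@"
}

# Example with made-up input: only the second and third lines match.
printf '%s\n' 'test-group' 'test' 'add user to test now' 'group-test' |
  whole_word test
```

Unlike the field comparison in the awk one-liner, this also accepts the word when it is followed by punctuation such as a comma.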

# 3  
Old 01-31-2013
You could use tr to convert the hyphens to underscores (and vice versa?) in a pipe to grep, have grep print the line numbers (-n), and then fetch those lines from the original file.
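
A rough sketch of that idea, assuming a single input file; the function name `grep_w_hyphen` is made up for illustration. Hyphens become underscores before grep -w sees them, so "test-group" is searched as the single word "test_group", and the line numbers are then used to print the untouched lines:

```shell
# grep_w_hyphen WORD FILE
grep_w_hyphen() {
  # Translate the search word the same way as the data.
  word=$(printf '%s' "$1" | tr '-' '_')
  tr '-' '_' < "$2" |
    grep -nw -- "$word" |      # output looks like "LINENO:text"
    cut -d: -f1 |              # keep only the line number
    while read -r n; do
      sed -n "${n}p" "$2"      # print that line from the original file
    done
}
```

One side effect to be aware of: a search for "test-group" would also hit a line containing "test_group", because both look the same after the translation.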
# 4  
Old 02-02-2013
Hi.

A more complex awk solution:
Code:
#!/usr/bin/env bash

# @(#) s1	Demonstrate a "grep -wo" for hyphenated words.

# Utility functions: print-as-echo, print-line-with-visual-space, debug.
# export PATH="/usr/local/bin:/usr/bin:/bin"
pe() { for _i;do printf "%s" "$_i";done; printf "\n"; }
pl() { pe;pe "-----" ;pe "$*"; }
db() { ( printf " db, ";for _i;do printf "%s" "$_i";done;printf "\n" ) >&2 ; }
db() { : ; }
C=$HOME/bin/context && [ -f $C ] && $C gawk

FILE=${1-data1}

pl " Input data file $FILE:"
cat $FILE

pl " AWK code for grepw:"
p=./grepw
cat $p

pl " Results:"

pe
pe " Version of code:"
$p --version

pl " Print any word containing \"foo\" including hyphenated words:"
$p foo $FILE

pl " Matching hyphenated word foo-bar:"
$p foo-bar $FILE

pl " Matching toe-nail:"
$p toe-nail $FILE

pl " Matching toe-nail|finishing-nail (an OR):"
$p "toe-nail|finishing-nail" $FILE

exit 0

producing:
Code:
% ./s1

Environment: LC_ALL = C, LANG = C
(Versions displayed with local utility "version")
OS, ker|rel, machine: Linux, 2.6.26-2-amd64, x86_64
Distribution        : Debian GNU/Linux 5.0.8 (lenny) 
bash GNU bash 3.2.39
gawk GNU Awk 3.1.5

-----
 Input data file data1:
foo
bar
xyzzy
thud
foo-bar
The foo-bar came from a WWII term.
1234567890
You should toe-nail the studs, but don't use finishing-nails.

-----
 AWK code for grepw:
#!/usr/bin/env bash

# @(#) grepw	Search for words, grep-like, but also allow hyphens.

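# Single quotes keep the RCS keyword literal instead of expanding $Revision.
p=$( basename "$0" ) t1='$Revision: 1.10 $' v=${t1//[!0-9.]/}
[[ $# -gt 0 ]] && [[ "$1" =~ -version ]] &&  { echo "$p $v" ; exit 0 ; }

if [ $# -le 0 ]
then
  # The pe helper is only defined in the driver script, so use echo here.
  echo "$p: must supply pattern -- $0" >&2
  exit 1
fi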

PATTERN="$1"
shift

# Loop adapted from post at:
# http://stackoverflow.com/questions/1116193/
# awk-extract-multiple-groups-from-each-line

# ORs with "|" will work, but ANDs with ".*" will not work unless
# the string to match .* has no spaces in it.

gawk -v pattern="$PATTERN" '
	{ 
  from = 0
  i = 1
  pos = match( $0, /\<[-a-zA-Z0-9_]+\>/, val )
  while( 0 < pos )
  {
    # print val[0] in any form you desire.
    if ( val[0] ~ pattern ) printf(" Line %3d, token %2d, %s\n",
    NR, i, val[0] )
    from += pos + val[0, "length"]
    pos = match( substr( $0, from ), /\<[-a-zA-Z0-9_]+\>/, val )
    i++
  }

	}
' "$@"

exit 0

-----
 Results:

 Version of code:
grepw 1.10

-----
 Print any word containing "foo" including hyphenated words:
 Line   1, token  1, foo
 Line   5, token  1, foo-bar
 Line   6, token  2, foo-bar

-----
 Matching hyphenated word foo-bar:
 Line   5, token  1, foo-bar
 Line   6, token  2, foo-bar

-----
 Matching toe-nail:
 Line   8, token  3, toe-nail

-----
 Matching toe-nail|finishing-nail (an OR):
 Line   8, token  3, toe-nail
 Line   8, token 10, finishing-nails

See man pages for details.
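
Note that grepw uses the pattern as a regular expression matched anywhere inside each token, which is why foo also reports foo-bar and finishing-nail reports finishing-nails. If only exact tokens are wanted, a shorter sketch with plain grep is possible. The name `tokens_exact` is hypothetical, and a token is assumed to be runs of letters, digits, and underscores joined by single hyphens:

```shell
# tokens_exact PATTERN [FILE...]
# Print "LINENO:token" for every token that matches PATTERN exactly.
tokens_exact() {
  pat=$1; shift
  # First grep: one token per output line, prefixed with its line number.
  # Second grep: keep tokens that equal the pattern (an ERE, so "a|b" works).
  grep -onE '[[:alnum:]_]+(-[[:alnum:]_]+)*' "$@" | grep -E ":($pat)\$"
}

# Example with made-up input read from standard input.
printf '%s\n' 'foo' 'foo-bar' 'Toe-nail it, no finishing-nails.' |
  tokens_exact 'foo|finishing-nail'
# prints:
#   1:foo
```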

Best wishes ... cheers, drl