Help in backreferences


 
# 1  
Old 08-27-2014

Code:
# cat file
Hello world
Hello earth
"Hello" world hello world 'hello'
hello "world" 
hello 'world'
hello hello
hello hello hello

Target: match lines with more than one occurrence of hello [the code below works fine]

Code:
 # grep "\(hello\).*\1" file
"Hello" world hello world 'hello'
hello hello
hello hello hello

Target: match single- or double-quoted words [does not work]
Code:
 # grep "\(["']\).*\1" file
>

Could someone shed some light on this error?
# 2  
Old 08-27-2014
A back reference in grep can only be used as part of the match itself. So the first expression looks for a string matching hello.*hello.
The second expression checks whether a string between quotes is present on a line and then prints that whole line.

Extracting only the quoted words cannot be done with a single grep (it can be done with two consecutive "grep -o" commands):
Code:
grep -o "[\"'][^\"']*[\"']" file | grep -Eo "[^\"']+"

A back reference has no use here, since it cannot be used inside a [] bracket expression.


It can be done with a single awk:
Code:
awk -F"['\"]" '{for(i=2; i<=NF; i+=2) print $i}' file
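
To see why the loop starts at field 2 and steps by 2: with every quote character acting as a field separator, the even-numbered fields are the ones that sit between an opening and a closing quote. A minimal sketch, assuming one line taken from the sample file; the function name between_quotes is only illustrative:

```shell
# Print the text between quote characters (the even-numbered fields).
# between_quotes is a hypothetical helper name, not part of the thread.
between_quotes() {
    awk -F"['\"]" '{ for (i = 2; i <= NF; i += 2) print $i }'
}

# One line from the sample file: $2 is Hello, $4 is hello.
fields=$(printf '%s\n' "\"Hello\" world hello world 'hello'" | between_quotes)
printf '%s\n' "$fields"
```

Lines without any quote character have only one field, so nothing is printed for them.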

However this would get more complicated if we allow for single quotes inside double quotes or double quotes inside single quotes. Then you would need something like:

Code:
grep -Eo '"[^"]*"|'"'[^']*'" file | sed 's/^.//; s/.$//'
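
As a rough check of the alternation approach, here is a sketch run against a made-up input line (not from the original file) that has a single quote inside double quotes and double quotes inside single quotes. The function name extract_quoted is only illustrative:

```shell
# Match either a double-quoted or a single-quoted string, then strip
# the first and last character (the surrounding quotes) with sed.
# extract_quoted is a hypothetical helper name.
extract_quoted() {
    grep -Eo '"[^"]*"|'"'[^']*'" | sed 's/^.//; s/.$//'
}

# Made-up test line: "it's" and 'say "hi"'
result=$(printf '%s\n' "\"it's\" and 'say \"hi\"'" | extract_quoted)
printf '%s\n' "$result"
```

Each alternative only excludes its own quote character, which is why the other kind of quote can appear inside the match.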


Last edited by Scrutinizer; 08-27-2014 at 11:44 AM..
# 3  
Old 08-27-2014
Quote:
Originally Posted by shellscripting
Code:
 # grep "\(["']\).*\1" file
>

Could someone shed some light on this error?
Code:
# grep "\([\"']\).*\1" file
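
For anyone wondering what the error actually was: in the original command the double quote inside the brackets ends the shell's double-quoted string, and the single quote that follows opens a new string that is never closed, so the shell prints its continuation prompt (>) and waits for more input. grep is never run. A minimal sketch of two ways to quote the pattern, assuming a few lines modelled on the sample file; sample_lines is a hypothetical helper:

```shell
# A few lines modelled on the sample file (hypothetical helper).
sample_lines() {
    printf '%s\n' 'Hello world' '"Hello" world' "hello 'world'" 'hello hello'
}

# Option 1: double-quoted pattern, inner double quote escaped.
escaped=$(sample_lines | grep "\([\"']\).*\1")

# Option 2: single-quoted pattern; '\'' closes the string, adds a
# literal single quote, and reopens the string.
single=$(sample_lines | grep '\(["'\'']\).*\1')

printf '%s\n' "$escaped"
```

Both forms hand the same regular expression to grep, so they select the same lines.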

# 4  
Old 08-27-2014
Quote:
Originally Posted by junior-helper
Code:
# grep "\([\"']\).*\1" file

Yes, but that would match the longest match on a line (greedy match), so it would not work if there are multiple occurrences on a line with the same kind of quotes.
Also, grep would return the whole line and not what was matched; you would need grep -o for that.

Code:
$ echo "\"Hello\" foo bar \"hello\" 'hello'" | grep "\([\"']\).*\1"
"Hello" foo bar "hello" 'hello'
$ echo "\"Hello\" foo bar \"hello\" 'hello'" | grep -o "\([\"']\).*\1"
"Hello" foo bar "hello"
'hello'

You could use
Code:
grep -o "\([\"']\)[^\"']*\1"

but then it would not work for single quotes inside double quotes, or vice versa. And you would still need to remove the quotes with a pipeline and a second command.
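
A minimal sketch of that pipeline, assuming sed as the second command and the same test line as above; strip_same_quotes is only an illustrative name:

```shell
# grep -o prints each quoted string on its own line; sed then removes
# the first and last character, i.e. the surrounding quotes.
# strip_same_quotes is a hypothetical helper name.
strip_same_quotes() {
    grep -o "\([\"']\)[^\"']*\1" | sed 's/^.//; s/.$//'
}

words=$(printf '%s\n' "\"Hello\" foo bar \"hello\" 'hello'" | strip_same_quotes)
printf '%s\n' "$words"
```

The limitation mentioned above still applies: a quote of the other kind inside a quoted string breaks the match.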

Last edited by Scrutinizer; 08-27-2014 at 10:27 AM..
# 5  
Old 08-27-2014
You're right. I tested only on the provided sample input (incl. -o) and did not think further :)
# 6  
Old 08-27-2014
Thank you both for explaining it with examples
 