sed match


 
# 8  
Old 05-06-2016
The .* is greedy, which means it consumes as many characters as possible while the rest of the expression can still match.
If there are two .* then the leftmost one takes priority: it grabs as much as it can, and the later one gets whatever is left.
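
A small illustration of that leftmost-wins rule, using a made-up input string (a-b-c-d is not from this thread). Both groups are .* but the first one should take everything up to the last dash, leaving only the tail for the second:

```shell
# Two greedy .* groups separated by a literal dash.
# The leftmost group claims as much as it can; the second gets the remainder.
out=$(echo 'a-b-c-d' | sed 's/\(.*\)-\(.*\)/[\1][\2]/')
echo "$out"
```

This prints [a-b-c][d]: the first group holds three of the four letters, even though the second group is just as "greedy" on paper.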

Last edited by MadeInGermany; 05-06-2016 at 10:32 AM..
# 9  
Old 05-06-2016
Hi, in the example below it's printing less than expected, not more:

Code:
echo "Date-range on 5th May is between -010516 and 050516- please continue "| sed 's/.*\(-.*-\).*/\1/'
output
-010516 and 050516-

If it were being greedy, wouldn't it print -range on 5th May is between -010516 and 050516- ?

I'm obviously confused by this.
# 10  
Old 05-06-2016
Greedy means earlier parts of the regex "win": they match as far as they can, and only give characters back when the rest of the expression fails to match. So the first .* initially matches all the way to the end of the line, taking the entire line if it can get away with it, and backtracking when it can't.
Code:
.*\(-.*-\).*

Code:
Date-range on 5th May is between -010516 and 050516- please continue
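
One way to see where the backtracking ends up is to capture all three parts of the expression and print each in brackets. This is only a sketch added for illustration; the extra groups, the bracket markers, and the variable names are not part of the original command:

```shell
line='Date-range on 5th May is between -010516 and 050516- please continue '
# \1 = leading .*    \2 = -.*-    \3 = trailing .*
parts=$(echo "$line" | sed 's/\(.*\)\(-.*-\)\(.*\)/[\1][\2][\3]/')
echo "$parts"
```

The result is [Date-range on 5th May is between ][-010516 and 050516-][ please continue ]. The leading .* has swallowed the first dash and only backed off far enough to leave two dashes for the middle group, which is why the captured text starts at the second dash rather than the first.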


Last edited by Corona688; 05-06-2016 at 02:23 PM..
# 11  
Old 05-06-2016
Quote:
Originally Posted by andy391791
Hi Don, thank you very much for your time in explaining that to me.

However, after doing a bit of testing I'm still slightly confused and would like to understand:


"The -.*- between the parens in that RE will match the 1st - on the line ( - in the RE), everything after the 1st - up to but not including the last - on the line ( .* in the RE), and the last - on the line ( - in the RE)."


Code:
echo "Date-range on 5th May is between -010516 and 050516- please continue "| sed 's/.*\(-.*-\).*/\1/'

output
-010516 and 050516-

Why is this not -range on 5th May is between -010516 and 050516- ?

Code:
echo "Date range on 5th May is between -010516 and 050516- please-continue "| sed 's/.*\(-.*-\).*/\1/'

output
- please-

Why is this not -010516 and 050516- please- ?


For the second statement:

"The RE between parentheses in this sed substitute command ( -[^-]*- ) matches the 1st - on the line ( - in the RE),
the longest string of characters available that does not include a - ( [^-]* in the RE), and the 2nd - on the line ( - in the RE)."

Code:
echo "Date-range on 5th May is between -010516 and 050516- please continue "| sed 's/.*\(-[^-]*-\).*/\1/'

output
-010516 and 050516-

Why is this not -range on 5th May is between - ?

Code:
echo "Date range on 5th May is between -010516 and 050516- please-continue "| sed 's/.*\(-[^-]*-\).*/\1/'

output
- please-

Why is this not -010516 and 050516- ?

Thanks again
I sincerely apologize. I haven't been getting enough sleep lately.

It looks like Corona688 and MadeInGermany have mostly cleaned up my mess. I sincerely thank both of them for correcting my earlier misinformation.

If you want to print everything between the 1st and last dashes on a line (including the dashes), you need something like:
Code:
echo "Date-range on 5th May is between -010516 and 050516- please continue "| sed 's/[^-]*\(-.*-\).*/\1/'

producing the output:
Code:
-range on 5th May is between -010516 and 050516-

If you want to print everything between the 1st and last dashes on a line (not printing the 1st and last dashes), you need something like:
Code:
echo "Date-range on 5th May is between -010516 and 050516- please continue "| sed 's/[^-]*-\(.*\)-.*/\1/'

producing the output:
Code:
range on 5th May is between -010516 and 050516

If you want to print everything between the 1st two dashes on a line (printing those dashes), you need something like:
Code:
echo "Date-range on 5th May is between -010516 and 050516- please continue "| sed 's/[^-]*\(-[^-]*-\).*/\1/'

producing the output:
Code:
-range on 5th May is between -

And, if you want to print everything between the 1st two dashes on a line (not printing those dashes), you need something like:
Code:
echo "Date-range on 5th May is between -010516 and 050516- please continue "| sed 's/[^-]*-\([^-]*\)-.*/\1/'

producing the output:
Code:
range on 5th May is between

Note that in all of these substitution BREs, the expression before the parenthesized expression we will print starts with [^-]* which will greedily gobble up as many non-dash characters as it can find (but not any dashes).
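
As a cross-check, the same [^-]* prefix can be applied to the second sample line from the quoted post (the one ending in please-continue). A sketch, assuming any POSIX sed; the variable names are only for illustration:

```shell
line='Date range on 5th May is between -010516 and 050516- please-continue '
# First dash through last dash, dashes included.
first_to_last=$(echo "$line" | sed 's/[^-]*\(-.*-\).*/\1/')
# First dash through second dash, dashes included.
first_two=$(echo "$line" | sed 's/[^-]*\(-[^-]*-\).*/\1/')
echo "$first_to_last"
echo "$first_two"
```

This prints -010516 and 050516- please- followed by -010516 and 050516-, which are the two results the quoted post was expecting from the versions that started with .* instead.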
This User Gave Thanks to Don Cragun For This Post:
# 12  
Old 05-09-2016
Don, absolutely no need to apologize; many thanks to you and the other posters for taking the time to explain this to me. It now makes sense!