blank space in regex pattern using sed


 
# 1  
Old 03-08-2010

Why does sed 's/.* //' show the last word in a line

and

sed 's/ .*//' show the first word in a line? How is that blank space before or after the ".*" being interpreted in the regex?

I would think the first example would delete the first word and the second example would delete the second word, because you have .* to match any number of characters and then a space, all substituted with nothing. Shouldn't that remove the first matched word on the line that is terminated with whitespace?

---------- Post updated at 07:07 PM ---------- Previous update was at 06:49 PM ----------

I think I get it... if you do /.* // that will grab the first thing on the line if it has no space before it, and replace it with whitespace, and then print the rest of the line,
but if you do / .*// it will ignore what is before the first space, and begin substitution after the first space, removing the rest of the line by substituting whitespace. Correct?
# 2  
Old 03-08-2010
Well, no, not completely correct.

It is helpful to use an interactive regex tool. Try the one at http://gskinner.com/RegExr/ or download one for your OS.

With respect to the particular two regex patterns that you asked about, look at the individual elements and you see why you get what you get:

' ' - a space - matches that ASCII character. Not a tab, not a new line, only a space. ASCII character 32.

'.' - is the RegEx wildcard. It means match any single character. sed works on one line at a time with the trailing newline removed, so '.' never gets to match a newline here. It is important to understand that '.' is a single character. [a-z] also matches a single character, but only one from the more limited set of 'abcdefghijklmnopqrstuvwxyz'.

'*' - is a RegEx quantifier, i.e., 'how many' of the preceding item. In this case, * means zero or more, taking as many as will match.
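To see those three elements on their own, here is a small sketch you can paste at a shell prompt. The function names and sample strings are made up for illustration; only the sed expressions matter.

```shell
# Sample input: a tab between "one" and "two", a space between "two" and "three".
sample=$(printf 'one\ttwo three')

# ' ' matches only a space: the tab is skipped, so the cut happens before "three".
space_only() { printf '%s\n' "$1" | sed 's/ .*//'; }

# '.' matches exactly one character: only the first character is replaced.
one_char() { printf '%s\n' "$1" | sed 's/./X/'; }

# '*' applies to the preceding item: 'o*' is a run of zero or more o's.
zero_or_more() { printf '%s\n' "$1" | sed 's/fo*/X/'; }

space_only "$sample"      # prints one<TAB>two
one_char 'abc'            # prints Xbc
zero_or_more 'fooobar'    # prints Xbar
zero_or_more 'fbar'       # prints Xbar (zero o's still matches)
```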

Those are the individual things, now look at how they combine:

/ .*/ means 'match one and only one space, and only a space, then match any and every character up to the end of the line.' Next, look at the replacement. You have 's/ .*//'. That means 'leave the line alone until the first space, then match that space plus everything after it and replace it all with nothing.' Hence only the first word is left. If you add a space at the beginning of your line, the match starts there and the first word is deleted as well.

/.* / means 'match any character (the '.'), as many times as you possibly can (the '*'), and then match a space.' The match has to end on a space; if the line contains no space at all, nothing matches.

s/.* // means 'match any character, as many as you can, including skipping over spaces (because it is greedy), until you come to the last space on the line so that the entire match still succeeds, and replace with nothing.' Hence it matches everything up to and including the space before the last word on the line. If you added a space after the final word, it would delete that word too.
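A minimal sketch of that behaviour, using a hypothetical helper name and sample lines that are not from the original post:

```shell
# Delete from the first space through the end of the line.
keep_first_word() { printf '%s\n' "$1" | sed 's/ .*//'; }

keep_first_word 'alpha beta gamma'    # prints alpha
keep_first_word ' alpha beta gamma'   # prints an empty line: the match starts at the leading space
keep_first_word 'alpha'               # prints alpha: no space, so nothing is replaced
```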
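The greedy case can be checked the same way. Again, the helper name and inputs are only illustrative:

```shell
# Delete everything up to and including the last space on the line.
keep_last_word() { printf '%s\n' "$1" | sed 's/.* //'; }

keep_last_word 'alpha beta gamma'     # prints gamma
keep_last_word 'alpha beta gamma '    # prints an empty line: the trailing space is now the last space
keep_last_word 'alpha'                # prints alpha: no space, so nothing is replaced
```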

If you changed the pattern to s/.*? // in a Perl-style engine (such as RegExr or perl) then it is "ungreedy", meaning it stops at the shortest match, whereas s/.* // is "greedy" -- it takes as many characters as possible while still matching. s/.*? // will therefore delete only the first word and the following space. If you have a space at the beginning of the line, it will stop at that space because '*' means 'zero or more.' Note that sed itself does not support the ungreedy '*?' form; in sed, s/[^ ]* // (zero or more non-space characters, then a space) gives the same result.

Play with the interactive form and it will hit you like lightning at a certain point...
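sed's POSIX regular expressions have no ungreedy '*?' quantifier, so one common way to get the 'shortest match' effect in sed is a negated bracket expression. This is a sketch with a made-up helper name, not the only way to do it:

```shell
# '[^ ]*' cannot cross a space, so the match ends at the first space on the line.
drop_first_word() { printf '%s\n' "$1" | sed 's/[^ ]* //'; }

drop_first_word 'alpha beta gamma'    # prints beta gamma
drop_first_word ' alpha beta gamma'   # prints alpha beta gamma: zero non-spaces, then the leading space
```

Because there is no g flag, only the first match on each line is replaced.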

Last edited by drewk; 03-08-2010 at 10:29 PM..
 