Understanding sed | Unix Linux Forums | Shell Programming and Scripting

    #1  
Old 03-01-2013
panyam

Hi,

Can someone explain how sed manages to delete the second field here?

Any explanation of how the code below works would be appreciated.


Code:
sed 's/^\([^:]*\):[^:]:/\1::/'  /etc/passwd

sed 's/[^:]*:/:/2'  /etc/passwd

    #2  
Old 03-01-2013
Scrutinizer
Hi, the first one deletes the second field only if it consists of exactly one character. It replaces the first field, a colon, a single character, and another colon with the first field (captured with \( .. \) and recalled by \1) followed by two colons.

The second replaces any number of non-colons followed by a colon with a single colon; the number 2 means that it does this only for the second occurrence on a line.
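
To make the difference between the two commands concrete, here is a small sketch. The sample lines are made up for illustration (they are not taken from a real /etc/passwd), and the variable names are arbitrary.

```shell
# Hypothetical passwd-style lines, chosen only for illustration.
line='root:x:0:0:root:/root:/bin/sh'
long='root:xy:0:0'

# First command: needs a second field of exactly one character.
first=$(printf '%s\n' "$line" | sed 's/^\([^:]*\):[^:]:/\1::/')
first_long=$(printf '%s\n' "$long" | sed 's/^\([^:]*\):[^:]:/\1::/')

# Second command: replaces the 2nd occurrence of "non-colons + colon".
second=$(printf '%s\n' "$line" | sed 's/[^:]*:/:/2')
second_long=$(printf '%s\n' "$long" | sed 's/[^:]*:/:/2')

printf '%s\n' "$first"        # root::0:0:root:/root:/bin/sh
printf '%s\n' "$first_long"   # root:xy:0:0  (no match, line unchanged)
printf '%s\n' "$second"       # root::0:0:root:/root:/bin/sh
printf '%s\n' "$second_long"  # root::0:0
```

So the first form silently leaves a line alone when the second field is longer than one character, while the second form handles any length.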
    #3  
Old 03-01-2013
panyam
Hi Scrutinizer,

Thanks for the reply. I understood your second explanation.

Your first answer confuses me slightly, though.

Let's say the sample input is:


Code:
abcd:x:panyam:panyam:Panyam:512

echo "abcd:x:panyam:panyam:Panyam:512" | sed 's/^\([^:]*\):[^:]:/\1::/' gives me:

abcd::panyam:panyam:Panyam:512

\([^:]*\) matches: abcd
[^:] matches: x

\1 prints only "abcd".

Now, how does the rest of the line end up in the output unchanged? I mean
Code:
"panyam:panyam:Panyam:512"

Is it because sed prints the non-matching part of the line as is?

Is my understanding correct?
    #4  
Old 03-01-2013
Scrutinizer
Hi Panyam, yes, the rest of the line remains unaltered. It is not part of the match, so the substitution does not touch it and it is printed as is.
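
One way to see which part of the line the regex actually consumes is to put the match back with & and wrap it in brackets. This is only an illustrative sketch using the sample line from the question; the brackets are an arbitrary marker.

```shell
line='abcd:x:panyam:panyam:Panyam:512'

# "&" in the replacement stands for the whole matched text,
# so the brackets show exactly what s/// would replace.
marked=$(printf '%s\n' "$line" | sed 's/^[^:]*:[^:]:/[&]/')

printf '%s\n' "$marked"   # [abcd:x:]panyam:panyam:Panyam:512
```

Everything outside the brackets was never matched, which is why it appears in the output untouched.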
    #5  
Old 03-01-2013
gary_w
An attempt to further explain the regular expression, with an ulterior motive of setting up for a question.

Given:

Code:
sed 's/^\([^:]*\):[^:]:/\1::/'

search for a pattern in the string matching:

Code:
^    = start of the line
\(   = start of the first remembered pattern
[^:] = any character that is not a colon
*    = zero or more of the preceding character class
       (characters that are not colons)
\)   = end of the first remembered pattern
:    = followed by a colon
[^:] = followed by exactly one character that is not a colon
:    = followed by a colon

<describes the first 2 fields, along with their separators; the second field must be exactly one character>

Replace with:

Code:
\1   = the first remembered pattern (the first field)
::   = followed by 2 literal colons

In other words, replace the first 2 colon-separated fields
with the first field and 2 colons (this deletes the 2nd field).

Question: if this sed command was in a script, could it be commented like I did above in the code? Can a sed regex be multi-line with comments?

One could also do:

Code:
s/:[^:]*:/::/

    #6  
Old 03-01-2013
alister
Quote:
Originally Posted by gary_w
Question: if this sed command was in a script, could it be commented like I did above in the code? Can a sed regex be multi-line with comments?
Nope. You can have comments in a sed script, but not within a regular expression. What you are asking is possible with perl if you use the /x regular expression modifier.

Regards,
Alister
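
As a rough sketch of what is possible: comments can sit on their own lines in the sed script, above the command they describe, even though they cannot go inside the regex itself. The sample input and the comment wording are made up for illustration.

```shell
# The sed script is the multi-line quoted argument; lines starting with "#"
# inside it are sed comments, not shell comments.
out=$(printf '%s\n' 'abcd:x:panyam' | sed '
# Delete the second colon-separated field:
#   :      a colon
#   [^:]*  any number of non-colons
#   :      a colon
# and put back two colons.
s/:[^:]*:/::/
')

printf '%s\n' "$out"   # abcd::panyam
```

The regex stays on one line; only the commentary around it is spread out.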

---------- Post updated at 10:54 PM ---------- Previous update was at 10:49 PM ----------

Quote:
Originally Posted by gary_w
One could also do:

Code:
s/:[^:]*:/::/

s/:[^:]*/:/ would work just as well (unless it's necessary to prevent the last field in a line without a trailing colon from matching), or even s/[^:]*//2 .

Regards,
Alister
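
A quick sketch of where the two colon-anchored variants differ, using made-up input lines: on a line with at least three fields they agree, but on a two-field line only the variant without the trailing colon matches.

```shell
full='abcd:x:panyam'   # hypothetical line with three fields
short='abcd:x'         # hypothetical line whose last field has no trailing colon

a=$(printf '%s\n' "$full"  | sed 's/:[^:]*:/::/')
b=$(printf '%s\n' "$full"  | sed 's/:[^:]*/:/')
c=$(printf '%s\n' "$short" | sed 's/:[^:]*:/::/')
d=$(printf '%s\n' "$short" | sed 's/:[^:]*/:/')

printf '%s\n' "$a"   # abcd::panyam
printf '%s\n' "$b"   # abcd::panyam
printf '%s\n' "$c"   # abcd:x   (no trailing colon, so no match)
printf '%s\n' "$d"   # abcd:    (last field removed)
```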
    #7  
Old 03-02-2013
Scrutinizer
Quote:
Originally Posted by alister
[..] or even s/[^:]*//2
That is what I would think too, but this does not work like that everywhere. It works with GNU sed, sed on AIX 7, and regular sed on Solaris, but not with /usr/xpg4/bin/sed on Solaris, nor with sed on HP-UX, OS X, and some other UNIX flavors.

In those cases where it does not work, the desired effect was obtained when s/[^:]*//3 was used instead (and for the 3rd field s/[^:]*//5 and so on).

How can this be? I think this may have to do with how the respective regex engines interpret a zero-length match after a previous match. The first match of

echo aaa:bbb:ccc:ddd:eee:fff | sed 's/[^:]*//'

renders

Code:
:bbb:ccc:ddd:eee:fff

On this every engine agrees. After the first match, the engine is positioned just after that match and before the first colon. But what then constitutes the next match? For GNU sed and the others mentioned above, this apparently means the next run of non-colon characters after the first colon. But the other engines apparently interpret zero repetitions of non-colons before the colon as the next match, which is an empty string, and which I guess could be labeled a "strict" interpretation of [^:]* .

Anyway, it seems safest to include one colon in the match, as in the OP's second example, or to insist on a pattern of 1 or more non-colons, i.e. sed 's/[^:][^:]*//2' or sed 's/[^:]\{1,\}//2'

Regards,

S.
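
For illustration, here are the two "1 or more non-colons" forms applied to the same sample line. Because neither pattern can match an empty string, the second occurrence should be the second field regardless of how an engine treats zero-length matches; that is my expectation, and only the result shown was worked through here.

```shell
line='aaa:bbb:ccc:ddd:eee:fff'

# One non-colon followed by any number of non-colons (never empty).
plus=$(printf '%s\n' "$line" | sed 's/[^:][^:]*//2')

# Same idea written with a BRE interval expression.
interval=$(printf '%s\n' "$line" | sed 's/[^:]\{1,\}//2')

printf '%s\n' "$plus"       # aaa::ccc:ddd:eee:fff
printf '%s\n' "$interval"   # aaa::ccc:ddd:eee:fff
```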
