

UNIX for Dummies Questions & Answers If you're not sure where to post a UNIX or Linux question, post it here. All UNIX and Linux newbies welcome !!

    #1  
Old 04-14-2012
Registered User
 
Join Date: Apr 2012
Posts: 58
Thanks: 30
Thanked 0 Times in 0 Posts
gensub and array with awk

Hi Unix.com !

I need some help with something I don't understand.

input:

Code:
111|2 Y Z blue.
333|4 W X blue.; 5 Y Z red.
666|7 W X red.; 8 Y Z blue.
999|10 U V red.; 11 W X blue.; 12 Y Z red.

From $2, I would like to remove the sub-strings containing "blue" (and the associated "; " before or after), and print the new strings in $3, in order to obtain this output:

Code:
111|2 Y Z blue.|
333|4 W X blue.; 5 Y Z red.|5 Y Z red.
666|7 W X red.; 8 Y Z blue.|7 W X red.
999|10 U V red.; 11 W X blue.; 12 Y Z red.|10 U V red.; 12 Y Z red.

I tried:

Code:
BEGIN{FS=OFS="|"}

NR==1{
          $3 = "Field3"
          }
NR>1{

    $3 = $2
	
    y = split($3,x,"; ")
        for (i=1; i<=y; i++){
        	
            if (x[i] ~ /blue/ && y=1){
            	$3 = gensub(x[i],"","g",$3)
        	}
        	
            else if (x[i] ~ /blue/ && i<y)
            	$3 = gensub(x[i]"; ","","g",$3)
            
            else if (x[i] ~ /blue/ && i=y)
                $3 = gensub("; "x[i],"","g",$3)
        }
        
    
}1

I cannot get rid of the "; " !


Code:
Field1|Field2|Field3
111|2 Y Z blue.|
333|4 W X blue.; 5 Y Z red.|; 5 Y Z red.
666|7 W X red.; 8 Y Z blue.|7 W X red.; 
999|10 U V red.; 11 W X blue.; 12 Y Z red.|10 U V red.; ; 12 Y Z red.

I tried writing the regexp with slashes, double quotes, braces and brackets in every possible position ... always the same result.

HELP !!!!
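
One likely culprit in the attempt above is the single `=` in `y=1` and `i=y`: in awk a single `=` assigns, it does not compare. The output shown is consistent with `y` being reset to 1 as soon as a "blue" piece is found, so only the first branch ever runs (removing the piece but not its "; ") and the loop then stops. A minimal sketch of the difference, using only POSIX awk; the function name `assign_vs_compare` and the variable `n` are made up for illustration:

```shell
assign_vs_compare() {
  awk 'BEGIN {
    n = 3
    if (n == 1) print "== branch taken"   # comparison: false, n is unchanged
    print "after ==: n=" n
    if (n = 1) print "= branch taken"     # assignment: n becomes 1, and 1 counts as true
    print "after =: n=" n
  }'
}

assign_vs_compare
```

Only the `=` branch prints, and `n` ends up as 1. Replacing `=` with `==` in the conditions is probably the first thing to try, although the gensub approach would still be fragile when a field holds more than one "blue" piece.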

Last edited by beca123456; 04-14-2012 at 04:47 AM..
    #2  
Old 04-14-2012
Registered User
 
Join Date: Oct 2010
Location: Bilbao, Spain
Posts: 628
Thanks: 8
Thanked 173 Times in 171 Posts
Hi beca123456,

One way:

Code:
$ cat infile
111|2 Y Z blue.
333|4 W X blue.; 5 Y Z red.
666|7 W X red.; 8 Y Z blue.
999|10 U V red.; 11 W X blue.; 12 Y Z red.
$ cat script.awk
BEGIN {
        FS = OFS = "|";
}

{
        split( $2, f, /;/ );
        for ( i = 1; i <= length( f ); i++ ) {
                if ( match( f[i], /blue/ ) >  0 ) {
                        continue;
                }
                $3 = $3 ( $3 ? ";" : "" ) f[i];
        }

        print $0;
}
$ awk -f script.awk infile
111|2 Y Z blue.
333|4 W X blue.; 5 Y Z red.| 5 Y Z red.
666|7 W X red.; 8 Y Z blue.|7 W X red.
999|10 U V red.; 11 W X blue.; 12 Y Z red.|10 U V red.; 12 Y Z red.
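
Note that the output above differs slightly from what was asked for: the first line has no trailing "|", and the second line keeps a leading space in the new field. A possible variant that reproduces the requested output exactly is sketched below. It is not birei's script; it assumes the pieces are always separated by exactly "; ", it uses the count returned by `split` instead of `length()` on an array (which not every awk supports), and the names `drop_blue`, `piece` and `out` are made up for illustration:

```shell
drop_blue() {
  awk '
    BEGIN { FS = OFS = "|" }
    {
      n = split($2, piece, "; ")         # n = number of "; "-separated pieces
      out = ""
      for (i = 1; i <= n; i++) {
        if (piece[i] ~ /blue/) continue  # skip pieces that mention blue
        out = out (out == "" ? "" : "; ") piece[i]
      }
      $3 = out                           # always set $3, so the "|" is printed even when empty
      print
    }
  ' "$@"
}

printf '%s\n' '111|2 Y Z blue.' '333|4 W X blue.; 5 Y Z red.' | drop_blue
```

It reads the file names given as arguments, or standard input when there are none, so `drop_blue infile` should work the same way.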

    #3  
Old 04-14-2012
Registered User
 
Join Date: Apr 2012
Posts: 58
Thanks: 30
Thanked 0 Times in 0 Posts
Thanks birei, works great !

Although, I have a question. I am not sure I understand this part:

Code:
$3 = $3 ( $3 ? ";" : "" ) f[i]

Tell me if I am wrong (I matched the parts with colours):

Code:
$3 = $3 ( $3 ? ";" : "" ) f[i]

If the condition above is true, does this set $3 to the current $3 (if it exists), followed by ";" or nothing, depending on whether it is the end of the field?
So in this case, why do you add f[i] at the end?

Does ":" stand for a logical OR?
    #4  
Old 04-15-2012
Scrutinizer's Avatar
Moderator
 
Join Date: Nov 2008
Location: Amsterdam
Posts: 7,341
Thanks: 144
Thanked 1,754 Times in 1,591 Posts
f[i] gets added because it needed to be added in the first place; the part in parentheses only determines whether a ";" should be put in between.
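
As for the ":", it is not a logical OR. `cond ? a : b` is the conditional operator: it yields `a` when `cond` is true and `b` otherwise, and in awk an empty string counts as false. Strings written next to each other are simply concatenated. A small sketch of the same idiom on made-up data (the function name `join_demo` and the sample words are only for illustration):

```shell
join_demo() {
  awk 'BEGIN {
    s = ""
    n = split("red green yellow", word, " ")
    for (i = 1; i <= n; i++) {
      sep = (s ? ";" : "")    # s is empty on the first pass, so no separator yet
      s = s sep word[i]       # old value, then separator, then the new piece
      print i ": " s
    }
  }'
}

join_demo
```

This prints `1: red`, `2: red;green` and `3: red;green;yellow`, so the separator only appears once there is already something to separate.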

Alternative version:

Code:
awk -F'[|]|; ' '{s=""; for(i=2;i<=NF;i++) if($i!~/blue/) s=s (s?"; ":"") $i; print $0"|"s}' infile
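
The one-liner relies on the field separator being a regular expression that matches either a "|" or a "; ", so every piece becomes its own field and the loop can start at field 2. Since no field is assigned, `$0` stays untouched and the rebuilt string is appended after a "|". A quick way to see the splitting, as a sketch (the helper name `show_fields` is made up):

```shell
show_fields() {
  awk -F'[|]|; ' '{
    # print every field of the record in brackets, one per line
    for (i = 1; i <= NF; i++) printf "$%d=[%s]\n", i, $i
  }' "$@"
}

printf '%s\n' '999|10 U V red.; 11 W X blue.; 12 Y Z red.' | show_fields
```

For that sample line it reports four fields: `999`, `10 U V red.`, `11 W X blue.` and `12 Y Z red.`.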


Last edited by Scrutinizer; 04-15-2012 at 12:43 AM..
    #5  
Old 04-15-2012
Registered User
 
Join Date: Apr 2012
Posts: 58
Thanks: 30
Thanked 0 Times in 0 Posts
Thanks Scrutinizer !

I get it now.