Modifying col values based on another col | Unix Linux Forums | Shell Programming and Scripting

  Go Back    


Shell Programming and Scripting Post questions about KSH, CSH, SH, BASH, PERL, PHP, SED, AWK and OTHER shell scripts and shell scripting languages here.

Modifying col values based on another col

Shell Programming and Scripting


Closed Thread    
 
Thread Tools Search this Thread Display Modes
    #1  
Old 01-25-2013
newbie83 newbie83 is offline
Registered User
 
Join Date: Nov 2011
Last Activity: 19 May 2014, 3:50 PM EDT
Posts: 86
Thanks: 56
Thanked 2 Times in 1 Post
Modifying col values based on another col

Hi,

Please help with this.
I have several excel files (with and .xlsx format) with 10-15 columns each.
They all have the same type of data but the columns are not ordered in the same way.

Here is a 3 column example. What I want to do add the alphabet
from column 2 to column 3, provided they both have the keyword 'BAM'
in them. Also the prefix 12 needs to be removed (if it is present before BAM).
Also the second column may have a space between BAM and 2A, that space needs to be ignored. So BAM2A is the same as BAM 2A, similarly if space is present in 3rd col, that has to be removed in the output.
So 12BAM2 -1 becomes BAM2A-1.
The last row remains unchanged because both column dont have BAM.

Input sample


Code:
20	12BAM 2A		12BAM2 -1
20	12BAM 2A		12BAM2 -2
50	12BAM2A   		12BAM2 -3
20	12BAM2B  		12BAM2 -4
20	12BAM 2B		12BAM2 -5
520	12BAM 2A		XA3023

Expected output


Code:
20	12BAM 2A		BAM2A-1
20	12BAM 2A		BAM2A-2
50	12BAM2A  	   	BAM2A-3
20	12BAM2B  		BAM2B-4
20	12BAM 2B		BAM2B-5
520	12BAM 2A		XA3023

Sponsored Links
    #2  
Old 01-25-2013
Yoda's Avatar
Yoda Yoda is offline Forum Advisor  
Jedi Master
 
Join Date: Jan 2012
Last Activity: 26 September 2014, 6:37 PM EDT
Location: Galactic Empire
Posts: 3,385
Thanks: 234
Thanked 1,208 Times in 1,134 Posts

Code:
awk -F'\t' '{
 i=$2;j=$2;
 sub(/[0-9]+/,"",i);
 sub(/[0-9].*/,"",i);
 sub(" ","",i);
 sub(i," ",j);
 sub(/^[0-9]+/,"",j);
 if($4~i) {
  sub(/^[0-9]+/,"",$4);
  sub(/[0-9]+/,j,$4);
  gsub(" ","",$4);
 }
}1' OFS='\t' file


Last edited by Yoda; 01-25-2013 at 12:22 PM.. Reason: Correction
The Following User Says Thank You to Yoda For This Useful Post:
newbie83 (01-25-2013)
Sponsored Links
    #3  
Old 01-25-2013
newbie83 newbie83 is offline
Registered User
 
Join Date: Nov 2011
Last Activity: 19 May 2014, 3:50 PM EDT
Posts: 86
Thanks: 56
Thanked 2 Times in 1 Post
thanks, will this work on the excel files?
    #4  
Old 01-25-2013
Corona688 Corona688 is offline Forum Staff  
Mead Rotor
 
Join Date: Aug 2005
Last Activity: 1 October 2014, 1:35 PM EDT
Location: Saskatchewan
Posts: 19,480
Thanks: 796
Thanked 3,288 Times in 3,083 Posts
excel can so easily export and import tab-separated data that there's little point installing 19,000 perl modules and writing 3,000 lines of code just to write excel files.

Excel files are more complicated in several ways, too. You have to start worrying about multiple worksheets for instance, formulas, merged columns, and all of that. And the .xlsx files are particularly obnoxious.
The Following User Says Thank You to Corona688 For This Useful Post:
newbie83 (01-25-2013)
Sponsored Links
    #5  
Old 01-25-2013
newbie83 newbie83 is offline
Registered User
 
Join Date: Nov 2011
Last Activity: 19 May 2014, 3:50 PM EDT
Posts: 86
Thanks: 56
Thanked 2 Times in 1 Post
I will try to run the code now with exported text, but naively looking at the code, it doesn't consider the keyword 'BAM' to be matched for both columns, is there another way that is being matched?
Sponsored Links
    #6  
Old 01-25-2013
Yoda's Avatar
Yoda Yoda is offline Forum Advisor  
Jedi Master
 
Join Date: Jan 2012
Last Activity: 26 September 2014, 6:37 PM EDT
Location: Galactic Empire
Posts: 3,385
Thanks: 234
Thanked 1,208 Times in 1,134 Posts
Actually it is considering the keyword BAM

I wrote it in a generic way so that other keywords will also be matched. Try running for a set of data and verify if you are getting the desired output. Just make sure you preserve the tab spaces since the code is using it as field separator.
Sponsored Links
    #7  
Old 01-25-2013
newbie83 newbie83 is offline
Registered User
 
Join Date: Nov 2011
Last Activity: 19 May 2014, 3:50 PM EDT
Posts: 86
Thanks: 56
Thanked 2 Times in 1 Post
the code runs perfect with the sample data,

One of my actual input files has col 2 as col 8, and col 3 as col 7.
I can replace $2 by $8, what should i replace $4 with ? there is no mention of col3 in the code.
Sorry for the naive question :-/
Sponsored Links
Closed Thread

Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

More UNIX and Linux Forum Topics You Might Find Helpful
Thread Thread Starter Forum Replies Last Post
Printing from col x to end of line, except last col LMHmedchem Shell Programming and Scripting 8 01-07-2013 08:12 PM
Print line based on highest value of col (B) and repetion of values in col (A) imahmoud UNIX for Advanced & Expert Users 4 07-25-2011 12:04 PM
Match col 1 of File 1 with col 1 File 2 and create a 3rd file sogi Ubuntu 7 06-30-2009 11:13 PM
Awk to print distinct col values anduzzi Shell Programming and Scripting 22 08-18-2008 05:09 PM
Help On col command rahulrathod UNIX for Advanced & Expert Users 1 09-13-2004 06:41 AM



All times are GMT -4. The time now is 02:24 AM.