|
|||||||
| Forums | Search Forums | Register | Forum Rules | Man Pages | Albums | FAQ | Members | Calendar | Search | Today's Posts | Mark Forums Read |
| Shell Programming and Scripting Post questions about KSH, CSH, SH, BASH, PERL, PHP, SED, AWK and OTHER shell scripts and shell scripting languages here. |
|
|
|
Thread Tools | Search this Thread | Display Modes |
|
#1
|
|||
|
|||
|
Modifying col values based on another col
Hi, Please help with this. I have several excel files (with and .xlsx format) with 10-15 columns each. They all have the same type of data but the columns are not ordered in the same way. Here is a 3 column example. What I want to do add the alphabet from column 2 to column 3, provided they both have the keyword 'BAM' in them. Also the prefix 12 needs to be removed (if it is present before BAM). Also the second column may have a space between BAM and 2A, that space needs to be ignored. So BAM2A is the same as BAM 2A, similarly if space is present in 3rd col, that has to be removed in the output. So 12BAM2 -1 becomes BAM2A-1. The last row remains unchanged because both column dont have BAM. Input sample Code:
20 12BAM 2A 12BAM2 -1 20 12BAM 2A 12BAM2 -2 50 12BAM2A 12BAM2 -3 20 12BAM2B 12BAM2 -4 20 12BAM 2B 12BAM2 -5 520 12BAM 2A XA3023 Expected output Code:
20 12BAM 2A BAM2A-1 20 12BAM 2A BAM2A-2 50 12BAM2A BAM2A-3 20 12BAM2B BAM2B-4 20 12BAM 2B BAM2B-5 520 12BAM 2A XA3023 |
| Sponsored Links | ||
|
|
#2
|
||||
|
||||
|
Code:
awk -F'\t' '{
i=$2;j=$2;
sub(/[0-9]+/,"",i);
sub(/[0-9].*/,"",i);
sub(" ","",i);
sub(i," ",j);
sub(/^[0-9]+/,"",j);
if($4~i) {
sub(/^[0-9]+/,"",$4);
sub(/[0-9]+/,j,$4);
gsub(" ","",$4);
}
}1' OFS='\t' fileLast edited by Yoda; 01-25-2013 at 12:22 PM.. Reason: Correction |
| The Following User Says Thank You to Yoda For This Useful Post: | ||
newbie83 (01-25-2013) | ||
| Sponsored Links | ||
|
|
#3
|
|||
|
|||
|
thanks, will this work on the excel files?
|
|
#4
|
|||
|
|||
|
excel can so easily export and import tab-separated data that there's little point installing 19,000 perl modules and writing 3,000 lines of code just to write excel files.
Excel files are more complicated in several ways, too. You have to start worrying about multiple worksheets for instance, formulas, merged columns, and all of that. And the .xlsx files are particularly obnoxious. |
| The Following User Says Thank You to Corona688 For This Useful Post: | ||
newbie83 (01-25-2013) | ||
| Sponsored Links | |
|
|
#5
|
|||
|
|||
|
I will try to run the code now with exported text, but naively looking at the code, it doesn't consider the keyword 'BAM' to be matched for both columns, is there another way that is being matched?
|
| Sponsored Links | |
|
|
#6
|
||||
|
||||
|
Actually it is considering the keyword
BAM
I wrote it in a generic way so that other keywords will also be matched. Try running for a set of data and verify if you are getting the desired output. Just make sure you preserve the tab spaces since the code is using it as field separator. |
| Sponsored Links | |
|
|
#7
|
|||
|
|||
|
the code runs perfect with the sample data,
One of my actual input files has col 2 as col 8, and col 3 as col 7. I can replace $2 by $8, what should i replace $4 with ? there is no mention of col3 in the code. Sorry for the naive question :-/ |
| Sponsored Links | ||
|
![]() |
| Thread Tools | Search this Thread |
| Display Modes | |
More UNIX and Linux Forum Topics You Might Find Helpful
|
||||
| Thread | Thread Starter | Forum | Replies | Last Post |
| Printing from col x to end of line, except last col | LMHmedchem | Shell Programming and Scripting | 8 | 01-07-2013 08:12 PM |
| Print line based on highest value of col (B) and repetion of values in col (A) | imahmoud | UNIX for Advanced & Expert Users | 4 | 07-25-2011 12:04 PM |
| Match col 1 of File 1 with col 1 File 2 and create a 3rd file | sogi | Ubuntu | 7 | 06-30-2009 11:13 PM |
| Awk to print distinct col values | anduzzi | Shell Programming and Scripting | 22 | 08-18-2008 05:09 PM |
| Help On col command | rahulrathod | UNIX for Advanced & Expert Users | 1 | 09-13-2004 06:41 AM |
|
|