Help with awk replacing identical columns based on another file


 
# 1  
Old 11-26-2012
Hello,

I am using awk on Ubuntu 12.04.

I have a file like the following, with three fields and 44706 rows:
F1 A A
F2 G G
F3 A T

I have another file like this:

AL_1 F1 A A
AL_2 F1 A T
AL_3 F1 A A
AL_1 F2 G G
AL_2 F2 G A
AL_3 F2 G G
BO_1 F1 A A
BO_2 F1 A T
BO_1 F2 G A
BO_2 F2 G G
CO_1 F1 A T
CO_2 F1 T T
CO_1 F2 G G
CO_2 F2 G A

(I have not listed the F3 rows here, but the file contains F3 as well.)

This may sound a bit complex. The output I want looks like this:

F1 A T
F2 G A
F3 A T

Sorry if this sounds complex. What I want to do is look in the second file for an "F1" row whose 3rd and 4th fields are not the same (i.e. A T instead of A A) and use those two values to replace the 2nd and 3rd fields of the "F1" row in the first file.

The same goes for F2, F3 and the rest.

Thank you very much for any help.
# 2  
Old 11-26-2012
Try this,
Code:
awk 'NR==FNR{Ar[$1$2$3]=1; next} !($2$3$4 in Ar){Ar[$2$3$4]=1;print $2,$3,$4}' file1 file2

Output with the sample data:
F1 A T
F2 G A
F1 T T
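
For readers following along, here is a spelled-out sketch of the same idea, run on the sample data from the first post (the scratch directory and the variable names are only for illustration). It prints every marker/genotype combination from file2 that file1 does not already list, which is why F1 T T shows up as well. One assumed change: the array keys use awk's comma subscripts instead of plain concatenation, so that neighbouring fields cannot run together.

```shell
dir=$(mktemp -d)   # scratch directory for the sample files

cat > "$dir/file1" <<'EOF'
F1 A A
F2 G G
F3 A T
EOF

cat > "$dir/file2" <<'EOF'
AL_1 F1 A A
AL_2 F1 A T
AL_3 F1 A A
AL_1 F2 G G
AL_2 F2 G A
AL_3 F2 G G
BO_1 F1 A A
BO_2 F1 A T
BO_1 F2 G A
BO_2 F2 G G
CO_1 F1 A T
CO_2 F1 T T
CO_1 F2 G G
CO_2 F2 G A
EOF

result=$(awk '
    # file1: remember every marker/genotype combination it already lists
    NR == FNR { seen[$1, $2, $3] = 1; next }
    # file2: print each new combination once, skipping anything file1 had
    !(($2, $3, $4) in seen) { seen[$2, $3, $4] = 1; print $2, $3, $4 }
' "$dir/file1" "$dir/file2")

printf '%s\n' "$result"
```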

# 3  
Old 11-26-2012
Try:
Code:
awk 'NR==FNR&&$3!=$4{a[$2]=$3" "$4}NR==FNR{next}$1 in a{$2=a[$1];$3="";}1' file2 file1
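
A commented sketch along the same lines, for anyone who wants to see the two passes separately. It is not identical to the one-liner above: as an assumption on my part it keeps the first pair with differing 3rd and 4th fields found for each F (the one-liner keeps the last), and it assigns the two fields separately, which avoids the trailing space that `$3=""` leaves on modified lines. The scratch directory and array names are made up for the example.

```shell
dir=$(mktemp -d)   # scratch directory for the sample files

cat > "$dir/file1" <<'EOF'
F1 A A
F2 G G
F3 A T
EOF

cat > "$dir/file2" <<'EOF'
AL_1 F1 A A
AL_2 F1 A T
AL_3 F1 A A
AL_1 F2 G G
AL_2 F2 G A
AL_3 F2 G G
BO_1 F1 A A
BO_2 F1 A T
BO_1 F2 G A
BO_2 F2 G G
CO_1 F1 A T
CO_2 F1 T T
CO_1 F2 G G
CO_2 F2 G A
EOF

result=$(awk '
    # First pass (file2): store the first pair whose two values differ, per F
    NR == FNR {
        if ($3 != $4 && !($2 in het)) {
            het[$2] = 1; a1[$2] = $3; a2[$2] = $4
        }
        next
    }
    # Second pass (file1): swap in the stored pair when there is one
    $1 in het { $2 = a1[$1]; $3 = a2[$1] }
    { print }
' "$dir/file2" "$dir/file1")

printf '%s\n' "$result"
```

With the sample data this prints F1 A T, F2 G A and F3 A T. F3 is left as it is because the sample file2 has no F3 rows.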

# 4  
Old 11-26-2012
Try

Code:
awk 'NR==FNR{X[$1]=$2" "$3;next}{if(X[$2]){split(X[$2],P," ")
if(P[1] == P[2]){if($3 != $4){print $2,$3,$4;delete X[$2]}}else{print $2,X[$2];delete X[$2]}}}' file1 file2

# 5  
Old 11-26-2012
Sorry, but both bartus11's and pamu's scripts miss the F1 T T record from your sample file.
# 6  
Old 11-26-2012
But the output from bartus11's code is correct. I don't want all the lines from my second file; in particular, I don't want rows whose last two columns are identical for a given F.

For example, I don't want T T for F1; if there is an A T for F1 instead, I want that in my output.

Thank you.
# 7  
Old 11-26-2012
Quote:
Originally Posted by Homa
But the output from bartus11's code is correct. I don't want all the lines from my second file; in particular, I don't want rows whose last two columns are identical for a given F.

For example, I don't want T T for F1; if there is an A T for F1 instead, I want that in my output.

Thank you.

If all of your file1 entries are present in file2, then my script will produce the output you need.
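
A quick way to check that condition, sketched on the sample data from the first post (file names, the scratch directory and the `seen` array are only illustrative): list every F from file1 that never appears in the second column of file2.

```shell
dir=$(mktemp -d)   # scratch directory for the sample files

cat > "$dir/file1" <<'EOF'
F1 A A
F2 G G
F3 A T
EOF

cat > "$dir/file2" <<'EOF'
AL_1 F1 A A
AL_2 F1 A T
AL_3 F1 A A
AL_1 F2 G G
AL_2 F2 G A
AL_3 F2 G G
BO_1 F1 A A
BO_2 F1 A T
BO_1 F2 G A
BO_2 F2 G G
CO_1 F1 A T
CO_2 F1 T T
CO_1 F2 G G
CO_2 F2 G A
EOF

missing=$(awk '
    # file2: note every F that occurs in column 2
    NR == FNR { seen[$2] = 1; next }
    # file1: report each F that file2 never mentioned
    !($1 in seen) { print $1 }
' "$dir/file2" "$dir/file1")

printf '%s\n' "$missing"
```

On the shortened sample this reports F3, since the F3 rows were left out of the listing. If it prints nothing for the real files, every file1 entry has at least one row in file2.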