NR==FNR trick for joining columns from two files


 
# 8  
Old 10-11-2011
Quote:
Originally Posted by genehunter
...from foo.txt into a new output that has the bar.txt intact
What do you mean?


1) Do you want only the foo entries that have their rs# in bar?

2) Do you want only the foo entries that have their [0-9]##### in bar?

3) Do you want the foo entries that have both their rs# AND their [0-9]##### in bar?

4) What if an entry is in bar but not in foo, and all the replacements in foo have already been done: should it still be displayed or not?

5) Are rs# values unique in foo? In bar?

6) Are [0-9]##### values unique in foo? In bar?

7) Or should only the pair (rs#, [0-9]#####) be considered the (unique) primary key?
# 9  
Old 10-11-2011
Quote:
Originally Posted by ctsgnb
What do you mean?


1) Do you want only the foo entries that have their rs# in bar?
Yes, this is the prime concern.

2) Do you want only the foo entries that have their [0-9]##### in bar?
I want the foo entries that match the [0-9]##### in bar.

3) Do you want the foo entries that have both their rs# AND their [0-9]##### in bar?
This is the perfect solution.

4) What if an entry is in bar but not in foo, and all the replacements in foo have already been done: should it still be displayed or not?
No. Entries that are unique to bar need not be written to the output. The output (foobar) should contain only the original foo entries, with the columns that would hold bar entries left blank,
e.g.:
Code:
1       rs6603791       0       1490804 G       A

5) Are rs# values unique in foo? In bar?
rs# values are unique in foo and in bar.

6) Are [0-9]##### values unique in foo? In bar?
These are not unique in foo or bar. They are unique in foo when combined with col1.

7) Or should only the pair (rs#, [0-9]#####) be considered the (unique) primary key?
Yes, treating the pair as the primary key is the preferred way.
If you can explain your solution, I would be deeply obliged!
Thanks
~GH
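
Taken together, the answers above describe a left outer join: every foo line is kept, and the bar column is appended when the rs# matches and left blank otherwise. A minimal sketch of that idea follows. The file layouts are assumptions made for illustration (tab-separated; bar.txt holding rs#, position and one extra value; foo.txt holding chromosome, rs#, 0, position and two alleles), and the sample rows other than the rs6603791 line are invented.

```shell
# Hypothetical sample data; the real column layouts are assumptions.
tmp=$(mktemp -d)
printf '1\trs6603791\t0\t1490804\tG\tA\n1\trs7519837\t0\t1500664\tT\tC\n' > "$tmp/foo.txt"
printf 'rs7519837\t1500664\t0.12\n' > "$tmp/bar.txt"

joined=$(awk -F'\t' -v OFS='\t' '
    # NR == FNR holds only while the first file (bar.txt) is being read.
    NR == FNR { extra[$1] = $3; next }
    # Second file (foo.txt): always print, with a blank column when rs# is unknown.
    { print $0, (($2 in extra) ? extra[$2] : "") }
' "$tmp/bar.txt" "$tmp/foo.txt")

printf '%s\n' "$joined"
rm -rf "$tmp"
```

NR counts records across all input files while FNR restarts at 1 for each file, so the two are equal only during the first file. The test misfires if the first file is empty, because the second file then also starts with NR equal to FNR.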
# 10  
Old 10-12-2011
Quote:
Originally Posted by genehunter
The example is a few lines from the real data, so it should have worked fine. Can you redo your solution so that it uses the rs### column as the index to look up the corresponding rs# in the other file and write the corresponding column?
Code:
awk 'NR==FNR{x=$1;sub($1,y);a[x]=$0;next}a[$2]{$0=$0 a[$2]}1' OFS=\\t  bar.txt foo.txt
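
Since an explanation was requested, here is the same logic unpacked into a commented script. This is a sketch rather than the poster's exact code: it tests membership with `in` instead of using `a[$2]` as the pattern, and the sample files are invented (bar.txt is assumed to hold rs#, position and one value; foo.txt is assumed to hold chromosome, rs#, 0, position and two alleles).

```shell
tmp=$(mktemp -d)
# Hypothetical data; the column layouts are assumptions.
printf 'rs7519837 1500664 0.12\n' > "$tmp/bar.txt"
printf '1 rs6603791 0 1490804 G A\n1 rs7519837 0 1500664 T C\n' > "$tmp/foo.txt"

result=$(awk '
    NR == FNR {            # true only while reading the first file (bar.txt)
        key = $1           # rs# is the lookup key
        sub($1, "")        # strip the key text from the line; the leading separator stays
        rest[key] = $0     # store the remaining bar columns under the key
        next               # skip the foo.txt rules below
    }
    ($2 in rest) { $0 = $0 rest[$2] }   # foo.txt: append the bar columns when rs# matches
    1                      # print every foo.txt line, matched or not
' "$tmp/bar.txt" "$tmp/foo.txt")

printf '%s\n' "$result"
rm -rf "$tmp"
```

Using `a[$2]` as a pattern creates an empty array entry for every foo key that is looked up, which `($2 in rest)` avoids. Note also that `sub($1, ...)` treats the first field as a regular expression, which is harmless for plain rs identifiers but could misbehave on keys containing characters such as `.` or `*`. As far as I can tell, the `OFS=\\t` in the one-liner has no visible effect, because the line is rebuilt by assigning to `$0` rather than to an individual field.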

# 11  
Old 10-12-2011
Sorry, but a few additional questions:

Which foo entries do NOT need to be displayed:

1. Any foo entry whose rs# is NOT in bar?
2. Any foo entry whose [0-9]# is NOT in bar?
3. Any foo entry whose rs# OR [0-9]# is NOT in bar?
4. Any foo entry whose rs# AND [0-9]# are NOT in bar?

Thanks in advance

If you want case number 4, here you go:
Code:
nawk 'NR==FNR{i=$1$2;sub($1,z);a[i]=$0;next}(($2$4) in a){print $0 a[$2$4]}' bar.txt foo.txt
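
The one-liner above builds its key by concatenating two fields directly (`$1$2`), which could in principle make two different pairs collide. A variant that keys on the pair through awk's comma subscript (fields joined with SUBSEP) is sketched below. The sample files and their layouts are assumptions for illustration: bar.txt holds rs#, position and one value; foo.txt holds chromosome, rs#, 0, position and two alleles.

```shell
tmp=$(mktemp -d)
# Hypothetical data; rs6603791 appears in bar.txt with a different position on purpose.
printf 'rs7519837 1500664 0.12\nrs6603791 9999999 0.30\n' > "$tmp/bar.txt"
printf '1 rs6603791 0 1490804 G A\n1 rs7519837 0 1500664 T C\n' > "$tmp/foo.txt"

pairs=$(awk '
    NR == FNR { seen[$1, $2] = $3; next }          # bar.txt: key on (rs#, position)
    (($2, $4) in seen) { print $0, seen[$2, $4] }  # foo.txt: print only when both match
' "$tmp/bar.txt" "$tmp/foo.txt")

printf '%s\n' "$pairs"
rm -rf "$tmp"
```

Only the rs7519837 line is printed here, because the rs6603791 entry matches on rs# but not on position.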


Last edited by ctsgnb; 10-12-2011 at 08:35 AM..
# 12  
Old 10-13-2011
Quote:
Originally Posted by ctsgnb
Sorry, but a few additional questions:

Which foo entries do NOT need to be displayed:

1. Any foo entry whose rs# is NOT in bar?
Unsure, Very unlikely. How to test this?

2. Any foo entry whose [0-9]# is NOT in bar?
Unsure, Very unlikely. How to test this?

3. Any foo entry whose rs# OR [0-9]# is NOT in bar?
Unsure, Very unlikely. How to test this?

4. Any foo entry whose rs# AND [0-9]# are NOT in bar?
Unsure, Very unlikely. How to test this?

Thanks in advance

If you want case number 4, here you go:
Code:
nawk 'NR==FNR{i=$1$2;sub($1,z);a[i]=$0;next}(($2$4) in a){print $0 a[$2$4]}' bar.txt foo.txt

# 13  
Old 10-14-2011
Code:
Unsure, Very unlikely. How to test this?

If you are unsure about what you expect, I am afraid I can't help much.
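
That said, the four cases listed earlier can be checked mechanically by labelling each foo entry according to which of its keys occur in bar. A minimal sketch, assuming bar.txt holds rs# in column 1 and the position in column 2, and foo.txt holds rs# in column 2 and the position in column 4; the sample rows and the status labels are invented for illustration.

```shell
tmp=$(mktemp -d)
# Hypothetical data covering three of the four possible outcomes.
printf 'rs7519837 1500664 0.12\nrs6603791 9999999 0.30\n' > "$tmp/bar.txt"
printf '1 rs6603791 0 1490804 G A\n1 rs7519837 0 1500664 T C\n1 rs0000001 0 1234567 A G\n' > "$tmp/foo.txt"

report=$(awk '
    NR == FNR { rsid[$1]; pos[$2]; next }   # bar.txt: record every rs# and every position
    {
        has_rs  = ($2 in rsid)              # is the rs# of this foo entry known to bar?
        has_pos = ($4 in pos)               # is its position known to bar?
        if (has_rs && has_pos)  status = "both"
        else if (has_rs)        status = "rs_only"
        else if (has_pos)       status = "pos_only"
        else                    status = "neither"
        print $2, $4, status
    }
' "$tmp/bar.txt" "$tmp/foo.txt")

printf '%s\n' "$report"
rm -rf "$tmp"
```

Entries labelled `pos_only` or `neither` fall under case 1, `rs_only` or `neither` under case 2, anything other than `both` under case 3, and `neither` alone under case 4. Piping the report through `awk '{print $3}' | sort | uniq -c` would give a count per label. Note that `both` only means each key occurs somewhere in bar, not necessarily on the same bar line.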