Matching information from different files


Login or Register to Reply

 
Thread Tools Search this Thread
# 1  
Matching information from different files

I have a dataset with coordinates of certain points with their names.

file1:
Code:
AAA 5 7
BBB 7 2
CCC 4 4
DDD 4 4
EEE 8 9
FFF 9 9
GGG 4 9
HHH 32 1
III 7 6
JJJ 12 4
KKK 12 3
MMM 15 3

The letters represent names of points. In file1 they are unique.
I also have file2 with a dataset with some points that might be repeating unknown number of times, for which I need coordinates from file1 (coordinates are columns 2 and 3 in file1)
Not all points from file1 are in file2 and viceversa.

file 2:

Code:
AAA 1 -9.95 0.03 22.80 0.04 10.37 0.03
AAA 2 -9.95 0.03 22.80 0.04 10.37 0.03
AAA 3 -9.96 0.03 22.79 0.04 10.37 0.03
BBB 1 -40.34 0.04 2.67 0.04 48.91 0.04
BBB 1 -40.33 0.04 2.67 0.04 48.92 0.04
CCC 1 -19.37 0.04 16.77 0.04 18.46 0.03
DDD 1 -13.88 0.03 19.20 0.04 11.90 0.03
DDD 2 -13.88 0.03 19.20 0.04 11.89 0.03
DDD 3 -13.88 0.03 19.20 0.04 11.89 0.03
EEE 1 -15.97 0.03 -3.95 0.04 4.23 0.03
EEE 2 -15.97 0.03 -3.95 0.04 4.23 0.03
EEE 3 -15.97 0.03 -3.95 0.04 4.23 0.03
EEE 4 -15.97 0.03 -3.95 0.04 4.23 0.03
EEE 5 -15.97 0.03 -3.95 0.04 4.23 0.03
FFF 1 -39.11 0.04 -5.52 0.04 54.45 0.04
FFF 2 -39.11 0.04 -5.52 0.04 54.45 0.04
GGG 1 -21.55 0.03 -5.42 0.03 6.93 0.04
HHH 1 -14.83 0.03 0.52 0.04 -4.74 0.03
HHH 2 -14.83 0.03 0.52 0.04 -4.74 0.03
III 3 -14.84 0.03 0.52 0.04 -4.74 0.03
JJJ 1 -10.57 0.06 1.03 0.17 2.35 0.09
KKK 1 8.04 0.04 -11.54 0.04 -4.02 0.10
KKK 1 -17.40 0.04 -1.58 0.04 14.42 0.04
LLL 1-5.72 0.06 -1.54 0.06 13.93 0.04
LLL 2 -2.31 0.04 -3.73 0.05 12.45 0.04
LLL 3 -2.31 0.04 -3.73 0.05 12.45 0.04

wanted output:

Code:
AAA 1 -9.95 0.03 22.80 0.04 10.37 0.03 5 7
AAA 2 -9.95 0.03 22.80 0.04 10.37 0.03 5 7
AAA 3 -9.96 0.03 22.79 0.04 10.37 0.03 5 7
BBB 1 -40.34 0.04 2.67 0.04 48.91 0.04 7 2
BBB 1 -40.33 0.04 2.67 0.04 48.92 0.04 7 2
CCC 1 -19.37 0.04 16.77 0.04 18.46 0.03 4 4
DDD 1 -13.88 0.03 19.20 0.04 11.90 0.03 4 4
DDD 2 -13.88 0.03 19.20 0.04 11.89 0.03 4 4
DDD 3 -13.88 0.03 19.20 0.04 11.89 0.03 4 4
EEE 1 -15.97 0.03 -3.95 0.04 4.23 0.03 8 9
EEE 2 -15.97 0.03 -3.95 0.04 4.23 0.03 8 9
EEE 3 -15.97 0.03 -3.95 0.04 4.23 0.03 8 9
EEE 4 -15.97 0.03 -3.95 0.04 4.23 0.03 8 9
EEE 5 -15.97 0.03 -3.95 0.04 4.23 0.03 8 9
FFF 1 -39.11 0.04 -5.52 0.04 54.45 0.04 9 9
FFF 2 -39.11 0.04 -5.52 0.04 54.45 0.04 9 9
GGG 1 -21.55 0.03 -5.42 0.03 6.93 0.04 4 9
HHH 1 -14.83 0.03 0.52 0.04 -4.74 0.03 32 1
HHH 2 -14.83 0.03 0.52 0.04 -4.74 0.03 32 1
III 3 -14.84 0.03 0.52 0.04 -4.74 0.03 7 6
JJJ 1 -10.57 0.06 1.03 0.17 2.35 0.09 12 4
KKK 1 8.04 0.04 -11.54 0.04 -4.02 0.10 12 3
KKK 1 -17.40 0.04 -1.58 0.04 14.42 0.04 12 3


I want to add to points in file2 the coordinates from file1. Obviously for every instance of AAA in file2, the coordinates will be the same, however, the rest of the line for AAA is not necessarily the same.

Since the fields in file2 repeat I don't know how I would match and paste the values from columns 2 and 3 in file1 to appropriate rows of file2.
If they weren't repeating I'd try with awk something like
Code:
awk '{k=$1} NR==FNR{a[k]; next} (k in a)' file1 file2

but that doesn't work here and I don't know how to start.

Moderator's Comments:
Mod Comment edit by bakunin: sorry for editing instead of answering the post. I have it restored now. My bad.

Last edited by bakunin; 4 Days Ago at 01:46 PM..
# 2  
Quote:
I'd try with awk something like
Code:
awk '{k=$1} NR==FNR{a[k]; next} (k in a)' file1 file2

but that doesn't work here and I don't know how to start.
You may want to start reading the man page of the join command. As i see it it does exactly what you want. Note that join expects both input files to be sorted (in regards to the common key you want to use for the join-operation) prior to using them.

I hope this helps.

bakunin
This User Gave Thanks to bakunin For This Post:
# 3  
Or try this modification:
Code:
awk '{k=$1} NR==FNR{a[k]=$2 OFS $3; next} k in a {print $0, a[k]}' file1 file2

Login or Register to Reply

|
Thread Tools Search this Thread
Search this Thread:
Advanced Search

More UNIX and Linux Forum Topics You Might Find Helpful
Delete all files if another files in the same directory has a matching occurrence of a specific word
kridhick
he following are the files available in my directory RSK_123_20141113_031500.txt RSK_123_20141113_081500.txt RSK_126_20141113_041500.txt RSK_126_20141113_081800.txt RSK_128_20141113_091600.txt Here, "RSK" is file prefix and 123 is a code name and rest is just timestamp of the file when its...... Shell Programming and Scripting
7
Shell Programming and Scripting
search information in multiple files and save in new files
andreia
hi everyone, im stuck in here with shell :) can you help me?? i have a directory with alot files (genbank files ... all ended in .gbk ) more than 1000 for sure ... and i want to read each one of them and search for some information and if i found the right one i save in new file with new...... Shell Programming and Scripting
6
Shell Programming and Scripting
Matching two file contents and extracting associated information
newpro
Hi, I am new to shell programming and need help. I have File1 with some ID numbers and File2 with ID number and some associated information. I want to match the ID numbers from File1 to contents in File2 and output a third file which pulls out the ID numbers and the associated information with...... Shell Programming and Scripting
2
Shell Programming and Scripting
Delete all files if another files in the same directory has a matching occurence of a specific word
premier_de
Hello, I have several files in a specific directory. A specific string in one file can occur in another files. If this string is in other files. Then all the files in which this string occured should be deleted and only 1 file should remain with the string. Example. file1 ShortName "Blue...... Shell Programming and Scripting
2
Shell Programming and Scripting
compare two files and to remove the matching lines on both the files
shellscripter
I have two files and need to compare the two files and to remove the matching lines from both the files... Shell Programming and Scripting
4
Shell Programming and Scripting

Featured Tech Videos