Join not working


 
# 1  
Old 08-17-2014

Hi all,
I'm trying to use the join command to merge two files, but it's not finding lots of the matches.
I have three files in total:
Code:
File A:
31_77
34_46
72_61
85_10
85_23
110_33
144_45
154_25
154_90
170_5
170_44
217_63
255_19
333_20
333_23
333_32

File B:
31_77    0    31    77    31_77    1    0.0856171    -1.02857
34_46    0    34    46    34_46    2    0.089418    -1.0079
72_61    0    72    61    72_61    3    0.084617    -1.0341
85_10    0    85    10    85_10    4    0.085417    -1.0297
85_23    0    85    23    85_23    5    0.086617    -1.023
110_33    0    110    33    110_33    6    0.089218    -1.009
144_45    0    144    45    144_45    7    0.093019    -0.98903
170_5    0    170    5    170_5    8    0.082617    -1.0455
170_44    0    170    44    170_44    9    0.086217    -1.0252
255_19    0    255    19    255_19    10    0.095419    -0.97681
333_20    0    333    20    333_20    11    0.093019    -0.98903
333_23    0    333    23    333_23    12    0.079016    -1.0665
333_32    0    333    32    333_32    13    0.090618    -1.0015
419_8    0    419    8    419_8    14    0.097419    -0.96684
419_35    0    419    35    419_35    15    0.083617    -1.0398

File C:
31_77    0    31    77    31_77    1    0.0818164    -1.05009
110_33    0    110    33    110_33    2    0.088618    -1.0122
442_43    0    442    43    442_43    3    0.093819    -0.98493
442_76    0    442    76    442_76    4    0.093019    -0.98903
537_85    0    537    85    537_85    5    0.077816    -1.0738
559_32    0    559    32    559_32    6    0.11662    -0.87936
559_40    0    559    40    559_40    7    0.088218    -1.0143

When I do
Code:
join File_A File_B -a 1 >File_AB

I get:
Code:
31_77    0    31    77    31_77    1    0.0856171    -1.02857
34_46    0    34    46    34_46    2    0.089418    -1.0079
72_61    0    72    61    72_61    3    0.084617    -1.0341
85_10    0    85    10    85_10    4    0.085417    -1.0297
85_23    0    85    23    85_23    5    0.086617    -1.023
110_33    0    110    33    110_33    6    0.089218    -1.009
144_45    0    144    45    144_45    7    0.093019    -0.98903
154_25                            
154_90                            
170_5    0    170    5    170_5    8    0.082617    -1.0455
170_44    0    170    44    170_44    9    0.086217    -1.0252
217_63                            
255_19    0    255    19    255_19    10    0.095419    -0.97681

Which is what I was expecting.

However, when I do the same thing with File A and File C, it misses lots of matches and I get:
Code:
31_77    0    31    77    31_77    1    0.0818164    -1.05009
34_46
72_61
85_10
85_23
110_33
144_45
154_25
154_90

(I know most of the entries above have no match between File A and File C, but note 110_33: it is in both files and yet no match is reported. This is only the very top of these files. File A and File C should have around 9000 matches, yet the output from join lists only ~80, even though I can see matches that it is missing.)

I'm newish to Linux, but this has worked for me for ages with no problems. I've tried re-doing the lists, changing the order, and checking everything I could think of, but I'm getting nowhere. Does anyone have any suggestions as to what is going wrong? Or even if someone could verify that I'm not doing something wrong and that this is some unusual error, that would at least make me feel better!

Thanks

Last edited by vbe; 08-17-2014 at 02:18 PM.. Reason: code tags please
# 2  
Old 08-17-2014
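For join to work, the contents of both files need to be in standard sorted order. Your files are in numeric order (31, 34, 72, 85, 110, ...), but join expects the collating order that sort produces by default, where keys are compared character by character, so 110_33 comes before 31_77.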
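
A quick way to see whether a file is in the order join wants is sort -c, which only checks the order and reports the first line that is out of place. This is just a sketch: the sample keys are copied from your File A into a temporary directory, and setting LC_ALL=C is my assumption to keep the collation predictable (your locale may differ).

```shell
# Sketch: check whether a file is sorted the way join expects.
# Assumption: LC_ALL=C so that sort uses plain byte-order collation.
export LC_ALL=C

# A few keys from File A in the thread, kept in their original (numeric) order.
tmp=$(mktemp -d)
printf '%s\n' 31_77 34_46 72_61 85_23 110_33 144_45 > "$tmp/File_A"

# sort -c exits non-zero at the first out-of-order line; it does not change the file.
if sort -c -b -k1,1 "$tmp/File_A" 2>/dev/null; then
    status="sorted"
else
    status="not sorted"
fi
echo "File_A is $status for join"
```

If you drop the 2>/dev/null, sort also tells you which line is the first one out of order.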

From man join:

Quote:
When the default field delimiter characters are used, the files to be joined should be ordered in the collating sequence of sort(1),
using the -b option, on the fields on which they are to be joined, otherwise join may not report all field matches. When the field
delimiter characters are specified by the -t option, the collating sequence should be the same as sort(1) without the -b option.
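
Roughly, that means sorting both files on the join field first and then joining the sorted copies. A minimal sketch, assuming the default whitespace delimiter and using a shortened copy of your File A and File C in a temporary directory (the .sorted names and LC_ALL=C are my own choices, not something from your setup):

```shell
# Sketch: sort both inputs the same way, then join on the first field.
# Assumption: LC_ALL=C so that sort and join agree on the collating order.
export LC_ALL=C

# Shortened sample data from the thread, in the original (numeric) order.
tmp=$(mktemp -d)
printf '%s\n' 31_77 34_46 85_10 110_33 144_45 > "$tmp/File_A"
printf '%s\n' \
    '31_77 0 31 77 31_77 1 0.0818164 -1.05009' \
    '110_33 0 110 33 110_33 2 0.088618 -1.0122' \
    '442_43 0 442 43 442_43 3 0.093819 -0.98493' > "$tmp/File_C"

# Sort each file on field 1, ignoring leading blanks (-b), as the man page asks.
sort -b -k1,1 "$tmp/File_A" > "$tmp/File_A.sorted"
sort -b -k1,1 "$tmp/File_C" > "$tmp/File_C.sorted"

# -a 1 also prints the lines from File A that have no partner in File C.
join -a 1 "$tmp/File_A.sorted" "$tmp/File_C.sorted" > "$tmp/File_AC"
cat "$tmp/File_AC"
```

With the sorted copies, 110_33 gets paired with its line from File C instead of being printed on its own. The output comes out in sort order rather than numeric order, so re-sort it afterwards if you need the original order back.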
# 3  
Old 08-17-2014
Thanks, I thought I had tried that but now it seems to be working so I must have done something wrong!
 