join not working


 
# 1  
Old 05-07-2008

I was trying to merge the following two example files using their first field:

join -1 1 -2 1 file1 file2

but nothing is produced. The expected result should be:

rs1005152 7 q21.3 3

It appears that the length of the first field in file1 is causing the problem. Any suggestions on how to use join to produce the desired result? I know there are other ways to get the result, but I have to use join because the real files are huge.

Thanks in advance.


file1:
rs10051507 5 q21.3
rs10051514 5 p15.32
rs10051527 5 q21.2
rs1005152 7 q21.3
rs10051540 5 q21.3
rs10051548 5 q21.1
rs1005155 X q27.3
rs10051594 5 q34

file2:
rs1003456 3
rs1005152 3
# 2  
Old 05-07-2008
Hi.

The files should be sorted on the join field:
Quote:
file1 and file2 must be sorted in increasing collating
sequence as determined by LC_COLLATE on the fields on which
they are to be joined, normally the first in each line (see
sort(1)).
--excerpt from Solaris man join, q.v.
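A quick way to test that requirement is `sort -c`, which only checks the order and reports the first out-of-place line. The sketch below is an illustration, not part of the man page: it assumes the sample data from post #1 saved as `file1` in a scratch directory, and it assumes the C locale is the collation you intend to join under.

```shell
#!/bin/sh
# Sketch: check whether a file is sorted on its first field.
# Assumption: file1 holds the sample data from post #1.
workdir=$(mktemp -d) && cd "$workdir" || exit 1

cat > file1 <<'EOF'
rs10051507 5 q21.3
rs10051514 5 p15.32
rs10051527 5 q21.2
rs1005152 7 q21.3
rs10051540 5 q21.3
rs10051548 5 q21.1
rs1005155 X q27.3
rs10051594 5 q34
EOF

# sort -c prints nothing when the order is fine; otherwise it reports the
# first line that is out of order on stderr and exits non-zero.
if LC_ALL=C sort -c -k1,1 file1 2>check.err; then
    status=0
else
    status=$?
fi

echo "exit status: $status"
cat check.err
```

A non-zero status here means the file is not in the order that a C-locale comparison of the first field expects, even though it may look sorted under another locale.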
Best wishes ... cheers, drl
# 3  
Old 05-07-2008
Hi drl,

Both file1 and file2 have been sorted. Sorry for not making it clear. I also tried reverse sorting without success. Please help!
# 4  
Old 05-08-2008
Hi.

Here is a script that produces results for files not sorted, then sorted:
Code:
#!/usr/bin/env sh

# @(#) s1       Demonstrate join.

#  ____
# /
# |   Infrastructure BEGIN

echo
set -o nounset

debug=":"
debug="echo"

## The shebang using "env" line is designed for portability. For
#  higher security, use:
#
#  #!/bin/sh -

## Use local command version for the commands in this demonstration.

set +o nounset
LC_ALL=C ; LANG=C ; export LC_ALL LANG
echo "Environment: LC_ALL = $LC_ALL, LANG = $LANG"
echo "(Versions displayed with local utility \"version\")"
version >/dev/null 2>&1 && version =o $(_eat $0 $1) sort join
set -o nounset

echo

FILE1=data1
echo " Input file $FILE1:"
cat $FILE1

echo
FILE2=data2
echo " Input file $FILE2:"
cat $FILE2

echo
echo " Results expected:"
cat expected-results

# |   Infrastructure END
# \
#  ---

echo
echo " Results from processing without a sort:"
join $FILE1 $FILE2

echo
echo " Results from processing after sorting files:"
sort $FILE1 >t1
sort $FILE2 >t2
join t1 t2

exit 0

Yielding:
Code:
% ./s1

Environment: LC_ALL = C, LANG = C
(Versions displayed with local utility "version")
Linux 2.6.11-x1
GNU bash, version 2.05b.0(1)-release (i386-pc-linux-gnu)
sort (coreutils) 5.2.1
join (coreutils) 5.2.1

 Input file data1:
rs10051507 5 q21.3
rs10051514 5 p15.32
rs10051527 5 q21.2
rs1005152 7 q21.3
rs10051540 5 q21.3
rs10051548 5 q21.1
rs1005155 X q27.3
rs10051594 5 q34

 Input file data2:
rs1003456 3
rs1005152 3

 Results expected:
rs1005152 7 q21.3 3

 Results from processing without a sort:

 Results from processing after sorting files:
rs1005152 7 q21.3 3

cheers, drl
# 5  
Old 05-08-2008
drl,

That's odd. I got the expected result when I ran your whole script. However, when I ran each step of the script by hand, join still did not work. Here is what I did:

$ cat data1
rs10051507 5 q21.3
rs10051514 5 p15.32
rs10051527 5 q21.2
rs1005152 7 q21.3
rs10051540 5 q21.3
rs10051548 5 q21.1
rs1005155 X q27.3
rs10051594 5 q34

$ cat data2
rs1003456 3
rs1005152 3

$ sort data1 >t1

$ sort data2>t2

$ join t1 t2

Why is that?
# 6  
Old 05-08-2008
Hi.

The first thing that comes to mind is the locale. What do you get from:
Code:
echo " LC_ALL = $LC_ALL"
echo " LANG   = $LANG"

I get the expected results when the variables LC_ALL and LANG are set to "C", or not set at all. I did not try other locales.
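This may also explain why the script works while the same commands typed by hand do not: the script exports LC_ALL=C before calling sort and join, whereas an interactive shell keeps whatever locale it already has. A minimal sketch of the by-hand steps with the locale pinned, assuming the sample data from post #1 and a scratch directory (the names `t1` and `t2` follow the script above):

```shell
#!/bin/sh
# Sketch: sort and join under the same collation (C locale).
# Assumption: data1 and data2 hold the sample data from post #1.
workdir=$(mktemp -d) && cd "$workdir" || exit 1

cat > data1 <<'EOF'
rs10051507 5 q21.3
rs10051514 5 p15.32
rs10051527 5 q21.2
rs1005152 7 q21.3
rs10051540 5 q21.3
rs10051548 5 q21.1
rs1005155 X q27.3
rs10051594 5 q34
EOF

cat > data2 <<'EOF'
rs1003456 3
rs1005152 3
EOF

# Sort each file on the join field only, byte by byte (C locale).
LC_ALL=C sort -k1,1 data1 > t1
LC_ALL=C sort -k1,1 data2 > t2

# Join with the same locale so both commands agree on the ordering.
result=$(LC_ALL=C join t1 t2)
echo "$result"
```

The point of the sketch is only that sort and join should run with the same locale setting. Under a locale such as en_US.UTF-8, sort may order whole lines differently from what join expects on the first field, which would be consistent with the empty output.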

The results on Linux and Solaris were identical. On Solaris, I was successful using sh, bash and ksh. On Linux, sh is processed by bash.

What version of UNIX are you using?

What shell are you using for scripting and interactively? ... cheers, drl
# 7  
Old 05-08-2008
drl:

Thanks a lot for your help and patience so far. The answers to your questions are as follows:

[chenz@beta binned]$ echo " LC_ALL = $LC_ALL"
LC_ALL =

[chenz@beta binned]$ echo " LANG = $LANG"
LANG = en_US.UTF-8

[chenz@beta binned]$ uname -r
2.6.18-53.1.14.el5

[chenz@beta binned]$ echo $SHELL
/bin/bash

Will these shed some light on the problem I have?

Best regards
 