The UNIX and Linux Forums  

  #1 (permalink)  
Old 05-07-2008
Registered User
 

Join Date: May 2008
Posts: 7
join not working

I was trying to merge the following two example files using their first field:

join -1 1 -2 1 file1 file2

but nothing is produced. The expected result should be:

rs1005152 7 q21.3 3

It appears that the length of the first field in file1 is causing the problem. Any suggestions on how to use join to produce the desired result? I know there are other ways to get the result, but I have to use join because the real files are huge.

Thanks in advance.


file1:
rs10051507 5 q21.3
rs10051514 5 p15.32
rs10051527 5 q21.2
rs1005152 7 q21.3
rs10051540 5 q21.3
rs10051548 5 q21.1
rs1005155 X q27.3
rs10051594 5 q34

file2:
rs1003456 3
rs1005152 3
  #2 (permalink)  
Old 05-07-2008
drl
Registered User
 

Join Date: Apr 2007
Location: Saint Paul, MN USA / BSD, CentOS, Debian, OS X, Solaris
Posts: 433
Hi.

The files should be sorted on the join field:
Quote:
file1 and file2 must be sorted in increasing collating
sequence as determined by LC_COLLATE on the fields on which
they are to be joined, normally the first in each line (see
sort(1)).
--excerpt from Solaris man join, q.v.
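A minimal sketch of what the man page excerpt asks for, assuming a POSIX-style sort and join plus mktemp (widely available, but not required by POSIX). The temporary directory and the `.sorted` file names are hypothetical; the data is the example posted above. Both files are sorted on field 1 under one fixed locale and then joined:

```shell
#!/bin/sh
# Sketch: sort both inputs on the join field under one fixed locale, then join.
# The temporary directory and the *.sorted names are hypothetical.
LC_ALL=C
export LC_ALL

tmpdir=$(mktemp -d)

printf '%s\n' \
    'rs10051507 5 q21.3' 'rs10051514 5 p15.32' 'rs10051527 5 q21.2' \
    'rs1005152 7 q21.3' 'rs10051540 5 q21.3' 'rs10051548 5 q21.1' \
    'rs1005155 X q27.3' 'rs10051594 5 q34' > "$tmpdir/file1"
printf '%s\n' 'rs1003456 3' 'rs1005152 3' > "$tmpdir/file2"

# -k 1,1 restricts the sort key to field 1, the field join compares.
sort -k 1,1 "$tmpdir/file1" > "$tmpdir/file1.sorted"
sort -k 1,1 "$tmpdir/file2" > "$tmpdir/file2.sorted"

result=$(join -1 1 -2 1 "$tmpdir/file1.sorted" "$tmpdir/file2.sorted")
echo "$result"

rm -rf "$tmpdir"
```

Exporting LC_ALL once, before either command runs, is a simple way to make sure sort and join use the same collating sequence.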
Best wishes ... cheers, drl
  #3 (permalink)  
Old 05-07-2008
Registered User
 

Join Date: May 2008
Posts: 7
Hi drl,

Both file1 and file2 have been sorted. Sorry for not making it clear. I also tried reverse sorting without success. Please help!
  #4 (permalink)  
Old 05-07-2008
drl
Registered User
 

Join Date: Apr 2007
Location: Saint Paul, MN USA / BSD, CentOS, Debian, OS X, Solaris
Posts: 433
Hi.

Here is a script that runs join on the files first as posted (not sorted), then after sorting them:
Code:
#!/usr/bin/env sh

# @(#) s1       Demonstrate join.

#  ____
# /
# |   Infrastructure BEGIN

echo
set -o nounset

debug=":"
debug="echo"

## The shebang using "env" line is designed for portability. For
#  higher security, use:
#
#  #!/bin/sh -

## Use local command version for the commands in this demonstration.

set +o nounset
LC_ALL=C ; LANG=C ; export LC_ALL LANG
echo "Environment: LC_ALL = $LC_ALL, LANG = $LANG"
echo "(Versions displayed with local utility \"version\")"
version >/dev/null 2>&1 && version =o $(_eat $0 $1) sort join
set -o nounset

echo

FILE1=data1
echo " Input file $FILE1:"
cat $FILE1

echo
FILE2=data2
echo " Input file $FILE2:"
cat $FILE2

echo
echo " Results expected:"
cat expected-results

# |   Infrastructure END
# \
#  ---

echo
echo " Results from processing without a sort:"
join $FILE1 $FILE2

echo
echo " Results from processing after sorting files:"
sort $FILE1 >t1
sort $FILE2 >t2
join t1 t2

exit 0
Yielding:
Code:
% ./s1

Environment: LC_ALL = C, LANG = C
(Versions displayed with local utility "version")
Linux 2.6.11-x1
GNU bash, version 2.05b.0(1)-release (i386-pc-linux-gnu)
sort (coreutils) 5.2.1
join (coreutils) 5.2.1

 Input file data1:
rs10051507 5 q21.3
rs10051514 5 p15.32
rs10051527 5 q21.2
rs1005152 7 q21.3
rs10051540 5 q21.3
rs10051548 5 q21.1
rs1005155 X q27.3
rs10051594 5 q34

 Input file data2:
rs1003456 3
rs1005152 3

 Results expected:
rs1005152 7 q21.3 3

 Results from processing without a sort:

 Results from processing after sorting files:
rs1005152 7 q21.3 3
cheers, drl
  #5 (permalink)  
Old 05-08-2008
Registered User
 

Join Date: May 2008
Posts: 7
drl,

That's odd. I got the expected result when I ran your whole script. However, when I ran each step of the script by hand, join still produced no output. Here is what I did:

$ cat data1
rs10051507 5 q21.3
rs10051514 5 p15.32
rs10051527 5 q21.2
rs1005152 7 q21.3
rs10051540 5 q21.3
rs10051548 5 q21.1
rs1005155 X q27.3
rs10051594 5 q34

$ cat data2
rs1003456 3
rs1005152 3

$ sort data1 >t1

$ sort data2>t2

$ join t1 t2

Why is that?
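
One possible explanation, which this thread does not confirm: the script sets LC_ALL=C before it calls sort and join, while an interactive shell usually keeps the user's own locale. In many non-C locales sort orders lines differently from plain byte order, so the sorted files can end up in an order that join does not accept. The sketch below assumes mktemp is available and uses hypothetical paths under a temporary directory. It checks the posted data1 order with `sort -c` in the C locale, then repeats the manual steps with the locale pinned for every command:

```shell
#!/bin/sh
# Sketch: diagnose the sort order, then rerun the manual steps with the locale pinned.
# Paths under $tmpdir are hypothetical; the data is copied from the thread.
tmpdir=$(mktemp -d)

cat > "$tmpdir/data1" <<'EOF'
rs10051507 5 q21.3
rs10051514 5 p15.32
rs10051527 5 q21.2
rs1005152 7 q21.3
rs10051540 5 q21.3
rs10051548 5 q21.1
rs1005155 X q27.3
rs10051594 5 q34
EOF

cat > "$tmpdir/data2" <<'EOF'
rs1003456 3
rs1005152 3
EOF

# Show the locale settings the current shell is using.
echo "LC_ALL=${LC_ALL:-unset} LANG=${LANG:-unset}"

# sort -c only checks order. In the C locale the posted data1 fails the check
# on field 1: "rs1005152" is a prefix of "rs10051527" and must sort before it.
if LC_ALL=C sort -c -k 1,1 "$tmpdir/data1" 2>/dev/null; then
    posted_order=sorted
else
    posted_order=unsorted
fi

# Pin the same locale for sort and join so both agree on the order.
LC_ALL=C sort -k 1,1 "$tmpdir/data1" > "$tmpdir/t1"
LC_ALL=C sort -k 1,1 "$tmpdir/data2" > "$tmpdir/t2"
joined=$(LC_ALL=C join "$tmpdir/t1" "$tmpdir/t2")

echo "posted data1 in C locale: $posted_order"
echo "join result: $joined"

rm -rf "$tmpdir"
```

If the locale is the cause, prefixing each command with LC_ALL=C (or exporting it once for the session) should make the by-hand steps behave like the script.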