Post 302879640 by ritakadm on Friday 13th of December 2013 12:58:09 PM
Join not working properly

I want to join two files, using column 3 of file 1 and column 1 of file 2 as the key.
The join command behaves erratically for some reason. File 2 is a master file containing all the names, and file 1 has some values. I want to add the names from file 2 to file 1. If I use the original master file, some names are missing from the output.

For example, Medtr1g004990 exists in the master file, but its name does not appear in the output.

However, if I use a truncated master file that has only 6 records, including Medtr1g004990, the output seems to be correct.

I have also tried sorted files, with the same problem. Please help me solve this. If join doesn't work, please let me know whether a similar command would.

I have attached the original master file and pasted the truncated one below.



File 1
Code:
# more Medtr1g006600.exp
XLOC_000005     XLOC_000005     Medtr1g004990   chr1:35909-40554        q1      q2      OK      0.520378        6.91484 3.73206 6.85797 5e-05   0.000126299     yes     down
XLOC_000006     XLOC_000006     Medtr1g006490   chr1:44429-46280        q1      q2      OK      16.1083 122.606 2.92814 10.2969 5e-05   0.000126299     yes     down
XLOC_000008     XLOC_000008     Medtr1g006600   chr1:51360-54977        q1      q2      OK      6.94505 3.84361 -0.853525       -2.49824        0.0001  0.000244358     yes     up
XLOC_000010     XLOC_000010     Medtr1g006660   chr1:70777-71741        q1      q2      OK      1.15476 2.47771 1.10142 2.07776 0.0045  0.00841718      yes     down
XLOC_000014     XLOC_000014     Medtr1g006975   chr1:129007-136403      q1      q2      OK      0.389401        0.166262        -1.2278 -2.00092        0.0017  0.00343409      yes     up

File 2 (truncated from attached file)
Code:
# more Medtr1g006600.annot
Medtr1g004990   casein kinase
Medtr1g006490   major intrinsic protein %28MIP%29 family transporter
Medtr1g006590   tonoplast intrinsic protein
Medtr1g006600   exostosin family protein
Medtr1g006605   hypothetical protein
Medtr1g006660   AP2 domain class transcription factor

Command and output with master file

Code:
# join -a1 -1 3 -2 1  Medtr1g006600.exp mt4.genenames.txt
Medtr1g004990 XLOC_000005 XLOC_000005 chr1:35909-40554 q1 q2 OK 0.520378 6.91484 3.73206 6.85797 5e-05 0.000126299 yes down
Medtr1g006490 XLOC_000006 XLOC_000006 chr1:44429-46280 q1 q2 OK 16.1083 122.606 2.92814 10.2969 5e-05 0.000126299 yes down
Medtr1g006600 XLOC_000008 XLOC_000008 chr1:51360-54977 q1 q2 OK 6.94505 3.84361 -0.853525 -2.49824 0.0001 0.000244358 yes up exostosin family protein
Medtr1g006660 XLOC_000010 XLOC_000010 chr1:70777-71741 q1 q2 OK 1.15476 2.47771 1.10142 2.07776 0.0045 0.00841718 yes down AP2 domain class transcription factor
Medtr1g006975 XLOC_000014 XLOC_000014 chr1:129007-136403 q1 q2 OK 0.389401 0.166262 -1.2278 -2.00092 0.0017 0.00343409 yes up disease resistance protein %28CC-NBS-LRR class%29 family protein
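
The names missing for Medtr1g004990 and Medtr1g006490 above would be consistent with join's requirement that both inputs be sorted on the join field in the same collation order, though the attached master file has not been checked here. A minimal sketch of sorting both files the same way before joining follows. It assumes the files are tab-delimited, and the file names and four-column sample rows are hypothetical stand-ins for Medtr1g006600.exp and mt4.genenames.txt.

```shell
#!/bin/sh
# Sketch only: sort both inputs on their join field under one collation
# (LC_ALL=C), then join. Assumes tab-delimited input; sample data is a
# shortened, hypothetical stand-in for the files in the post.
tmp=$(mktemp -d)
tab=$(printf '\t')

# File 1 stand-in: the key is column 3 (rows deliberately out of order).
printf 'XLOC_000008\tXLOC_000008\tMedtr1g006600\tup\n'   >  "$tmp/exp"
printf 'XLOC_000005\tXLOC_000005\tMedtr1g004990\tdown\n' >> "$tmp/exp"

# File 2 stand-in: the key is column 1 (rows deliberately out of order).
printf 'Medtr1g006600\texostosin family protein\n' >  "$tmp/names"
printf 'Medtr1g004990\tcasein kinase\n'            >> "$tmp/names"

# Sort each file on its own join field, using the same locale for both.
LC_ALL=C sort -t "$tab" -k3,3 "$tmp/exp"   > "$tmp/exp.sorted"
LC_ALL=C sort -t "$tab" -k1,1 "$tmp/names" > "$tmp/names.sorted"

# Join with an explicit tab separator so multi-word names stay in one field.
joined=$(LC_ALL=C join -t "$tab" -a1 -1 3 -2 1 \
    "$tmp/exp.sorted" "$tmp/names.sorted")

printf '%s\n' "$joined"
rm -rf "$tmp"
```

Running sort and join under the same LC_ALL value matters because join compares keys using the current locale's collation, so a file sorted in one locale can look unsorted in another.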

Command and output with truncated file

Code:
# join -a1 -1 3 -2 1  Medtr1g006600.exp Medtr1g006600.annot 
Medtr1g004990 XLOC_000005 XLOC_000005 chr1:35909-40554 q1 q2 OK 0.520378 6.91484 3.73206 6.85797 5e-05 0.000126299 yes down casein kinase
Medtr1g006490 XLOC_000006 XLOC_000006 chr1:44429-46280 q1 q2 OK 16.1083 122.606 2.92814 10.2969 5e-05 0.000126299 yes down major intrinsic protein %28MIP%29 family transporter
Medtr1g006600 XLOC_000008 XLOC_000008 chr1:51360-54977 q1 q2 OK 6.94505 3.84361 -0.853525 -2.49824 0.0001 0.000244358 yes up exostosin family protein
Medtr1g006660 XLOC_000010 XLOC_000010 chr1:70777-71741 q1 q2 OK 1.15476 2.47771 1.10142 2.07776 0.0045 0.00841718 yes down AP2 domain class transcription factor
Medtr1g006975 XLOC_000014 XLOC_000014 chr1:129007-136403 q1 q2 OK 0.389401 0.166262 -1.2278 -2.00092 0.0017 0.00343409 yes up
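
As for a similar command that does not depend on sort order, one possibility is an awk lookup table. The sketch below assumes both files are tab-delimited. The file names are hypothetical, and the sample rows are taken from the post but cut down to four columns. Unlike join, it keeps the columns of file 1 in their original order and appends the name at the end.

```shell
#!/bin/sh
# Sketch only: annotate file 1 with names from file 2 using an awk
# associative array, so neither file needs to be sorted.
# Assumes tab-delimited input; file names here are hypothetical.
tmp=$(mktemp -d)
exp="$tmp/sample.exp"        # stand-in for Medtr1g006600.exp
names="$tmp/sample.names"    # stand-in for mt4.genenames.txt

printf 'XLOC_000005\tXLOC_000005\tMedtr1g004990\tchr1:35909-40554\n'   >  "$exp"
printf 'XLOC_000014\tXLOC_000014\tMedtr1g006975\tchr1:129007-136403\n' >> "$exp"

# Deliberately unsorted master file.
printf 'Medtr1g006600\texostosin family protein\n' >  "$names"
printf 'Medtr1g004990\tcasein kinase\n'            >> "$names"

# First file (NR == FNR): store key -> name from the master file.
# Second file: print each line followed by its name, or an empty
# field when the key has no entry (similar in spirit to join -a1).
result=$(awk -F'\t' -v OFS='\t' '
    NR == FNR { name[$1] = $2; next }
    { print $0, (($3 in name) ? name[$3] : "") }
' "$names" "$exp")

printf '%s\n' "$result"
rm -rf "$tmp"
```

The whole master file is held in memory, which is usually acceptable for a gene-name list but is a trade-off compared with join's streaming approach.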

 
