Join - files of different sizes


 
# 1  
Old 01-23-2008

I am trying to join two files of dramatically different sizes (file 1: 1 column, 9000 rows; file 2: 13 columns, 26 million rows). I can't seem to get join to work. I have checked that there are matches, and I have tried subsamples that work. The -a filenum flag lists all the rows, so join is indeed doing something. Is there a reason that disproportionately sized files will not join?
Cheers!
# 2  
Old 01-23-2008
If both files are sorted in the same collating sequence and the same direction for the key fields, you do get output regardless of a disparity in size.

If an item in file1 has no match in file2, you get no output line for that item.

Is there a possibility of producing Cartesian products in the result set?
What status code does join return?
Code:
join file1 file2 
......
echo $?
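
The sort requirement above can be sketched as follows. This is only an illustration on made-up data, not a confirmed diagnosis of the original problem: the sample keys, the temporary directory, and the `.sorted` file names are assumptions, and `mktemp -d` is assumed to be available. join expects its inputs in lexical (collating) order on the key field, so a file sorted numerically (9, 10, 100) is not in the order join expects. Re-sorting both files under the same locale before joining avoids that mismatch.

```shell
# Hypothetical sample data: keys in numeric order (9, 10, 100).
tmp=$(mktemp -d)
printf '%s\n' 9 10 100 > "$tmp/file1"
printf '%s\n' '9 a' '10 b' '100 c' > "$tmp/file2"

# Re-sort both files lexically on field 1, using the same locale for both,
# so that they share one collating sequence.
LC_ALL=C sort -k1,1 "$tmp/file1" > "$tmp/file1.sorted"
LC_ALL=C sort -k1,1 "$tmp/file2" > "$tmp/file2.sorted"

# Join on field 1 of each file, under the same locale used for sorting.
joined=$(LC_ALL=C join -1 1 -2 1 "$tmp/file1.sorted" "$tmp/file2.sorted")
printf '%s\n' "$joined"

rm -rf "$tmp"
```

In the C locale the keys come out in the order 10, 100, 9, which is the order join then sees in both inputs.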

# 3  
Old 01-23-2008

Right, so my files are sorted in numerical order on the field that is being joined (field 1). The exit status of the join is 0. That is:
Code:
$ join -1 1 -2 1 File1 File2
$ echo $?
0

It seems that the file size difference is responsible, but I can't explain why. Or perhaps there is some other reason.
Cheers!
# 4  
Old 01-27-2008
Bug join

For the join command to run successfully, make sure the field on which
you want to join the two files has the same length in both. For example,
if the field in the first file is 8 digits long, make sure the field in
the second file is 8 digits long too. Use %.8d for that: the field will
then have eight digits, and a record with a six-digit value such as
123456 will be read as 00123456, and likewise for other lengths. You can
then join the two files regardless of the number of records. Post an
example if you want more details. I hope I have made it clear.
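
The padding suggestion above could look something like this. It is a rough sketch under assumptions: the sample keys, the helper name `pad_key`, the `.padded` file names, and the use of awk to apply the `%.8d` format are all illustrative and not taken from the thread. Padding the key to a fixed width makes numeric order and lexical order agree, which is what lets join line the keys up.

```shell
# Hypothetical sample data with keys of different lengths.
tmp=$(mktemp -d)
printf '%s\n' 123456 12345678 > "$tmp/file1"
printf '%s\n' '123456 x' '12345678 y' > "$tmp/file2"

# pad_key (hypothetical helper): rewrite field 1 as eight digits with
# leading zeros, then sort lexically on that field.
pad_key() {
    awk '{ $1 = sprintf("%.8d", $1); print }' "$1" | LC_ALL=C sort -k1,1
}

pad_key "$tmp/file1" > "$tmp/file1.padded"
pad_key "$tmp/file2" > "$tmp/file2.padded"

# Both files now carry fixed-width keys in the same order.
padded=$(LC_ALL=C join "$tmp/file1.padded" "$tmp/file2.padded")
printf '%s\n' "$padded"

rm -rf "$tmp"
```

Note that awk rebuilds each line with single spaces between fields when field 1 is reassigned, so this sketch suits whitespace-delimited data; other delimiters would need `-F` and `OFS` set to match.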
# 5  
Old 01-27-2008
Hi.

If you are still having trouble, then post the smallest sample -- including data, commands, and results -- that fails for you so that we can provide specific help ... cheers, drl