Full Discussion: Problem with Join Command
Post 302979454 by Don Cragun on Monday 15th of August 2016 05:47:34 AM
Quote:
Originally Posted by Varshha
I posted the reply yesterday but I am not sure why it is not reflecting. So here it is again :

The files were originally tab delimited but I made them comma delimited to help me with the join command. I am now trying again with tab delimited files. I also tried some additions in my join command and this is what I gave :

Code:
join -t"  "  -a 2 -a 1 -e 'NULL' -o '0,1.1,1.2,2.1,2.2' File1 File2 | head -100

I am getting a result out of this which unfortunately means that UNIX is not finding a common key for the files to join and it is surprising because there ARE common values between the files. This is how the sample of the result looks like :

Code:
01635158332	09/09/2016 01635158332	09/09/2016 NULL NULL NULL
01635163349	11/24/2009 01635163349	11/24/2009 NULL NULL NULL
16.11	01635163339 NULL NULL 16.11	01635163339 NULL
16.11	01635163349 NULL NULL 16.11	01635163349 NULL


As you can see above, 01635163349 is a common key between File 1 that has dates and file 2 that has the cost. So ideally the result should be

Code:
01635163349  11/24/2009  16.11

The command
Code:
join -1 1 -2 1 File 1 File 2

does not give me any result as in no output on the console at all.

This is how file 1 looks:

Code:
00033492482     04/11/2006
00033492682     07/14/2009
00033492702     02/09/2010
00076848302     08/10/2010
00881123792     11/07/2000
01130162424     06/12/2007
01130164254     01/29/2008
01130165543     05/16/2011
01130168864     07/14/2009
01635163349     11/24/2009

File 2:

Code:
0.00    03139822826
0.00    49246820001
0.00    7621830148
0.00    822004599003
0.11    73379268872
0.64    67119603398
0.65    67261704102
16.11   01635163349


Can there be any other way to achieve an inner join between these files?
You have now shown us 3 different input file formats (tab separated fields, <space><comma><space> separated fields, and <space><comma> separated fields). You have shown us commands using <space>, <comma>, and <tab> as the field separator. And it isn't clear which separators are actually used in the files those commands are processing.

More importantly, you have said that your file names are File 1, File1, file 1, and file1. Since none of your commands have quoted the filenames being passed as arguments, many of them are asking various utilities to work on files named File or file and 1 and 2 (which presumably result in non-existent file diagnostics that you haven't shown us). The name of a file is case sensitive and having a <space> in a filename requires special handling in LOTS of ways that are being ignored in all of your command lines.
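To illustrate the quoting point, here is a minimal sketch. The scratch directory and the one-line sample files are made up for the demonstration, and it assumes your files really are named File 1 and File 2 (with a <space>) and are tab separated:

```shell
# Hypothetical scratch directory and sample data, for illustration only.
dir=$(mktemp -d)
printf 'a\t1\n' > "$dir/File 1"
printf 'a\t2\n' > "$dir/File 2"

# Unquoted, the shell would split the names into four operands
# (File, 1, File, 2), so join would never see the intended files:
#   join File 1 File 2

# Quoted, each name reaches join as a single operand.
tab=$(printf '\t')
quoted=$(cd "$dir" && join -t "$tab" "File 1" "File 2")
printf '%s\n' "$quoted"

rm -r "$dir"
```

The same quoting is needed for every utility that is given these names, not just join.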

Then, it is also important to understand that in an awk script, $0 is the contents of the current input line, $2 is the contents of the 2nd field in the current input line, and a command like:
Code:
awk 'NR=FNR{check[$0];next} $2 in check' File2 File1

is never going to work unless
Code:
File2

contains lines that exactly match the 2nd field of a line in File1 (which is not true for any of your sample input file pairs).
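Note also that, as written, NR=FNR with a single = is an assignment rather than a comparison, so that condition is true for every line and nothing is ever printed. A sketch of what might work instead follows. It assumes tab separated files with the key in field 1 of File1 and field 2 of File2, as in your latest samples; the array name date and the scratch files are only for illustration:

```shell
# Hypothetical sample data modelled on the files shown in this thread.
dir=$(mktemp -d)
printf '01130168864\t07/14/2009\n01635163349\t11/24/2009\n' > "$dir/File1"
printf '0.65\t67261704102\n16.11\t01635163349\n' > "$dir/File2"

# While reading File1 (NR == FNR), remember the date keyed by field 1.
# While reading File2, print key, date, and cost when field 2 is a known key.
matched=$(awk -F '\t' -v OFS='\t' '
    NR == FNR { date[$1] = $2; next }
    $2 in date { print $2, date[$2], $1 }
' "$dir/File1" "$dir/File2")
printf '%s\n' "$matched"

rm -r "$dir"
```

Unlike join, this approach does not need either file to be sorted, but it holds all of File1's keys in memory.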

And, the command line:
Code:
cat File2 | while read line; do  grep $line File1; done

will only work correctly if there are no <space> or <tab> characters on any line in File2 AND you are trying to find complete lines from File2 that match a subset of a line from File1.
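If the grep approach is what you want, one possible variation is to extract just the key field first and hand the keys to grep as fixed strings. This is only a sketch, assuming tab separated files with the key in field 2 of File2; the keys file name is made up, and note that grep -F matches the key anywhere on a line of File1, not only in the first field:

```shell
# Hypothetical sample data modelled on the files shown in this thread.
dir=$(mktemp -d)
printf '01130168864\t07/14/2009\n01635163349\t11/24/2009\n' > "$dir/File1"
printf '0.65\t67261704102\n16.11\t01635163349\n' > "$dir/File2"

# Keep only the key column of File2 (cut splits on <tab> by default).
cut -f 2 "$dir/File2" > "$dir/keys"

# -F treats each key as a fixed string; -f reads the list of keys from a file.
hits=$(grep -F -f "$dir/keys" "$dir/File1")
printf '%s\n' "$hits"

rm -r "$dir"
```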

And, the command line:
Code:
join file 2 file1

should give you a diagnostic similar to:
Code:
usage: join [-a file no | -v file no ] [-e string] [-1 field] [-2 field]
            [-o list] [-t char] file1 file2

not the no output that you say you get.
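For comparison, here is a sketch of a join invocation that might produce the inner join you described. It assumes tab separated files named File1 and File2 (no <space> in the names) with the key in field 1 of File1 and field 2 of File2, as in your latest samples. Your earlier join commands used field 1 of both files as the join field, which cannot match when File2's first field is the cost. The sorted file names are made up for the demonstration:

```shell
# Hypothetical sample data modelled on the files shown in this thread.
dir=$(mktemp -d)
tab=$(printf '\t')
printf '01635163349\t11/24/2009\n01130168864\t07/14/2009\n' > "$dir/File1"
printf '0.65\t67261704102\n16.11\t01635163349\n' > "$dir/File2"

# join expects both inputs sorted on the join field in the same collation.
LC_ALL=C sort -t "$tab" -k 1,1 "$dir/File1" > "$dir/File1.sorted"
LC_ALL=C sort -t "$tab" -k 2,2 "$dir/File2" > "$dir/File2.sorted"

# Join field 1 of File1 with field 2 of File2; print key, date, cost.
joined=$(LC_ALL=C join -t "$tab" -1 1 -2 2 -o 0,1.2,2.1 \
    "$dir/File1.sorted" "$dir/File2.sorted")
printf '%s\n' "$joined"

rm -r "$dir"
```

Without -a, join prints only lines whose keys appear in both files, which is the inner join you asked about.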

If you keep giving us inconsistent data and don't show us what your command lines and/or the output you get from them really are, you make it impossible for us to help you.

Saying things like:
Quote:
Came out as a typo .... but this is not working either
Doesn't help us. Show us the exact diagnostic that was produced!

Saying things like:
Quote:
These files are being sent by the source. There are many other columns in these files. I have manipulated them to remove the unrequired columns and the header using AWK and SED.
Doesn't give us any indication as to whether or not we are working on UNIX format text files after you have manipulated the files sent by the source. If, after you have manipulated them, the files are still DOS format text files, there is a good chance that fields are not matching because the DOS <carriage-return><newline> line terminators leave a <carriage-return> character at the end of the last field on each line, or that output sent to your terminal is being obscured by parts of output lines overwriting text already written to your screen.
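A quick way to check for this, sketched under the assumption that the file in question is named File2; the sample line and the File2.unix output name are made up for the demonstration:

```shell
# Hypothetical DOS format sample line ending in <carriage-return><newline>.
dir=$(mktemp -d)
printf '16.11\t01635163349\r\n' > "$dir/File2"

# Count the lines that end with a <carriage-return>.
cr=$(printf '\r')
dos_lines=$(grep -c "$cr\$" "$dir/File2")
printf 'lines ending in CR before: %s\n' "$dos_lines"

# Strip every <carriage-return> to get a UNIX format text file.
tr -d '\r' < "$dir/File2" > "$dir/File2.unix"

# grep -c exits non-zero when the count is 0, hence the "|| :".
unix_lines=$(grep -c "$cr\$" "$dir/File2.unix" || :)
printf 'lines ending in CR after: %s\n' "$unix_lines"

rm -r "$dir"
```

If the first count is not zero for your real files, the key 01635163349 in File2 is really 01635163349<carriage-return>, which will never compare equal to the key in File1.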

Please give us clear answers to the questions we have asked. We are asking for information that will allow us to help you. We are not asking you to do extra work for the fun of it.

Please help us help you!
 
