using join command for files


 
Thread Tools Search this Thread
Special Forums Hardware Filesystems, Disks and Memory using join command for files
# 8  
Old 08-13-2002
Hi Subra,

could you post back the content of the files you like to join.

TIA

Frank
# 9  
Old 08-13-2002
You can acheive this with a combination of awk and join.

i.e.
File_one has 4 columns "first name, lastname, age, height"
File_two has 3 columns "first name, lastname, weight"

You want the output to have "first name, lastname, age, height, weight" - for only those records with a weight present - and your key is first and last name.

Code:
awk '
BEGIN { OFS = "," } { print $1 $2 , $1, $2, $3 ,$4 } 
' File_one > temp_File_one 
#
awk '
BEGIN { OFS = "," } { print $1 $2 , $1, $2, $3 } 
' File_two> temp_File_two
#
join -t ","  -j1 1 -j2 1 -o 1.2 1.3 1.4 1.5 2.4  temp_File_one temp_File_two > File_three

# 10  
Old 08-14-2002
Peter,

The above code is working perfectly...

I have heard of IFS(Internal Field Seperator)... Could you please tell me what OFS is????

And also i am new to this join command....

can u explain this to me....

join -t "," -j1 1 -j2 1 -o 1.2 1.3 1.4 1.5 2.4 temp_File_one temp_File_two

The join command works even like this..

join -t "," -o 1.2 1.3 1.4 1.5 2.4 temp_File_one temp_File_two


A few other questions...

1. Why do i need two temp files at all??
2. Why can't I directly join the file_one and file_two?
3. Why should there be a statement like this ?
BEGIN { OFS = "," } { print $1 $2, $1, $2, $3 ,$4 }
instead of
BEGIN { OFS = "," } { print $1, $2, $3 ,$4 }

Thanks,
Nisha

Last edited by Nisha; 08-14-2002 at 03:19 AM..
# 11  
Old 08-14-2002
Hi Nisha,

one of your other questions I think I´m able to answer.

2. Why can't I directly join the file_one and file_two?

You can Join them directly but then you can only join them by one column.

So what Peter did was to concatenate the "firstname" & "lastname" column for both files

(And this explains the first of your other questions
1. Why do i need two temp files at all?? )

The third questionI ´ll try to answer as well
3. Why should there be a statement like this ?
BEGIN { OFS = "," } { print $1 $2, $1, $2, $3 ,$4 }

this lines output will be

BEGIN { OFS = "," } { print $1, $2, $3 ,$4 }

|firstnamelastname|firstname|lastname|age|hight|

After that he was able to join those both files based on two columns, whats not exactly correct, cause he joined them based on one column in what the keycolumns has benn concatenated.


the Output of the second statement is
|firstname|lastname|age|hight|

Did I get that right?
# 12  
Old 08-14-2002
Directly from the man page

OFS output field separator (default blank)

The reason for the { print $1 $2, $1,$2,$3,$4 } and the reason for having 2 temp files is the same. Join will only allow you to join on one field. In this instance we want to join on two fields.

So we make a composite key by concatenating the two fields into one string and apending that at the start of the each line - we do this for both files.

Then we can join using one field - being the composite key.

i.e. To join where column 1 and 5 are the same in both files

File1 before is
"aaa,bbb,111,222,333".
"aaa,bbb,111,222,xxx".

File2 before is "
"aaa,cat,dog,hen,333"
"123,345,234,123,345"

The files with the {print $1 $5, $0} (you know what $0 is too...)
become

File1 before is
"aaa333,aaa,bbb,111,222,333".
"aaaxxx,aaa,bbb,111,222,xxx".

File2 before is "
"aaa333,aaa,cat,dog,hen,333"
"123345,123,345,234,123,345"

So we have the composite key matching the aaa333 in both files - and we can join on this with and make sure in our output there's not 1.1 or 2.1 (which would be the first fields from each file - as they are the fields we created and don't want to see again)

Just seen your reply while entering mine isacs!
# 13  
Old 08-14-2002
Thanks Peter and Isacs for the replies...
Login or Register to Ask a Question

Previous Thread | Next Thread

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Join, merge, fill NULL the void columns of multiples files like sql "LEFT JOIN" by using awk

Hello, This post is already here but want to do this with another way Merge multiples files with multiples duplicates keys by filling "NULL" the void columns for anothers joinning files file1.csv: 1|abc 1|def 2|ghi 2|jkl 3|mno 3|pqr file2.csv: 1|123|jojo 1|NULL|bibi... (2 Replies)
Discussion started by: yjacknewton
2 Replies

2. UNIX for Dummies Questions & Answers

How to use the the join command to join multiple files by a common column

Hi, I have 20 tab delimited text files that have a common column (column 1). The files are named GSM1.txt through GSM20.txt. Each file has 3 columns (2 other columns in addition to the first common column). I want to write a script to join the files by the first common column so that in the... (5 Replies)
Discussion started by: evelibertine
5 Replies

3. UNIX for Dummies Questions & Answers

How to use the join command to obtain tab delimited text files as an output?

How do you use the join command and obtain tab delimited text files as an output? Thanks! (2 Replies)
Discussion started by: evelibertine
2 Replies

4. UNIX for Dummies Questions & Answers

how to join two files using "Join" command with one common field in this problem?

file1: Toronto:12439755:1076359:July 1, 1867:6 Quebec City:7560592:1542056:July 1, 1867:5 Halifax:938134:55284:July 1, 1867:4 Fredericton:751400:72908:July 1, 1867:3 Winnipeg:1170300:647797:July 15, 1870:7 Victoria:4168123:944735:July 20, 1871:10 Charlottetown:137900:5660:July 1, 1873:2... (2 Replies)
Discussion started by: mindfreak
2 Replies

5. Shell Programming and Scripting

join files cisco command output

We have two files with results of two commands of cisco: one file; showintstatus, have the following format: Gi2/6 servidorlij connected 176 full 1000 10/100/1000BaseTsecond file; showmacaddrestable, have the following format: * 162 0021.9b8d.073e dynamic Yes 0 ... (1 Reply)
Discussion started by: robonet
1 Replies

6. Shell Programming and Scripting

awk command for simple join command but based on 2 columns

input1 a_a a/a 10 100 a1 a_a 20 200 b1 b_b 30 300 input2 a_a a/a xxx yyy a1 a1 lll ppp b1 b_b kkk ooo output a_a a/a 10 100 xxx yyy (2 Replies)
Discussion started by: ruby_sgp
2 Replies

7. UNIX for Dummies Questions & Answers

Join 2 files with multiple columns: awk/grep/join?

Hello, My apologies if this has been posted elsewhere, I have had a look at several threads but I am still confused how to use these functions. I have two files, each with 5 columns: File A: (tab-delimited) PDB CHAIN Start End Fragment 1avq A 171 176 awyfan 1avq A 172 177 wyfany 1c7k A 2 7... (3 Replies)
Discussion started by: InfoSeeker
3 Replies

8. Shell Programming and Scripting

join (pls help on join command)

Hi, I am a new learner of join command. Some result really make me confused. Please kindly help me. input: file1: LEO oracle engineer 210375 P.Jones Office Runner ID897 L.Clip Personl Chief ID982 S.Round UNIX admin ID6 file2: Dept2C ID897 6 years Dept5Z ID982 1 year Dept3S ID6 2... (1 Reply)
Discussion started by: summer_cherry
1 Replies

9. OS X (Apple)

Command line tool to join multiple .wmv files?

I need a simple command line executable that allows me to join many wmv files into one output wmv file, preferrably in a simple way like this: wmvjoin file1.wmv file2.wmv .... > outputfile.wmv So what I want is the wmv-equivalent of mpgtx I cannot find it on internet. Thanks. (2 Replies)
Discussion started by: karman
2 Replies

10. Shell Programming and Scripting

Join Command Help

Hi, I am trying to use join command for two files of size greater than 1 GB. join -t , -1 2 -2 1 -o 1.1,1.2,1.3,1.4,1.5,1.6,1.7,1.8,1.9,1.10,1.11,1.12,1.13,1.14,1.15,1.16,1.17,1.18,1.19,1.20,1.21,1.22,1.23 File1 File2 > File3 we are facing space crunch after using these join command. May i... (0 Replies)
Discussion started by: jerome Sukumar
0 Replies
Login or Register to Ask a Question