Join help


 
# 1
04-30-2010

I'm trying to join 3 files together.


Code:
sort -n ${ASS1_DATA_DIR}Prac1 | sed '/^#/d' > ${ASS1_OUTPUT_DIR}Prac1temp
sort -n ${ASS1_DATA_DIR}Prac2 | sed '/^#/d' > ${ASS1_OUTPUT_DIR}Prac2temp
join -a1 -a2 -1 1 -2 1 -o "0 1.2 2.2" -e " " ${ASS1_OUTPUT_DIR}Prac1temp ${ASS1_OUTPUT_DIR}Prac2temp > ${ASS1_OUTPUT_DIR}part1temp1
#use temp file.
sort -n ${ASS1_DATA_DIR}Prac3 | sed '/^#/d'> ${ASS1_OUTPUT_DIR}part1temp2
join -a1 -a2 -1 1 -2 1 -o "0 1.2 1.3 2.2" -e " " ${ASS1_OUTPUT_DIR}part1temp1 ${ASS1_OUTPUT_DIR}part1temp2 > ${ASS1_OUTPUT_DIR}Marks

However, when I submit it to the system, I get what appears to be a spacing problem.
System output:
Code:
File: Marks
Expected: 26825845 4 8
Got: 26825845 4   8

prac1
Code:
# Results for Prac1. The format is:
# 12345678 10
32145645 4
12345678 7
321435355 3
4325325 3
54359506 5
4325459 7
87686546 8
90948695 4

prac2
Code:
# Results for Prac2. The format is:
# 12345678 3
12345678
32145645 3
87686546 9
4325459 5
90948695 8

prac3
Code:
# Results for Prac3. The format is:
# 12345678 3
12345678 1
87686546 8
32145645 7
54359506 4
4325325 3
90948695 7

part1temp1
Code:
4325325 3  
4325459 7 5
12345678 7  
32145645 4 3
54359506 5  
87686546 8 9
90948695 4 8
321435355 3

part1temp2
Code:
4325325 3
12345678 1
32145645 7
54359506 4
87686546 8
90948695 7

Marks
Code:
4325325 3   3
12345678     1
32145645     7
4325459 7 5  
12345678 7    
32145645 4 3  
54359506 5   4
87686546 8 9 8
90948695 4 8 7
321435355 3

Not only do I have a spacing problem, but some of the IDs, such as 12345678, are not joining properly.

---------- Post updated at 06:54 AM ---------- Previous update was at 06:38 AM ----------

I'm using the exact same code on my MacBook and it works properly there. Both use #!/bin/sh.

---------- Post updated at 06:59 AM ---------- Previous update was at 06:54 AM ----------

Ah, never mind, it wasn't the exact same code. I didn't sort on the other one.

It still isn't right, though.

Code:
12345678 7   1
321435355 3    
32145645 4   7
4325325 3   3
4325459 7 5  
12345678      
32145645 3    
54359506 5   4
87686546 8 9 8
90948695 4   7

Any ideas?

---------- Post updated at 07:19 AM ---------- Previous update was at 06:59 AM ----------

OK, I think I've solved the sorting problem. I got rid of the -n arguments to sort, and everything joined correctly. After that I sorted the result with -n.

Code:
4325325 3  3
4325459 7 5 
12345678 7  1
32145645 4 3 7
54359506 5  4
87686546 8 9 8
90948695 4 8 7
321435355 3

However, the spacing problem is still there.
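
A note on why dropping -n helped: join expects both inputs to be ordered as strings on the join field (the order plain sort produces), not numerically. Under sort -n, 4325325 comes before 12345678, but as strings "12345678" comes first, so join loses track of matching IDs. A minimal sketch, using a few IDs from this thread and hypothetical temp file names (a, b), with LC_ALL=C assumed so that sort and join agree on the ordering:

```shell
#!/bin/sh
# Sketch only: file names a/b are made up for illustration.
tmp=$(mktemp -d)
printf '%s\n' '32145645 4' '12345678 7' '4325325 3' > "$tmp/a"
printf '%s\n' '12345678 1' '32145645 7' '4325325 3' > "$tmp/b"

# Sort as strings on the first field; this is the order join expects.
LC_ALL=C sort -k1,1 "$tmp/a" > "$tmp/a.sorted"
LC_ALL=C sort -k1,1 "$tmp/b" > "$tmp/b.sorted"

# Join first, then apply the numeric sort only to the final result.
joined=$(LC_ALL=C join "$tmp/a.sorted" "$tmp/b.sorted" | sort -n)
printf '%s\n' "$joined"

rm -rf "$tmp"
```

The numeric sort is purely cosmetic here and is safe because nothing is joined after it.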

---------- Post updated at 07:26 AM ---------- Previous update was at 07:19 AM ----------

Code:
Expected: 26825845 4 8
Got: 26825845  4 8

I almost have it. Between the ID and the results I have an extra space. How the hell do I get rid of it?
Here's my updated code:
Code:
sort ${ASS1_DATA_DIR}Prac1 | sed '/^#/d' > ${ASS1_OUTPUT_DIR}Prac1temp
sort ${ASS1_DATA_DIR}Prac2 | sed '/^#/d' > ${ASS1_OUTPUT_DIR}Prac2temp
join -a1 -a2 -1 1 -2 1 -o 0 1.2 2.2 ${ASS1_OUTPUT_DIR}Prac1temp ${ASS1_OUTPUT_DIR}Prac2temp > ${ASS1_OUTPUT_DIR}part1temp1
#use temp file.
sort ${ASS1_DATA_DIR}Prac3 | sed '/^#/d'> ${ASS1_OUTPUT_DIR}part1temp2
join -a1 -a2 -1 1 -o 0 1.2 1.3 2.2 -t " " ${ASS1_OUTPUT_DIR}part1temp1 ${ASS1_OUTPUT_DIR}part1temp2 > ${ASS1_OUTPUT_DIR}part1temp3
sort -n ${ASS1_OUTPUT_DIR}part1temp3 > ${ASS1_OUTPUT_DIR}Marks


Last edited by bigubosu; 04-30-2010 at 09:06 AM..
# 2
04-30-2010
Is that what you want?
Code:
# awk '! /^#/ && NF -1 {a[$1]=(a[$1])?a[$1] FS $NF:$0}END{for(i in a){print a[i]}}' part1temp[123]| sort
12345678 7 1
321435355 3
32145645 4 3 7
4325325 3 3
4325459 7 5
54359506 5 4
87686546 8 9 8
90948695 4 8 7

Next time please post the required output (what you expect).
# 3
04-30-2010
Quote:
Originally Posted by danmero
Is that what you want?
Code:
# awk '! /^#/ && NF -1 {a[$1]=(a[$1])?a[$1] FS $NF:$0}END{for(i in a){print a[i]}}' OFS="\t" part1temp[123]| sort
12345678 7 1
321435355 3
32145645 4 3 7
4325325 3 3
4325459 7 5
54359506 5 4
87686546 8 9 8
90948695 4 8 7

Next time please post the required output (what you expect).
Yeah, that's right, but I have to use join -_-. Sorry.

Sorry, I'm only posting the required output so you know what I'm trying to get rid of. It's not what I'm expecting, because the test data I have works. It's only when I submit my program to the system that it spits back something that isn't right, and what isn't right is this extra space dangling around.
# 4
04-30-2010
Quote:
Originally Posted by bigubosu
Yeah, that's right, but I have to use join -_-. Sorry.
Hmm, why?
Let's see the difference between my output and your output (sorted):
Code:
# diff -y my_output your_output
12345678 7 1                                                  | 12345678 7  1
321435355 3                                                     321435355 3
32145645 4 3 7                                                  32145645 4 3 7
4325325 3 3                                                   | 4325325 3  3
4325459 7 5                                                   | 4325459 7 5
54359506 5 4                                                  | 54359506 5  4
87686546 8 9 8                                                  87686546 8 9 8
90948695 4 8 7                                                  90948695 4 8 7
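
The lines diff flags above differ only in whitespace, which is hard to see in a terminal. One way to make it visible is the l command of sed, which prints each line unambiguously and marks the end of the line with $. A small sketch using two lines of the output from this thread (the variable name visible is just for illustration):

```shell
#!/bin/sh
# Show doubled and trailing spaces explicitly; '$' marks the end of each line.
visible=$(printf '%s\n' '4325325 3   3' '4325459 7 5  ' | sed -n l)
printf '%s\n' "$visible"
```

Anything between the last mark and the $ is trailing whitespace that a line-by-line checker would likely reject.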


Last edited by danmero; 04-30-2010 at 10:11 AM.. Reason: Fix typo
# 5
04-30-2010
It's what is required for my lab work. I'd gladly use awk if I could.

I fixed the spacing, but the system I'm submitting to doesn't like it.

Code:
sort ${ASS1_DATA_DIR}Prac1 | sed '/^#/d' > ${ASS1_OUTPUT_DIR}Prac1temp
sort ${ASS1_DATA_DIR}Prac2 | sed '/^#/d' > ${ASS1_OUTPUT_DIR}Prac2temp
join -a1 -a2 -2 1 -o 0 1.2 2.2 -e " " ${ASS1_OUTPUT_DIR}Prac1temp ${ASS1_OUTPUT_DIR}Prac2temp > ${ASS1_OUTPUT_DIR}part1temp1
#use temp file.
sort ${ASS1_DATA_DIR}Prac3 | sed '/^#/d'> ${ASS1_OUTPUT_DIR}part1temp2
join -a1 -a2 -1 1 -o 0 1.2 1.3 2.2 -e " " ${ASS1_OUTPUT_DIR}part1temp1 ${ASS1_OUTPUT_DIR}part1temp2 > ${ASS1_OUTPUT_DIR}part1temp3
sort -n ${ASS1_OUTPUT_DIR}part1temp3 > ${ASS1_OUTPUT_DIR}Marks

Code:
4325325 3   3
4325459 7 5  
12345678 7   1
32145645 4 3 7
54359506 5   4
87686546 8 9 8
90948695 4 8 7
321435355 3
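
The three-space gaps in this output come from -e " ": every missing field is printed as a single space, and join still puts its own separator on either side of it. Without -e the missing field is empty and the gap shrinks to two spaces, as in the earlier attempt. If the marking system expects exactly one space between values and nothing after the last one (an assumption, since the thread only shows one expected line), the output can be post-processed while still using join for the join itself:

```shell
#!/bin/sh
# Sketch: squeeze runs of spaces to one, then drop trailing spaces.
# The sample lines are copied from the Marks output above.
marks_clean=$(printf '%s\n' '4325325 3   3' '4325459 7 5  ' '12345678 7   1' |
    tr -s ' ' | sed 's/ *$//')
printf '%s\n' "$marks_clean"
```

Note that this discards the information about which practical a mark is missing from, so it only fits if the expected format really omits missing marks.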

# 6
04-30-2010
Quote:
Originally Posted by bigubosu
It's what is required for my lab work. I'd gladly use awk if I could.

Code:
sort ${ASS1_DATA_DIR}Prac1 | sed '/^#/d' > ${ASS1_OUTPUT_DIR}Prac1temp
sort ${ASS1_DATA_DIR}Prac2 | sed '/^#/d' > ${ASS1_OUTPUT_DIR}Prac2temp
join -a1 -a2 -2 1 -o 0 1.2 2.2 -e " " ${ASS1_OUTPUT_DIR}Prac1temp ${ASS1_OUTPUT_DIR}Prac2temp > ${ASS1_OUTPUT_DIR}part1temp1
#use temp file.
sort ${ASS1_DATA_DIR}Prac3 | sed '/^#/d'> ${ASS1_OUTPUT_DIR}part1temp2
join -a1 -a2 -1 1 -o 0 1.2 1.3 2.2 -e " " ${ASS1_OUTPUT_DIR}part1temp1 ${ASS1_OUTPUT_DIR}part1temp2 > ${ASS1_OUTPUT_DIR}part1temp3
sort -n ${ASS1_OUTPUT_DIR}part1temp3 > ${ASS1_OUTPUT_DIR}Marks

If that's the way you want or need to do it, sorry, but I can't help you win the Useless Use of Cat Award.

Maybe you should take a look at Rules for Homework & Coursework Questions Forum - The UNIX and Linux Forums
# 7
04-30-2010
I've attempted it, though. I'm so close to getting this, but I can't for the life of me see what is wrong.
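
One likely source of the remaining trouble is that a blank filler does not survive the intermediate file: when the second join re-reads part1temp1 with the default separator, runs of blanks count as one, so a missing Prac2 mark silently disappears and later columns can shift. A possible join-only approach is to fill gaps with a visible placeholder and remove it at the very end. The sketch below is not the poster's final solution; the placeholder NA, the cut-down sample data, and the decision to strip the placeholder are all assumptions:

```shell
#!/bin/sh
# Sketch under assumptions: placeholder "NA", sample data trimmed from the thread.
tmp=$(mktemp -d)
printf '%s\n' '# Results for Prac1' '32145645 4' '12345678 7' '4325459 7' > "$tmp/Prac1"
printf '%s\n' '# Results for Prac2' '32145645 3' '4325459 5' > "$tmp/Prac2"
printf '%s\n' '# Results for Prac3' '12345678 1' '32145645 7' > "$tmp/Prac3"

# Drop comment lines, then sort as strings on the ID field for join.
for f in Prac1 Prac2 Prac3; do
    sed '/^#/d' "$tmp/$f" | LC_ALL=C sort -k1,1 > "$tmp/$f.sorted"
done

# Full outer join of all three files; NA keeps every column in place
# between the two joins. The final sed removes the placeholders, which
# assumes the expected output simply omits missing marks.
marks=$(LC_ALL=C join -a1 -a2 -e NA -o 0,1.2,2.2 "$tmp/Prac1.sorted" "$tmp/Prac2.sorted" |
    LC_ALL=C join -a1 -a2 -e NA -o 0,1.2,1.3,2.2 - "$tmp/Prac3.sorted" |
    sed 's/ NA//g' | sort -n)
printf '%s\n' "$marks"

rm -rf "$tmp"
```

If the marking system wants something else in place of a missing mark, only the final sed needs to change.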