Alternative to join command

 
# 1  
Old 01-16-2018

Ubuntu, Bash 4.3.48

Hi,

I have some files and I want to join them line by line wherever the lines start with the same field (an ID).

INPUT FILE 1 (tab delimited)
Code:
aa_12_12_v_c aaa,asf,afgas,eg
bb_12_43_a_d dad,ada,adaf,afa
cc_56_75_d_f asd,thh,ert,rtertet

INPUT FILE 2 (tab delimited)
Code:
aa_12_12_v_c 1:1:1:1:1
cc_56_75_d_f 2:2:2:2:2

INPUT FILE 3 (tab delimited)
Code:
bb_12_43_a_d 3:3:3:3:3



Using join

Code:
join -t "`echo -e "\t"`" -a1  FILE1 FILE2 > OUTPUT1

OUTPUT1 (tab delimited)
Code:
aa_12_12_v_c aaa,asf,afgas,eg 1:1:1:1:1
bb_12_43_a_d dad,ada,adaf,afa
cc_56_75_d_f asd,thh,ert,rtertet 2:2:2:2:2

Since -e ND doesn't work in my case, I have to do this:

Code:
awk 'FNR==NR{if(m<NF)m=NF;next}{for(i=NF;i<m;i++)$(i+1)="ND"}1' OUTPUT1 OUTPUT1 > XFILE; sed 's/ /\t/g' XFILE > OUTPUT2

OUTPUT2 (tab delimited)
Code:
aa_12_12_v_c aaa,asf,afgas,eg 1:1:1:1:1
bb_12_43_a_d dad,ada,adaf,afa ND
cc_56_75_d_f asd,thh,ert,rtertet 2:2:2:2:2
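
A likely reason -e ND seems to do nothing here: join applies -e only to fields selected with -o. The sketch below assumes bash and a GNU join recent enough to support -o auto (a GNU extension); the scratch directory and the variable names tmp and tab are illustrative only.

```shell
#!/usr/bin/env bash
# Sketch: recreate FILE1 and FILE2 from the post in a scratch directory.
tmp=$(mktemp -d)
tab=$(printf '\t')
printf '%s\t%s\n' aa_12_12_v_c aaa,asf,afgas,eg \
                  bb_12_43_a_d dad,ada,adaf,afa \
                  cc_56_75_d_f asd,thh,ert,rtertet > "$tmp/FILE1"
printf '%s\t%s\n' aa_12_12_v_c 1:1:1:1:1 \
                  cc_56_75_d_f 2:2:2:2:2 > "$tmp/FILE2"

# -e fills only the fields that -o selects. "-o auto" takes the number of
# fields from the first line of each file, so no explicit list is needed.
LC_ALL=C join -t "$tab" -a1 -e ND -o auto "$tmp/FILE1" "$tmp/FILE2" > "$tmp/OUTPUT2"
cat "$tmp/OUTPUT2"
```

With the sample data this should produce OUTPUT2 directly, without the awk and sed step.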

Then for the 3rd file...

Code:
join -t "`echo -e "\t"`" -a1  OUTPUT2 FILE3 > OUTPUT3

OUTPUT3 (tab delimited)
Code:
aa_12_12_v_c aaa,asf,afgas,eg 1:1:1:1:1 
bb_12_43_a_d dad,ada,adaf,afa ND 3:3:3:3:3
cc_56_75_d_f asd,thh,ert,rtertet 2:2:2:2:2

Since -e ND doesn't work in my case, I have to do this:

Code:
awk  'FNR==NR{if(m<NF)m=NF;next}{for(i=NF;i<m;i++)$(i+1)="ND"}1'  OUTPUT3 OUTPUT3 > XFILE; sed 's/ /\t/g' XFILE > OUTPUT4

OUTPUT4 (tab delimited)
Code:
aa_12_12_v_c aaa,asf,afgas,eg 1:1:1:1:1 ND
bb_12_43_a_d dad,ada,adaf,afa ND 3:3:3:3:3
cc_56_75_d_f asd,thh,ert,rtertet 2:2:2:2:2 ND

--- --- ---

The point is that my code seems a little complicated. Also, often but not always, I have problems with sorting: sometimes join reports errors about sorting when I run it. I read that if I'm sure my files are sorted I can bypass join's sort check, but I would like new code that runs without warnings.

Do you know any other command? Any help is welcome: commands, code, scripts.
Since I have N files, I want to create a loop...

Many thanks!
echo manolis
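
A loop over N files could look roughly like the sketch below. It assumes bash (for process substitution) and a GNU join that supports -o auto; join_all is a hypothetical helper name, and the scratch directory exists only for the demo. Sorting and joining both run under LC_ALL=C so that the two commands agree on the ordering.

```shell
#!/usr/bin/env bash
# Sketch only: join_all is a hypothetical helper, not a standard command.
# It left-joins every further file onto the first one, padding gaps with ND.
join_all() {
    local tab acc next f
    tab=$(printf '\t')
    acc=$(mktemp)
    next=$(mktemp)
    # Sort and join in the same (C) locale so join's order check is satisfied.
    LC_ALL=C sort -t "$tab" -k1,1 "$1" > "$acc"
    shift
    for f in "$@"; do
        LC_ALL=C join -t "$tab" -a1 -e ND -o auto \
            "$acc" <(LC_ALL=C sort -t "$tab" -k1,1 "$f") > "$next"
        mv "$next" "$acc"
    done
    cat "$acc"
    rm -f "$acc" "$next"
}

# Demo with the sample data from this post, in a scratch directory.
demo=$(mktemp -d)
printf '%s\t%s\n' aa_12_12_v_c aaa,asf,afgas,eg bb_12_43_a_d dad,ada,adaf,afa \
    cc_56_75_d_f asd,thh,ert,rtertet > "$demo/FILE1"
printf '%s\t%s\n' aa_12_12_v_c 1:1:1:1:1 cc_56_75_d_f 2:2:2:2:2 > "$demo/FILE2"
printf '%s\t%s\n' bb_12_43_a_d 3:3:3:3:3 > "$demo/FILE3"
join_all "$demo/FILE1" "$demo/FILE2" "$demo/FILE3" > "$demo/OUTPUT4"
cat "$demo/OUTPUT4"
```

For the three sample files this should reproduce OUTPUT4. Because -o auto reads the field count from the first line of each file, the sketch assumes every line of a given file has the same number of fields and that no input file is empty.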

Moderator's Comments:
Please use CODE tags (for data as well!) as required by forum rules!

Last edited by RudiC; 01-16-2018 at 06:27 PM.. Reason: Added CODE tags.
# 2  
Old 01-16-2018
How about
Code:
join -t$'\t' -a1 -o"1.1 1.2 1.3 2.2" -eND <(join -t$'\t' -a1 -o"1.1 1.2 2.2" -eND file[12]) file3
aa_12_12_v_c    aaa,asf,afgas,eg	1:1:1:1:1	ND
bb_12_43_a_d    dad,ada,adaf,afa	ND	3:3:3:3:3
cc_56_75_d_f    asd,thh,ert,rtertet	2:2:2:2:2	ND

# 3  
Old 01-17-2018
Thank you RudiC!

But I get the same sorting error with my original files. The files are sorted!!!

echo manolis

---------- Post updated at 09:21 AM ---------- Previous update was at 08:56 AM ----------

I have seen several awk solutions... I need to put my data into an array keyed by ID and match the lines that have the same ID, and also include in the output the lines that don't match.

Any help?

Last edited by echo manolis; 01-17-2018 at 05:11 AM..
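
The array idea described above can be sketched in awk without any sorting. This is a minimal sketch, not a tested replacement for join: awk_join is a hypothetical helper name, and it assumes each file has exactly two tab-separated columns, that an ID appears at most once per file, and that no input file is empty. Unlike join -a1, it also keeps IDs that appear only in later files.

```shell
#!/usr/bin/env bash
# Sketch only: awk_join is a hypothetical helper, not a standard command.
awk_join() {
    awk -F '\t' -v OFS='\t' '
        FNR == 1 { nfiles++ }                  # first line of a new input file
        !($1 in seen) {                        # remember IDs in first-seen order
            seen[$1] = 1
            order[++n] = $1
        }
        { val[$1, nfiles] = $2 }               # value stored under (ID, file number)
        END {
            for (i = 1; i <= n; i++) {
                id = order[i]
                line = id
                for (f = 1; f <= nfiles; f++)  # one column per file, ND when missing
                    line = line OFS (((id, f) in val) ? val[id, f] : "ND")
                print line
            }
        }
    ' "$@"
}

# Demo with the sample data from the first post, in a scratch directory.
work=$(mktemp -d)
printf '%s\t%s\n' aa_12_12_v_c aaa,asf,afgas,eg bb_12_43_a_d dad,ada,adaf,afa \
    cc_56_75_d_f asd,thh,ert,rtertet > "$work/FILE1"
printf '%s\t%s\n' aa_12_12_v_c 1:1:1:1:1 cc_56_75_d_f 2:2:2:2:2 > "$work/FILE2"
printf '%s\t%s\n' bb_12_43_a_d 3:3:3:3:3 > "$work/FILE3"
awk_join "$work/FILE1" "$work/FILE2" "$work/FILE3" > "$work/OUTPUT4"
cat "$work/OUTPUT4"
```

Since the lines are matched through array lookups rather than by position, the input order does not matter and no sort warning can occur; the output follows the order in which each ID is first seen.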
# 4  
Old 01-20-2018
Please, could someone let me know?

Best
echo manolis
# 5  
Old 01-20-2018
When you tell us you have a sorting problem with your input files and tell us that your input files are sorted, you leave us wondering:
  1. What is your sorting problem?
  2. Why do you think there is a sorting problem?
  3. What output did you get from your attempts to use join that led you to believe that you had a sorting problem?
As with all threads in this forum, you know that knowing what operating system you're using and what shell you're using helps us help you. And, you have not told us either of these key bits of information.
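
One common cause of join complaining about files that look sorted is a locale mismatch: join expects the join field to be ordered according to the collation of the locale it runs in, which may differ from the order used when the file was sorted. The sketch below shows one way to check this; it assumes bash, and the scratch directory and file names are placeholders.

```shell
#!/usr/bin/env bash
# Sketch: check whether a file is sorted on its first tab-delimited field
# in the C locale, which is the order join expects when run with LC_ALL=C.
tab=$(printf '\t')
chk=$(mktemp -d)
# "B_1" sorts before "a_1" in the C locale (byte order); many other
# locales order these two the other way round.
printf '%s\t%s\n' a_1 x B_1 y > "$chk/input"

if LC_ALL=C sort -c -t "$tab" -k1,1 "$chk/input" 2>/dev/null; then
    echo "sorted for LC_ALL=C"
    cp "$chk/input" "$chk/sorted_for_C"
else
    echo "not sorted for LC_ALL=C, re-sorting"
    LC_ALL=C sort -t "$tab" -k1,1 "$chk/input" > "$chk/sorted_for_C"
fi
```

Running both sort and join with the same LC_ALL setting keeps the two commands in agreement about the ordering.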

If echo -e doesn't work on your system (or in your shell), why not just use a literal tab character when specifying the field delimiter in your join commands? If you're afraid that someone reading your code won't be able to tell the difference between a <space> and a <tab>, why not include a comment in your code explaining that the delimiter is a <tab> character entered literally? If writing comments is unacceptable for you for some reason, why not use a command substitution that is portable:
Code:
join -t "$(printf '\t')" -a1  OUTPUT2 FILE3 > OUTPUT3

or, if you're using a pure Bourne shell:
Code:
join -t "`printf '\t'`" -a1  OUTPUT2 FILE3 > OUTPUT3

instead of using echo -e which is clearly not portable?

Why wait until you update post #3 in this thread to tell us what your real requirements are? Why not spend the time when you first started your thread to explain what you were trying to do? Why when you changed your requirements in post #3 didn't you include sample input and output that would help us understand what you're trying to do?

When you don't tell us what OS and shell you're using, don't clearly explain what you're trying to do, and don't show us sample input and corresponding output for the problem you're trying to solve, you make it hard for anyone to get interested in trying to help you.

When you give us details about your environment, give us a clear specification of what you're trying to do, show us sample inputs and outputs that correspond to that specification, and show us code that you have attempted to use to solve your problem on your own, you will be much more likely to get help.