Awk - join multiple files


 
# 1  
Old 07-01-2012

Is it possible to join all the files with input1, based on the 1st column?

input1
Code:
a
b
c
d
e
f

input2
Code:
a
b

input3
Code:
a
e

input4
Code:
c
d
e
f

inputx
Code:
a

Code:
awk -f join.awk input1 input2 input3 input4 inputx .....

output
Code:
a  4  input1  input2  input3  0  inputx
b  2  input1  input2  0  0  0
c  2  input1  0  0  input4  0
d  2  input1  0  0  input4  0
e  3  input1  0  input3  input4  0
f  2  input1  0  0  input4  0

note: the 2nd column shows how many of the x input files the symbol appeared in
# 2  
Old 07-02-2012
Code:
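$ cat join.awk
{
 totalcnt[$1]++                      # total occurrences of the key across all files
 if(FILENAME != prevfilename)        # a new input file has started
  filesc++
 filecnt[filesc,$1]++                # occurrences of the key in the current file
 fname[filesc]=prevfilename=FILENAME
}

END {
 for(i in totalcnt)                  # note: iteration order is unspecified
 {
  printf("%s\t%s",i,totalcnt[i])
  for(j=1;j<=filesc;j++)
  {
   printf("\t%s[%d]",fname[j],filecnt[j,i])   # file name with per-file count
  }
  printf("%s","\n");
 }
}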
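
For reference, a sketch of how this script behaves on the sample inputs from post #1. The scratch directory and the `sort` step are additions for this illustration only: `for (i in totalcnt)` visits keys in an unspecified order, so the output is sorted here to make it stable.

```shell
#!/bin/sh
# Recreate the sample inputs from post #1 in a scratch directory
# (the directory is an assumption made for this demo).
dir=$(mktemp -d) && cd "$dir" || exit 1
printf '%s\n' a b c d e f > input1
printf '%s\n' a b         > input2
printf '%s\n' a e         > input3
printf '%s\n' c d e f     > input4
printf '%s\n' a           > inputx

# The script from this post, unchanged in behavior.
cat > join.awk <<'EOF'
{
 totalcnt[$1]++
 if(FILENAME != prevfilename)
  filesc++
 filecnt[filesc,$1]++
 fname[filesc]=prevfilename=FILENAME
}
END {
 for(i in totalcnt)
 {
  printf("%s\t%s",i,totalcnt[i])
  for(j=1;j<=filesc;j++)
   printf("\t%s[%d]",fname[j],filecnt[j,i])
  printf("%s","\n")
 }
}
EOF

# Sort only to get a stable line order for display.
out=$(awk -f join.awk input1 input2 input3 input4 inputx | sort)
printf '%s\n' "$out"
```

Each key is followed by its total and one tab-separated `name[count]` field per input file, so a file that does not contain the key shows up as, for example, `input4[0]` rather than the plain `0` the original question asked for.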


Last edited by elixir_sinari; 07-02-2012 at 01:32 AM..
# 3  
Old 07-02-2012
Slight mod to elixir_sinari's post to match the OP's request for filenames rather than counts:

Code:
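$ cat join.awk
{
 totalcnt[$1]++
 filesc+=(FNR==1)                 # FNR resets to 1 at the start of each file
 filecnt[filesc,$1]=FILENAME      # remember which file the key was seen in
}
END {
 OFS="\t"
 for(i in totalcnt) {
  $0=""; $1=i; $2=totalcnt[i]     # rebuild the record field by field
  for(j=1;j<=filesc;j++)
     $(j+2)=filecnt[j,i]?filecnt[j,i]:"0"   # file name, or 0 if the key is absent
  print
 }
}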
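
Two things to be aware of with the script above: `for (i in totalcnt)` does not guarantee that keys come out in input order, and a key repeated within one file is counted more than once. A possible variant that keeps first-seen order and counts each key once per file is sketched below. The file name `join_ordered.awk` and the array names `order`, `seen`, `total` and `fname` are made up for this illustration and are not from the posts above.

```shell
#!/bin/sh
# Recreate the sample inputs from post #1 in a scratch directory
# (the directory is an assumption made for this demo).
dir=$(mktemp -d) && cd "$dir" || exit 1
printf '%s\n' a b c d e f > input1
printf '%s\n' a b         > input2
printf '%s\n' a e         > input3
printf '%s\n' c d e f     > input4
printf '%s\n' a           > inputx

cat > join_ordered.awk <<'EOF'
# A new input file starts: record its name under the next file index.
FNR == 1 { fname[++nfiles] = FILENAME }

# Skip blank lines; for every other line, track the key in column 1.
NF {
    if (!($1 in total)) order[++nkeys] = $1   # remember first-seen order
    if (!((nfiles, $1) in seen)) {            # count each key once per file
        seen[nfiles, $1] = 1
        total[$1]++
    }
}

END {
    for (k = 1; k <= nkeys; k++) {
        key = order[k]
        line = key "\t" total[key]
        # One column per input file: its name if the key occurs there, else 0.
        for (j = 1; j <= nfiles; j++)
            line = line "\t" (((j, key) in seen) ? fname[j] : "0")
        print line
    }
}
EOF

out=$(awk -f join_ordered.awk input1 input2 input3 input4 inputx)
printf '%s\n' "$out"
```

Because input1 is read first and lists every key, the rows come out in input1's order. Like both scripts above, this sketch gives an empty input file no column at all, since `FNR == 1` never fires for it.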
