How to use the join command to join multiple files by a common column


 
# 1  
Hi,

I have 20 tab-delimited text files that share a common column (column 1). The files are named GSM1.txt through GSM20.txt. Each file has 3 columns: the common first column plus 2 others.

I want to write a script that joins the files on that first column. In the output file, the first column should be the common column present in all 20 files, followed by the last two columns of each file in turn (i.e. columns 2 and 3 are columns 2 and 3 of GSM1.txt, columns 4 and 5 are columns 2 and 3 of GSM2.txt, and so on).

How do I go about doing that? Thanks!
# 2  
join file1 file2 file3?
This User Gave Thanks to Corona688 For This Post:
# 3  
Actually I get the error message

join: extra operand `3.txt'


When I try
Code:
join 1.txt 2.txt 3.txt > output.txt

# 4  
Check the join man page; join can take only 2 files at a time. You'll need to run it in a loop or through xargs.
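
Because join reads exactly two inputs, a short chain of calls can be written as a pipeline, with `-` standing for standard input in the second call. The following is only a minimal sketch: it assumes the inputs are tab-delimited and already sorted on column 1, and the temporary directory and sample data are made up for illustration.

```shell
#!/bin/sh
# Hypothetical sample data: three tab-delimited files keyed on column 1.
dir=$(mktemp -d)
printf 'a\t1\t2\nb\t3\t4\n'     > "$dir/1.txt"
printf 'a\t5\t6\nb\t7\t8\n'     > "$dir/2.txt"
printf 'a\t9\t10\nb\t11\t12\n'  > "$dir/3.txt"

tab=$(printf '\t')   # literal tab character for the -t option

# join takes two inputs; "-" makes the second call read the output of the first.
join -t "$tab" "$dir/1.txt" "$dir/2.txt" |
    join -t "$tab" - "$dir/3.txt" > "$dir/output.txt"

cat "$dir/output.txt"
```

This avoids temporary files for a handful of inputs, but the pipeline gets unwieldy for 20 files, where a loop is easier to read.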
# 5  
Thanks, I didn't realize that.

Okay then:

Code:
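awk -F"\t" -v OFS="\t" '
# Bump the file counter each time awk moves on to a new input file.
F != FILENAME { FNUM++; F = FILENAME }

# Record the key, then blank out $1 so $0 becomes the remaining columns
# (with a leading tab, because assigning to a field rebuilds $0 using OFS).
{ COL[$1]++; C = $1; $1 = ""; A[C, FNUM] = $0 }

END {
        # Note: for (X in COL) visits keys in an unspecified order.
        for (X in COL)
        {
                printf("%s", X);
                for (N = 1; N <= FNUM; N++) printf("%s", A[X, N]);
                printf("\n");
        }
}' file1 file2 file3 file4 ...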

This User Gave Thanks to Corona688 For This Post:
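
If a key is missing from one of the files, the awk approach above prints nothing for that file, so later columns shift left; the row order is also unspecified. A possible variant, offered only as a sketch, keeps keys in first-seen order and pads missing entries. It assumes every file has exactly two data columns, and the `NA` placeholder, the variable names, and the sample data are my own choices rather than anything from the thread.

```shell
#!/bin/sh
# Hypothetical sample data: g1 appears only in the first file, g3 only in the second.
dir=$(mktemp -d)
printf 'g1\t1\t2\ng2\t3\t4\n' > "$dir/GSM1.txt"
printf 'g2\t5\t6\ng3\t7\t8\n' > "$dir/GSM2.txt"

awk -F'\t' -v OFS='\t' '
FNR == 1 { fnum++ }                          # first line of each non-empty input file
{
    if (!($1 in seen)) {                     # remember keys in first-seen order
        seen[$1] = 1
        order[++n] = $1
    }
    val[$1, fnum] = $2 OFS $3                # the two data columns of this file
}
END {
    for (i = 1; i <= n; i++) {
        k = order[i]
        line = k
        for (f = 1; f <= fnum; f++)          # pad with NA when the key was absent
            line = line OFS (((k, f) in val) ? val[k, f] : "NA" OFS "NA")
        print line
    }
}' "$dir/GSM1.txt" "$dir/GSM2.txt" > "$dir/merged.txt"

cat "$dir/merged.txt"
```

Unlike join, this does not need sorted input, but it holds every row in memory until the END block runs.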
# 6  
Extend this to the number of files you have:
Code:
# join expects both inputs to be sorted on the join field
join GSM1.txt GSM2.txt > tmp.tmp
for f in GSM3.txt GSM4.txt GSM5.txt
do
    join tmp.tmp "$f" > tmpf
    mv tmpf tmp.tmp
done
mv tmp.tmp GSM_ALL.txt
cat GSM_ALL.txt

This User Gave Thanks to jim mcnamara For This Post:
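
The same loop can be driven by a counter so the file list does not have to be typed out. This is a sketch under a few assumptions: the files follow the GSM1.txt ... GSMn.txt naming from the question, they are tab-delimited, and they may not be sorted yet, so each one is sorted before it is joined. The sample data, the temporary file names, and `n=3` are illustrative only.

```shell
#!/bin/sh
# Hypothetical sample data in a scratch directory; GSM1.txt is deliberately unsorted.
dir=$(mktemp -d)
cd "$dir" || exit 1
printf 'g2\t3\t4\ng1\t1\t2\n'     > GSM1.txt
printf 'g1\t5\t6\ng2\t7\t8\n'     > GSM2.txt
printf 'g1\t9\t10\ng2\t11\t12\n'  > GSM3.txt

tab=$(printf '\t')          # literal tab for sort -t and join -t
LC_ALL=C; export LC_ALL     # make sort and join agree on collation order

n=3                         # would be 20 for the files in the question
sort -t "$tab" -k1,1 GSM1.txt > merged.tmp
i=2
while [ "$i" -le "$n" ]; do
    sort -t "$tab" -k1,1 "GSM$i.txt" > sorted.tmp
    join -t "$tab" merged.tmp sorted.tmp > next.tmp   # append this file's columns
    mv next.tmp merged.tmp
    i=$((i + 1))
done
rm -f sorted.tmp
mv merged.tmp GSM_ALL.txt
cat GSM_ALL.txt
```

Counting from 2 to n keeps the columns in numeric file order, which a `GSM*.txt` glob would not (GSM10.txt sorts before GSM2.txt). By default join keeps only keys present in both inputs, so a key missing from any one file drops out of the final result.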
 
