| UNIX for Dummies Questions & Answers If you're not sure where to post a UNIX or Linux question, post it here. All UNIX and Linux newbies welcome !! |
#1
How to use the join command to join multiple files by a common column
Hi,
I have 20 tab-delimited text files that share a common column (column 1). The files are named GSM1.txt through GSM20.txt. Each file has 3 columns: the common first column plus 2 others. I want to write a script that joins the files on that first column, so that in the output file the first column is the common column present in all 20 files, followed by the last two columns of each file in turn (i.e. columns 2 and 3 are columns 2 and 3 of GSM1.txt, columns 4 and 5 are columns 2 and 3 of GSM2.txt, and so on). How do I go about doing that? Thanks!
#2
join file1 file2 file3?
| The Following User Says Thank You to Corona688 For This Useful Post: | ||
evelibertine (07-05-2012) | ||
#3
Actually, I get the error message join: extra operand `3.txt' when I try:
Code:
join 1.txt 2.txt 3.txt > output.txt
#4
Check the join man page; join can take only 2 files at a time. You'll need to run it in a loop or through xargs.
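
Because join takes exactly two inputs, one way to chain it is to pipe one join into the next, using `-` so the second join reads the first one's output from standard input. A minimal sketch, assuming three small made-up sample files (`a.txt`, `b.txt`, `c.txt` are hypothetical names) that are tab-delimited and already sorted on column 1:

```shell
# Hypothetical sample data: tab-delimited, sorted on the first column.
tmpdir=$(mktemp -d)
printf 'g1\t1\t2\ng2\t3\t4\n'     > "$tmpdir/a.txt"
printf 'g1\t5\t6\ng2\t7\t8\n'     > "$tmpdir/b.txt"
printf 'g1\t9\t10\ng2\t11\t12\n'  > "$tmpdir/c.txt"

# A literal tab character to use as the field separator.
TAB=$(printf '\t')

# join takes two inputs; "-" makes the second join read the first join's output.
result=$(join -t "$TAB" "$tmpdir/a.txt" "$tmpdir/b.txt" | join -t "$TAB" - "$tmpdir/c.txt")
printf '%s\n' "$result"
```

This stays readable for three or four files; for 20 files a loop is probably easier to maintain than a 19-stage pipeline.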
#5
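Thanks, I didn't realize that. Okay then:
Code:
# Merge columns 2+ of every file, keyed on column 1 (tab-delimited input).
awk -F"\t" -v OFS="\t" 'F!=FILENAME { FNUM++; F=FILENAME }  # count input files
# Blanking $1 rebuilds $0 as <tab>col2<tab>col3, stored per key and file.
{ COL[$1]++; C=$1; $1=""; A[C, FNUM]=$0 }
END {
    for(X in COL)    # key order is unspecified
    {
        printf("%s", X);
        for(N=1; N<=FNUM; N++) printf("%s", A[X, N]);
        printf("\n");
    }
}' file1 file2 file3 file4 ...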
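
Two things to watch with an awk merge like this: `for (X in COL)` visits keys in no particular order, and a key that is missing from one file contributes nothing for that file, so later columns shift left. The following is a sketch of a variant (not the script from the post above) that keeps keys in first-seen order and pads missing entries. The sample file names and the `NA` placeholder are assumptions for illustration, and it assumes exactly 3 columns per file and no empty input files:

```shell
# Hypothetical sample data; g1 is deliberately missing from the second file.
tmpdir=$(mktemp -d)
printf 'g1\t1\t2\ng2\t3\t4\n' > "$tmpdir/f1.txt"
printf 'g2\t7\t8\n'           > "$tmpdir/f2.txt"

merged=$(awk -F'\t' -v OFS='\t' '
FNR == 1 { FNUM++ }                                   # a new input file starts
!($1 in SEEN) { SEEN[$1] = 1; ORDER[++NKEYS] = $1 }   # remember first-seen key order
{ A[$1, FNUM] = $2 OFS $3 }                           # store columns 2 and 3 per key and file
END {
    for (I = 1; I <= NKEYS; I++) {
        K = ORDER[I]
        LINE = K
        # Pad with NA (an assumed placeholder) when this file lacks the key.
        for (N = 1; N <= FNUM; N++)
            LINE = LINE OFS (((K, N) in A) ? A[K, N] : "NA" OFS "NA")
        print LINE
    }
}' "$tmpdir/f1.txt" "$tmpdir/f2.txt")
printf '%s\n' "$merged"
```

Unlike join, this does not need sorted input, but it holds every row of every file in memory.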
| The Following User Says Thank You to Corona688 For This Useful Post: | ||
evelibertine (07-05-2012) | ||
#6
Extend this to the number of files you have:
Code:
join GSM1.txt GSM2.txt > tmp.tmp
for f in GSM3.txt GSM4.txt GSM5.txt
do
    join tmp.tmp "$f" > tmpf
    mv tmpf tmp.tmp
done
mv tmp.tmp GSM_ALL.txt
cat GSM_ALL.txt
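
The same loop can be wrapped in a function so it works for any number of files. This is only a sketch under some assumptions: `join_all` is a made-up helper name, the inputs are tab-delimited, and each file is sorted on column 1 first because join expects sorted input. Note that join's default behaviour keeps only keys present in every file.

```shell
# Sketch: join any number of tab-delimited files on column 1, two at a time.
# usage: join_all OUTPUT FILE1 FILE2 [FILE3 ...]
join_all() {
    out=$1; shift
    tab=$(printf '\t')
    acc=$(mktemp) || return 1     # running result of the joins so far
    next=$(mktemp) || return 1    # output of the current join
    tmp=$(mktemp) || return 1     # sorted copy of the current input file

    # Seed the accumulator with the first file, sorted on column 1.
    LC_ALL=C sort -t "$tab" -k1,1 "$1" > "$acc"; shift

    for f in "$@"; do
        LC_ALL=C sort -t "$tab" -k1,1 "$f" > "$tmp"
        LC_ALL=C join -t "$tab" "$acc" "$tmp" > "$next"
        mv "$next" "$acc"
    done

    mv "$acc" "$out"
    rm -f "$tmp" "$next"
}
```

For the 20 files in the question, a call could look like `join_all GSM_ALL.txt $(printf 'GSM%d.txt ' $(seq 1 20))`, assuming `seq` is available on your system.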
| The Following User Says Thank You to jim mcnamara For This Useful Post: | ||
evelibertine (07-05-2012) | ||