04-19-2009
Joining files based on multiple keys
I need a script (Perl, awk, or anything else is fine) to join three files on three key columns. The number of non-key columns can vary from file to file. The columns are delimited by semicolons.
For example,
File1
Dim1;Dim2;Dim3;Fact1;Fact2;Fact3;Fact4;Fact5
---- data delimited by semicolon ---
File2
Dim1;Dim2;Dim3;Fact6;Fact7;Fact8
---- data delimited by semicolon ---
File3
Dim1;Dim2;Dim3;Fact9;Fact10
---- data delimited by semicolon ---
So I have to join on the keys (dimensions 1, 2 and 3) and write the result to a single file:
Output File
Dim1;Dim2;Dim3;Fact1;Fact2;Fact3;Fact4;Fact5;Fact6;Fact7;Fact8;Fact9;Fact10
---- data delimited by semicolon ---
It needs to be a full outer join: if a key is not found in one of the files, that file's fact values need to be filled with zeroes. Each file has a header line as well as data.
Thanks for your help.
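One possible approach, sketched here in Python rather than Perl or awk: read each file in turn, key every record on its first three columns, and pad with zeroes wherever a key was absent from a file. The function name `full_outer_join` and its parameters are made up for this illustration. The sketch assumes each key appears at most once per file (if it repeats, the last record wins) and that output rows can stay in first-seen order.

```python
import csv
import io

def full_outer_join(sources, key_cols=3, fill="0"):
    """Full outer join of semicolon-delimited tables on their leading key columns.

    `sources` is an iterable of open text files (or any iterable of lines),
    each starting with a header line. Returns a list of rows, header first.
    """
    header = None
    rows = {}   # key tuple -> fact values collected so far
    width = 0   # number of fact columns contributed by earlier sources
    for src in sources:
        reader = csv.reader(src, delimiter=";")
        head = next(reader, None)
        if head is None:          # completely empty input: nothing to add
            continue
        if header is None:
            header = head[:key_cols]
        facts = head[key_cols:]
        header.extend(facts)
        for rec in reader:
            if not rec:           # skip blank lines
                continue
            key = tuple(rec[:key_cols])
            # Pad or trim the record to this file's fact-column count.
            vals = (rec[key_cols:] + [fill] * len(facts))[:len(facts)]
            row = rows.setdefault(key, [])
            del row[width:]       # duplicate key in the same file: last wins
            # Zero-fill the columns of earlier files that lacked this key.
            row.extend([fill] * (width - len(row)))
            row.extend(vals)
        width += len(facts)
    if header is None:
        return []
    out = [header]
    for key, row in rows.items():
        # Zero-fill the columns of later files that lacked this key.
        out.append(list(key) + row + [fill] * (width - len(row)))
    return out

def to_text(table):
    """Render joined rows back into semicolon-delimited text."""
    buf = io.StringIO()
    csv.writer(buf, delimiter=";", lineterminator="\n").writerows(table)
    return buf.getvalue()
```

To run it on real files, open each one with `open(path, newline="")`, pass the open handles as `sources`, and write `to_text(...)` of the result to the output file. Everything is held in memory, so this suits files that fit comfortably in RAM.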