Post 302308619 by Sebben on Sunday 19th of April 2009 06:45:05 PM (UNIX for Dummies Questions & Answers)
Joining files based on multiple keys

I need a script (Perl, awk, or anything else is fine) to join three files on three key columns. The number of non-key columns can vary from file to file. The columns are delimited by semicolons.

For example,

File1

Dim1;Dim2;Dim3;Fact1;Fact2;Fact3;Fact4;Fact5
---- data delimited by semicolon ---

File2

Dim1;Dim2;Dim3;Fact6;Fact7;Fact8
---- data delimited by semicolon ---

File3

Dim1;Dim2;Dim3;Fact9;Fact10
---- data delimited by semicolon ---

So I have to join on the keys (dimensions 1, 2 and 3) and write the combined data to one file.

Output File

Dim1;Dim2;Dim3;Fact1;Fact2;Fact3;Fact4;Fact5;Fact6;Fact7;Fact8;Fact9;Fact10
---- data delimited by semicolon ---

It needs to be a full outer join: if a key has no match in one of the files, that file's fact values should be filled with zeroes. Each file has a header row as well as data.

Thanks for your help.
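
One possible approach, sketched in Python rather than Perl or awk. This is only an illustration of the full outer join described above, not a tested production script. The function names `full_outer_join` and `join_files` are made up for this example. It assumes the key columns are always the first three columns of every file, that each file has a header row, that the whole data set fits in memory, and that a key appears at most once per file (if it repeats, the last row wins). Output rows come out in order of first appearance of each key.

```python
import csv

KEY_COUNT = 3  # assumption: Dim1, Dim2, Dim3 are the first three columns


def full_outer_join(sources, key_count=KEY_COUNT, delimiter=";", fill="0"):
    """Full outer join of delimited tables on their first key_count columns.

    sources: open text files (or any iterables of lines), each starting
    with a header row. Returns (header, rows) as lists of strings.
    """
    header = None       # combined output header
    widths = []         # number of fact columns contributed by each source
    facts_by_key = {}   # key tuple -> {source index: list of fact values}

    for index, source in enumerate(sources):
        reader = csv.reader(source, delimiter=delimiter)
        file_header = next(reader)
        if header is None:
            header = file_header[:key_count]      # key names from first file
        header.extend(file_header[key_count:])    # append this file's facts
        width = len(file_header) - key_count
        widths.append(width)
        for row in reader:
            if not row:
                continue                          # skip blank lines
            key = tuple(row[:key_count])
            facts = row[key_count:key_count + width]
            facts += [fill] * (width - len(facts))  # pad short rows
            facts_by_key.setdefault(key, {})[index] = facts

    rows = []
    for key, per_source in facts_by_key.items():
        row = list(key)
        for index, width in enumerate(widths):
            # a source with no match for this key contributes fill values
            row.extend(per_source.get(index, [fill] * width))
        rows.append(row)
    return header, rows


def join_files(in_paths, out_path, delimiter=";"):
    """Read the input files, join them, and write one output file."""
    handles = [open(path, newline="") for path in in_paths]
    try:
        header, rows = full_outer_join(handles, delimiter=delimiter)
    finally:
        for handle in handles:
            handle.close()
    with open(out_path, "w", newline="") as out:
        writer = csv.writer(out, delimiter=delimiter, lineterminator="\n")
        writer.writerow(header)
        writer.writerows(rows)
```

With the three files from the post, a call such as `join_files(["File1", "File2", "File3"], "Output")` would be the intended usage (the file names here are placeholders). Keeping a per-file width list is what lets the zero fill have the right number of columns when a key is missing from a file with a different number of facts.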