Merge files by col value


 
# 1  
Old 07-02-2013

Hi,

Please help. This is fairly complex and I don't know where to start.
The real input files are about 10 MB each.

I have multiple files and I want to merge them in the following way.
Every file has 4 columns. Col1 and col2 are fixed with respect to each other: in the example, values A and B in col2 always come with value 1 in col1, and C and D in col2 always come with 2 in col1.
The output columns must be ordered so that the 3rd columns of all files stay together, followed by the 4th columns of all files. A header must be added (as in the example), built from each file name with the column number appended. A col2 value does not repeat within any one file. Where a file has no row for a given col1/col2 pair, its columns are filled with 0 (as in the example).
The original input files have no header.

Code:
 
cat File1
 
1 A 2 4
1 B 1 2
 
cat File2
 
1 B 1 4
2 C 2 4
 
cat File3
 
2 C 5 6
2 D 4 5
 
Expected output
 
col1 col2 File1col3 File2col3 File3col3 File1col4 File2col4 File3col4
1 A 2 0 0 4 0 0
1 B 1 1 0 2 4 0
2 C 0 2 5 0 4 6
2 D 0 0 4 0 0 5

# 2  
Old 07-02-2013
What have you tried so far? If you post your attempt, we can point you in the right direction.
# 3  
Old 07-02-2013
Code:
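awk '{ for (i = 3; i <= NF; i++) A[$1, $2, FILENAME, i] = $i   # value per key, file, column
       B[$1, $2]++; C[FILENAME]++ }                            # count distinct keys and files
    END { for (ind in B) {                                     # for-in order is unspecified
        split(ind, P, SUBSEP); S = P[1] " " P[2]
        for (i = 1; i <= 2; i++) for (file in C) {             # col3 of each file, then col4
            t = A[P[1], P[2], file, i + 2] ? A[P[1], P[2], file, i + 2] : "0"
            S = S " " t }
        print S } }' File1 File2 File3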

1 B 1 1 0 2 4 0
2 C 0 2 5 0 4 6
2 D 0 0 4 0 0 5
1 A 2 0 0 4 0 0
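
The command above prints no header, and because awk's `for (x in array)` loop iterates in an unspecified order, neither the rows nor the per-file columns are guaranteed to come out in the order asked for in post #1. Below is a minimal sketch of one way to handle both, not a definitive solution. It assumes exactly 4 whitespace-separated columns per file and non-empty input files. The names `files`, `keys`, `seen`, `val`, `workdir` and `merged` are illustrative only. Rows are printed in the order each col1/col2 pair is first seen, which matches the expected output only when the inputs are already in key order, as in the sample.

```shell
#!/bin/sh
# Recreate the sample input from post #1 in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '1 A 2 4\n1 B 1 2\n' > File1
printf '1 B 1 4\n2 C 2 4\n' > File2
printf '2 C 5 6\n2 D 4 5\n' > File3

merged=$(awk '
FNR == 1 { files[++nfiles] = FILENAME }   # remember files in command-line order
{
    key = $1 " " $2
    # Remember each col1/col2 pair once, in first-seen order.
    if (!(key in seen)) { seen[key] = 1; keys[++nkeys] = key }
    for (c = 3; c <= 4; c++) val[key, FILENAME, c] = $c
}
END {
    # Header: col1 col2, then <file>col3 for every file, then <file>col4.
    line = "col1 col2"
    for (c = 3; c <= 4; c++)
        for (f = 1; f <= nfiles; f++) line = line " " files[f] "col" c
    print line

    # One row per key; a file without that key contributes 0.
    for (k = 1; k <= nkeys; k++) {
        line = keys[k]
        for (c = 3; c <= 4; c++)
            for (f = 1; f <= nfiles; f++) {
                idx = keys[k] SUBSEP files[f] SUBSEP c
                line = line " " ((idx in val) ? val[idx] : 0)
            }
        print line
    }
}' File1 File2 File3)

printf '%s\n' "$merged"
```

Testing with `idx in val` rather than the truthiness of the stored value keeps a genuine 0 in the data distinct from a missing row, although both print as 0 here. If the real files are not in key order, the data rows (everything after the header) could be passed through `sort -k1,1n -k2,2` afterwards.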
