Merge multiple tables into big matrix
# 1  
Old 06-11-2012

Hi all,

I have a problem at hand that is complex (beyond my biological expertise).
I need to merge multiple files into one big matrix. Please help me with some code.


Inp1
Code:
Ang_0    chr1    98    T    A    
Ang_0    chr1    352    G    A    
Ang_0    chr1    425    C    T    
Ang_0    chr2    471    T    G    
Ang_0    chr2    508    T    -


Inp2
Code:
Bng_0    chr1    98    T    G    
Bng_0    chr1    352    G    A        
Bng_0    chr2    471    T    A    
Bng_0    chr2    508    T    -

Inp3
Code:
Cng_0    chr1    198    T    A    
Cng_0    chr1    352    G    A    
Cng_0    chr1    425    C    T    
Cng_0    chr2    471    T    G


Outp
Code:
                Ang_0    Bng_0    Cng_0
chr1    98      A        G        T
chr1    198     T        T        A
chr1    352     A        A        A
chr1    425     T        C        T
chr2    471     G        A        G
chr2    508     -        -        T



Input files have 5 columns: 1 = organism name, 2 = chromosome number, 3 = chromosome position, 4 = reference, 5 = alternate.

Records have to be matched across all the input files on columns 2 and 3. For each column 2 and column 3 pair, every file that has a
matching record contributes its column 5 value to the output. If an input file has no record for that pair, the column 4 (reference)
value from any input file that does have the record has to be printed in its place. The column names in the output file
will be the organism names (column 1 of the input files).

I have 123 files with ~30,000 rows in each file.
So the output will have 125 columns: columns 1 and 2 are chromosome and position, and columns 3 through 125 are the organism names.

Please let me know if I am not clear in my requirements. Thanks a lot in advance.
# 2  
Old 06-11-2012
Give this a go:

Code:
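awk '
# Skip blank or incomplete lines (this guard is an addition to the original).
NF < 4 { next }
# First time an organism name is seen, give it the next output column.
!($1 in colnum) { Title[++col] = $1; colnum[$1] = col }
{
    i = colnum[$1]
    def[$2,$3] = $4      # reference base: fallback for files lacking this position
    Val[i,$2,$3] = $5    # alternate base reported by this organism
}
END {
    # Header row: two empty key columns, then one column per organism.
    $0 = ""
    for (i = 1; i <= col; i++) $(i+2) = Title[i]
    print
    # One row per chromosome/position pair (awk visits keys in no particular order).
    for (v in def) {
        $0 = ""
        split(v, f, SUBSEP)
        $1 = f[1]
        $2 = f[2]
        for (i = 1; i <= col; i++)
            $(i+2) = ((i,f[1],f[2]) in Val) ? Val[i,f[1],f[2]] : def[f[1],f[2]]
        print
    }
}' OFS='\t' inp* > result_file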
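
One caveat with the script above: `for (v in def)` visits the keys in no guaranteed order, so the rows of result_file will not necessarily come out sorted by chromosome and position. Below is a minimal sketch of a post-processing step, assuming the tab-separated layout the script writes (one header line, then chromosome in column 1 and position in column 2). The function name `sort_matrix` is made up for this example, and the chromosome column is sorted as plain text, so chr10 would land before chr2.

```shell
# Hypothetical helper, not part of the original answer.
# Prints the header line unchanged, then the remaining rows sorted by
# column 1 (chromosome, as text) and column 2 (position, numerically).
sort_matrix() {
    tab=$(printf '\t')
    head -n 1 "$1"
    tail -n +2 "$1" | LC_ALL=C sort -t "$tab" -k1,1 -k2,2n
}
```

It could be used as `sort_matrix result_file > result_sorted`, where result_sorted is just a placeholder name. If your sort supports version ordering (GNU sort has `-V`), that may give a more natural chromosome order, but I have not relied on it here.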

This User Gave Thanks to Chubler_XL For This Post: