Transpose matrix, and rearrange columns common with another file


 
Thread Tools Search this Thread
Top Forums UNIX for Dummies Questions & Answers Transpose matrix, and rearrange columns common with another file
# 1  
Old 02-27-2015
Transpose matrix, and rearrange columns common with another file

This is my first post, I apologize if I have broken rules.
Some assistance with the following will be very helpful.


I have a couple of files, both should ultimately have common columns only, arranged in the same order.

This file needs to be transposed, to bring the rows to columns

Code:
Series C1 C2 C3 C4 C5
S2 NULL 0 NA 1.2 00.34
S1 1 2 3 4 5
S3   45 45.6 6 6.7
S6 1 4 - 3.4 446
S4 5   5 6.8 5.6

This is another file, whose column ordering determines column ordering of first file.


Code:
   S1 S11 S2 S3 S4 S5 
M1 FOX 11 CAT FOO BAR1 FB 
M2 APP 12 OR TOM -1 AIR 
M3 - 13 - BASE 231 BETA




In my outputs, the first file is transposed and columns are outputted according to second file, only common columns. Also a pipe is added to rownames.

Code:
Series S1 S2 S3 S4 
C1| 1 NULL  5
C2| 2 0 45 
C3| 3 NA 45.6 5
C4| 4 1.2 6 6.8

Second file is outputted with common columns only.
Code:
   S1 S2 S3 S4  
M1 FOX CAT FOO BAR1 
M2 APP OR TOM -1 
M3 - - BASE 231

Files are tabbed, about 800x800 values having blank cells sometimes.


Original files are tab delimited, I have examples of space delimited here.
I am using cygwin on windows 7 machine.

My attempt below.

First sort file by 1st col

Code:
sort -t "<TAB>" -k1,1 file1 -o file1

then transpose first file


Code:
awk -F"\t" '
{ 
    for (i=1; i<=NF; i++)  {
        a[NR,i] = $i
    }
}
NF>p { p = NF }
END {    
    for(j=1; j<=p; j++) {
        str=a[1,j]
        for(i=2; i<=NR; i++){
            str=str"\t"a[i,j];
        }
        print str
    }
}' file1 > file1_trans


Not sure how to proceed after this step.
# 2  
Old 02-27-2015
How come the C5 line is missing in your output sample? Try
Code:
awk     'NR==1          {for (i=1; i<=NF; i++) EXCL[$i]}
         FNR==NR        {HOLD[NR]=$0; MXROW=NR; next}

         !($1 in EXCL) &&
         !(FNR==1)      {next}

                        {for (i=1; i<=NF; i++) T[FNR,i]=$i
                         if (NF>MXCOL) MXCOL=NF
                         PRT[FNR]++
                         delete EXCL[$1]
                        }

         END            {for (i=1; i<=MXCOL; i++) {for (j=1; j<=NR; j++) if (PRT[j]) printf "%s\t", T[j,i]; print _}

                         n=split(HOLD[1], TMP); for (j=1; j<=n; j++) if (TMP[j] in EXCL) DEL[j]++
                         for (i=1; i<=MXROW; i++)
                                {n=split(HOLD[i], TMP)
                                 printf "%s\t", TMP[1]
                                 for (j=2; j<=n; j++) if (!(DEL[j])) printf "%s\t", TMP[j]
                                 print _
                                }

                        }
        ' FS="\t" file2 file1
Series  S2      S1      S3      S4
C1      NULL    1       45      5
C2      0       2       45.6    5
C3      NA      3       6       6.8
C4      1.2     4       6.7     5.6
C5      00.34   5
        S1      S2      S3      S4
M1      FOX     CAT     FOO     BAR1
M2      APP     OR      TOM     -1
M3      -       -       BASE    231

---------- Post updated at 20:35 ---------- Previous update was at 20:27 ----------

For the pipe after the row names:
Code:
awk     'NR==1          {for (i=1; i<=NF; i++) EXCL[$i]}
         FNR==NR        {HOLD[NR]=$0; MXROW=NR; next}

         !($1 in EXCL) &&
         !(FNR==1)      {next}

                        {for (i=1; i<=NF; i++) T[FNR,i]=$i
                         if (NF>MXCOL) MXCOL=NF
                         PRT[FNR]++
                         delete EXCL[$1]
                        }

         END            {for (i=1; i<=MXCOL; i++)       {printf "%s%s\t", T[1,i], SEP; SEP="|"
                                                         for (j=2; j<=FNR; j++) if (PRT[j]) printf "%s\t", T[j,i]
                                                         print _
                                                        }

                         n=split(HOLD[1], TMP); for (j=1; j<=n; j++) if (TMP[j] in EXCL) DEL[j]++
                         for (i=1; i<=MXROW; i++)
                                {n=split(HOLD[i], TMP)
                                 printf "%s\t", TMP[1]
                                 for (j=2; j<=n; j++) if (!(DEL[j])) printf "%s\t", TMP[j]
                                 print _
                                }

                        }
        ' FS="\t" file2 file1
Series  S2      S1      S3      S4
C1|     NULL    1       45      5
C2|     0       2       45.6    5
C3|     NA      3       6       6.8
C4|     1.2     4       6.7     5.6
C5|     00.34   5
        S1      S2      S3      S4
M1      FOX     CAT     FOO     BAR1
M2      APP     OR      TOM     -1
M3      -       -       BASE    231

This User Gave Thanks to RudiC For This Post:
# 3  
Old 02-27-2015
Thank you RudiC. Please help with one more thing that I had mentioned, the column ordering must be the same in both files. In this case , we can use sorted ordering, S1, S2, S3 , S4 for both files. Sorry for the C5 missing, it should be there.
 
Login or Register to Ask a Question

Previous Thread | Next Thread

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Transpose matrix based on second column using awk

Hi, Is it possible to transpose the matrix like this using awk ? Many thanks in advance Input abc Name_1 0 abc Name_2 1 abc Name_3 2 abc Name_4 0.4 def Name_1 0 def Name_2 9 def Name_3 78 def Name_4 1 Output abc def Name_1 0 ... (4 Replies)
Discussion started by: quincyjones
4 Replies

2. Shell Programming and Scripting

To transpose columns + edit in a txt file

Hi, I have a txt file that looks like log2FoldChange Ontology_term 8.50624450251828 GO:0003700,GO:0003707,GO:0005634,GO:0006355,GO:0043401,GO:0003700,GO:0005634,GO:0006355,GO:0008270,GO:0043565 7.03936870356684 GO:0005515,GO:0008080 6.49606183738682 6.49525073909629 GO:0005515... (4 Replies)
Discussion started by: alisrpp
4 Replies

3. Shell Programming and Scripting

how to rearrange a matrix with awk

Hi, every one. I have two files ,one is in matrix like this, one is a list with the same data as the matrix. AB AE AC AD AA AF SA 3 4 5 6 4 6 SC 5 7 2 8 4 3 SD 4 6 5 3 8 3 SE 45 ... (5 Replies)
Discussion started by: xshang
5 Replies

4. Shell Programming and Scripting

file merge based on common columns

I have two files 1.txt 34, ABC, 7, 8, 0.9 35, CDE, 6.5, -2, 0.01 2.txt 34, ABC, 9, 6, -1.9 35, CDE, 8.5, -2.3, 5.01 So in both files common columns are 1 and 2 so final o/p should look like 34, ABC, 7, 8, 0.9, 9, 6, -1.9 35, CDE, 6.5, -2, 0.01, 8.5, -2.3, 5.01 I tried using... (3 Replies)
Discussion started by: manas_ranjan
3 Replies

5. Shell Programming and Scripting

Transpose whole file and specific columns

Hi, I have a file like this a b c d e f g h i j k l Case1: I want to transpose the whole file Output1 a d g j b e h k c f i l Case2 Transpose a specific column - Say 3rd (6 Replies)
Discussion started by: jacobs.smith
6 Replies

6. Shell Programming and Scripting

Matrix transpose

Hello to everyone! I'm pretty tired and I cannot concentrate properly, but I need some help!!! I have a matrix like the one in the attachment (matrice_prova) and I would like an output like this: L1 L1 0.0 L1 L2 2.1 L1 L3 3.1 L1 L4 3.2 .... How is it possible do that in Perl or Awk?? Any... (3 Replies)
Discussion started by: gabrysfe
3 Replies

7. Shell Programming and Scripting

Transpose few columns alone in a Text file

Hi Friends, I am Stuck up with a problem on transposing Rows to Coloumns.. Though there are many threads on this my problem is little difficult.. I have a tab separated file like Below, computer selling_loc currency_type manufacturer_name salesweek-wk1 sales-wk2 ...wk-3 ..wk4 till... (7 Replies)
Discussion started by: heinz_holan
7 Replies

8. Shell Programming and Scripting

Extracting columns from a matrix and storing each column in a separate file

Hi All, I have a huge matrix file consisting some some millions rows and 6000 columns. The contents are just floating point numbers in the matrix. I want to extract each column (i.e. 6000 of them) and store each column in a separate file. For example, 1.dat will consist of elements from column... (4 Replies)
Discussion started by: shoaibjameel123
4 Replies

9. Shell Programming and Scripting

Rows to Columns - File Transpose

Hi I have an input file and I want to transpose it but I need to take care that if any field is missing for a record it should be popoulated with space for that field - using a shell script INFILE ---------- emp=1 sal=2 loc=abc emp=2 sal=21 sal=22 loc=xyz emp=5 loc=abc OUTFILE... (10 Replies)
Discussion started by: 46019
10 Replies

10. Shell Programming and Scripting

Need help with a script to rearrange columns

I have a file that is semi-colon delimited and the column headers are always the same but the column number is totally random each time this file is generated. I don't have the skills to make a script for this so maybe someone can help. I would like to be able to take this file which has over... (11 Replies)
Discussion started by: n3al10
11 Replies
Login or Register to Ask a Question