Creating a matrix out of a longitudinal data set


 
# 1  
Old 02-04-2015

Hi,
I have a tab-delimited file with 2 columns, grouped by the first column. The file contains thousands of values.

Below is an example of the input file:

Code:
1 AB
1 AC
1 CC
1 DD
2 AB
2 CC
2 AC
2 AB
3 CF
3 CC
3 DD
4 AC
4 CC
4 AD

I would like to create a matrix of 1's and 0's in which each distinct number in column 1 becomes a row and each distinct item in column 2 becomes a column. A 1 means the item is present for that number and a 0 means it is not.

This is the desired output:

Code:
   AB  AC  CC  DD  CF  AD
1  1   1   1   1   0   0
2  1   1   1   0   0   0
3  0   0   1   1   1   0
4  0   1   1   0   0   1

It would be great if you could let me know the best way to solve this either by awk or sed.
# 2  
Old 02-04-2015
Code:
awk '   { LN[$1]; HD[$2]; MX[$1,$2] = 1 }       # row labels, column labels, cells seen
    END { printf "%10s", ""                     # empty corner cell
          for (i in HD) printf "%10s", i        # header row
          print ""
          for (j in LN) { printf "%10s", j      # row label
                          for (i in HD) printf "%10s", MX[j,i] + 0   # unset cells print as 0
                          print "" }
        }
' file

Output:
                  AB        AC        AD        CC        CF        DD
         1         1         1         0         1         0         1
         2         1         1         0         1         0         0
         3         0         0         0         1         1         1
         4         0         1         1         1         0         0

This User Gave Thanks to RudiC For This Post:
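
A note on column order: awk does not specify the order in which `for (i in HD)` visits array elements, so the columns and rows may come out in a different order than the one requested (AB AC CC DD CF AD, which is order of first appearance). A minimal sketch that records each label the first time it is seen follows. The array names `rows`, `cols`, `rowseen`, `colseen` and the files `sample.txt` and `matrix.txt` are made up for illustration and are not part of the script above.

```shell
# Hypothetical sample file reproducing the example input from post #1 (tab-delimited).
printf '%s\t%s\n' 1 AB 1 AC 1 CC 1 DD 2 AB 2 CC 2 AC 2 AB \
                  3 CF 3 CC 3 DD 4 AC 4 CC 4 AD > sample.txt

awk '
    # Store each row and column label in a numbered list the first time it appears.
    !($1 in rowseen) { rowseen[$1]; rows[++nrow] = $1 }
    !($2 in colseen) { colseen[$2]; cols[++ncol] = $2 }
    { MX[$1,$2] = 1 }                            # mark the cell as present
    END {
        printf "%10s", ""                        # empty corner cell
        for (c = 1; c <= ncol; c++) printf "%10s", cols[c]
        print ""
        for (r = 1; r <= nrow; r++) {
            printf "%10s", rows[r]
            for (c = 1; c <= ncol; c++) printf "%10s", MX[rows[r],cols[c]] + 0
            print ""
        }
    }
' sample.txt | tee matrix.txt
```

Looping over the numbered lists instead of `for (i in HD)` makes the order depend only on the input, not on the awk implementation.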
# 3  
Old 02-05-2015
Thanks. It worked. But the header of the matrix is not space or tab delimited. Also, the actual file contains more than 2 letters in the second column, whereas the example file I provided contained only 2 letters.
# 4  
Old 02-05-2015
So ... ?
# 5  
Old 02-05-2015
So I want the header to be space delimited. I tried adding \t and \s to the print "" in the above code, but it gives me an error message. It would be great if I could get some help with this.
# 6  
Old 02-05-2015
man awk

man printf
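
For later readers, what the printf pointer amounts to: the column separators come from the `printf` format strings, not from `print ""`, which only ends the line. A `%10s` field is padded to 10 characters but never truncated, so any label of 10 characters or more runs straight into its neighbour. One possible fix, sketched here under assumptions, is to put a tab in front of every cell instead of using a fixed width. The input file `long_labels.txt`, its labels, and the output name `matrix.tsv` are hypothetical. The `for (i in HD)` loops are kept from the earlier script, so the column order is still whatever awk's array traversal yields.

```shell
# Hypothetical input with labels longer than two letters (tab-delimited).
printf '%s\t%s\n' 1 ALPHA_LONG_LABEL 1 AC 2 AC 2 XYZ > long_labels.txt

awk -F '\t' '
        { LN[$1]; HD[$2]; MX[$1,$2] = 1 }
    END {
        # A tab precedes every column label, so the corner cell stays empty.
        for (i in HD) printf "\t%s", i
        print ""                                 # only terminates the header line
        for (j in LN) {
            printf "%s", j                       # row label, no padding
            for (i in HD) printf "\t%d", MX[j,i] + 0
            print ""
        }
    }
' long_labels.txt | tee matrix.tsv
```

Every line of the result has the same number of tab-separated fields regardless of label length, so it can be read back with `awk -F '\t'` or imported into a spreadsheet.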