Reformatting data in matrix form Post: 302609791

Shell Programming and Scripting, post 302609791 by newbie83 on Tuesday 20th of March 2012 12:20:49 PM

Hi,

I would appreciate some help with the following problem.
I want to reformat my dataset as described below for subsequent analysis.

The first column holds names, which repeat for each value of the second column. The second column specifies the position, the third column is the 1st value and the fourth column is the 2nd value. I want the names as column headers, one row per position, and each cell filled with the fourth-column value for that name and position. If a record is missing, the cell should take the third-column value of any record for that position.

For example, in the input dataset the names are A through D. C does not occur for pos2, and A, C and D do not occur for pos3. So the value of C for pos2 is taken from the third column of any record for pos2, which is 9 (the third column is constant for a particular pos). For pos3, only B has its own value, 9, while A, C and D get 7 (the third column for pos3).

For the record, I have 80 names and 15677899 records in my actual dataset.

Input

Code:

A pos1 1 2
B pos1 1 3
C pos1 1 4
D pos1 1 5
A pos2 9 6
B pos2 9 7
D pos2 9 8
B pos3 7 9

Expected output

Code:

       A B C D
pos1   2 3 4 5
pos2   6 7 9 8
pos3   7 9 7 7
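
One possible way to do this reshaping, sketched in Python rather than as a tested solution. It assumes whitespace-separated fields and that all records for a position sit on consecutive lines, as they do in the sample, so only one position is held in memory at a time. The function names `read_records` and `pivot` are made up for this illustration.

```python
from itertools import groupby


def read_records(lines):
    """Yield (name, pos, third, fourth) tuples, skipping malformed lines."""
    for line in lines:
        fields = line.split()
        if len(fields) == 4:
            yield tuple(fields)


def pivot(lines, names):
    """Yield one output row per position: [pos, value for each name].

    Assumption: records for the same position are contiguous in the input.
    """
    for pos, group in groupby(read_records(lines), key=lambda rec: rec[1]):
        row = {}
        default = None
        for name, _, third, fourth in group:
            default = third      # third column is constant for a position
            row[name] = fourth   # fourth column is the cell value
        # Names with no record for this position fall back to the default.
        yield [pos] + [row.get(name, default) for name in names]


sample = """\
A pos1 1 2
B pos1 1 3
C pos1 1 4
D pos1 1 5
A pos2 9 6
B pos2 9 7
D pos2 9 8
B pos3 7 9
""".splitlines()

# First pass: collect the distinct names to use as column headers.
names = sorted({rec[0] for rec in read_records(sample)})

# Second pass: emit the header line, then one line per position.
print("\t".join([""] + names))
for row in pivot(sample, names):
    print("\t".join(row))
```

On the sample input this reproduces the rows of the expected output above. For the full dataset, the same two passes could read from an opened file instead of the `sample` list. If the records for a position are not contiguous, the grouping assumption breaks and the input would need to be sorted by the second column first.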

Last edited by newbie83; 03-20-2012 at 01:33 PM..
