Converting to matrix-like file using AWK


 
# 1  
Old 10-11-2012

Hi,
I need to convert a file into a matrix-like layout for statistics.

Here is a sample file.
Input :
HTML Code:
1|A|17,94
1|B|22,59
1|C|56,93
2|A|63,71
2|C|23,92
5|B|19,49
5|C|67,58
I am expecting something like this:
Output :
HTML Code:
1|A|17,94|B|22,59|C|56,93
2|A|63,71|B|0|C|23,92
5|A|0|B|19,49|C|67,58
I couldn't figure out how to do it.
If there's a way to do this using awk, ksh, bash or perl, any help will be appreciated.

Thanks

Last edited by fastlane3000; 10-11-2012 at 07:28 PM.. Reason: fix example's format
# 2  
Old 10-11-2012
If your input file is sorted such that the records with the same first field are always together, then this should work:

Code:
awk -F \| '
    $1 != last {
        if( hold )
            print hold;
        hold = $1;
    }
    {
        hold = hold "|" $2 "|" $3;
        last = $1;
    }
    END { print hold; }
' input-file >output-file

# 3  
Old 10-11-2012
Thanks agama,
I can sort the file if it isn't already sorted (using the 'sort' command).
I tried your solution, but I got this:
Output :
HTML Code:
1|A|17,94|B|22,59|C|56,93
2|A|63,71|C|23,92
5|B|19,49|C|67,58
It only misses the zeros when one of the column values (A, B, C) is missing.
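The sort step mentioned in this post could look roughly like the sketch below. It assumes a POSIX `sort` and uses a shuffled copy of the sample records fed through a here-document; the variable name `sorted` is only illustrative.

```shell
# -t '|' sets the field delimiter.
# -k1,1n sorts on field 1 numerically so records with the same key are adjacent.
# -k2,2 breaks ties on field 2 so A, B, C stay in order within each group.
sorted=$(sort -t '|' -k1,1n -k2,2 <<'EOF'
5|C|67,58
1|B|22,59
2|A|63,71
1|A|17,94
5|B|19,49
2|C|23,92
1|C|56,93
EOF
)
printf '%s\n' "$sorted"
```

The sorted stream can then be piped straight into any of the awk programs in this thread instead of being written to an intermediate file.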
# 4  
Old 10-11-2012
This will fill in the missing zeros. This doesn't require the input to be sorted, but it does require all entries in the input with the same value for the first field to be adjacent:
Code:
awk -F "|" 'BEGIN{reset()}
function doprint(){
        printf("%s|A|%s|B|%s|C|%s\n", last, val["A"], val["B"], val["C"])
        reset()
}
function reset(){
        val["A"] = val["B"] = val["C"] = 0
}
{       if($1 != last) if(last != "") doprint()
        last = $1
        val[$2] = $3
}
END{    if(last != "") doprint()
}' in
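To illustrate the adjacency requirement, here is a small check that feeds the same program records that are grouped by the first field but not sorted. The input records and the variable name `out` are made up for the example.

```shell
# Records with the same first field are adjacent, but neither the groups
# nor the letters inside a group are in sorted order.
out=$(awk -F "|" 'BEGIN{reset()}
function doprint(){
        printf("%s|A|%s|B|%s|C|%s\n", last, val["A"], val["B"], val["C"])
        reset()
}
function reset(){
        val["A"] = val["B"] = val["C"] = 0
}
{       if($1 != last) if(last != "") doprint()
        last = $1
        val[$2] = $3
}
END{    if(last != "") doprint()
}' <<'EOF'
2|C|23,92
2|A|63,71
1|B|22,59
EOF
)
printf '%s\n' "$out"
```

Rows come out in the order the groups appear in the input, while the columns are always emitted as A, B, C because the `printf` format fixes that order.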

# 5  
Old 10-11-2012
My fault -- failed to see the zeros.

Have a go with this:

Code:
awk -F \|  '
    BEGIN { lastc = "C";  }      # change if there are more than three rows/type
    function print_row(     c, cv )
    {
        cv = 65;
        c = "A";
        printf( "%s|", last );
        while( c <= lastc )
        {
            if( last " " c in data )
                printf( "|%s|%s", c, data[last " " c] );
            else
                printf( "|%s|0", c );
            cv++;
            c = sprintf( "%c", cv );
        }
        printf( "\n" );
    }
    {
        if( $1 != last )
        {
            print_row();
            delete( data );
        }

        data[$1" "$2] = $3;
        last = $1
    }
    END { print_row(); }
' input >output

It assumes that the second field is a single character in the range of A-Z.
# 6  
Old 10-12-2012
I really appreciate your help; this forum is really the best.

Sorry agama, this one doesn't work either.
Output :
HTML Code:
||A|0|B|0|C|0
1||A|17,94|B|22,59|C|56,93
2||A|63,71|B|0|C|23,92
5||A|0|B|19,49|C|67,58
Don Cragun's solution worked like a charm.

I just wondered if we can make it work when the 2nd field has more than 3 values, i.e. values other than A, B or C.
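For reference, the two symptoms in the output above seem to come from two spots in the script in post #5: `print_row()` is called for the very first record, when `last` is still empty, and the key is printed with a trailing "|" while every column also starts with "|". A minimal sketch of one possible correction follows; it keeps the same single-letter A-Z assumption, and the `NR` guards and the `split("", data)` idiom for clearing the array are choices made for this sketch, not part of the original post.

```shell
out=$(awk -F '|' '
    BEGIN { lastc = "C" }            # last column label, assumed to be a single letter
    function print_row(    c, cv) {
        cv = 65                      # character code of "A"
        c = "A"
        printf("%s", last)           # no trailing "|": each column supplies its own
        while (c <= lastc) {
            if ((last " " c) in data)
                printf("|%s|%s", c, data[last " " c])
            else
                printf("|%s|0", c)
            cv++
            c = sprintf("%c", cv)
        }
        printf("\n")
    }
    {
        if (NR > 1 && $1 != last) {  # skip the flush on the very first record
            print_row()
            split("", data)          # portable way to empty an array
        }
        data[$1 " " $2] = $3
        last = $1
    }
    END { if (NR > 0) print_row() }  # print nothing for empty input
' <<'EOF'
1|A|17,94
1|B|22,59
1|C|56,93
2|A|63,71
2|C|23,92
5|B|19,49
5|C|67,58
EOF
)
printf '%s\n' "$out"
```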
# 7  
Old 10-12-2012
A slight modification to Don's solution to meet your requirements:
Code:
awk -F "|" 'BEGIN{reset()}
function doprint(    c,i){
        printf("%s",last)
        for(i=65;i<=90;i++)
        {
                c=sprintf("%c",i)
                printf("|%s|%s",c,val[c])
        }
        printf "\n"
        #printf("%s|A|%s|B|%s|C|%s\n", last, val["A"], val["B"], val["C"])
        reset()
}
function reset(    i){
        for(i=65;i<=90;i++)
                val[sprintf("%c",i)]=0
        #val["A"] = val["B"] = val["C"] = 0
}
{       if($1 != last) if(last != "") doprint()
        last = $1
        val[$2] = $3
}
END{    if(last != "") doprint()
}' in

This assumes that only single uppercase letters A-Z occur in the second field.
Change the loop bounds according to your requirements.
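If the set of labels is not known in advance, one alternative is to collect the labels from the data itself rather than looping over a fixed character range. The sketch below is a different technique from the one above: it holds the whole input in memory and prints everything in the END block, so it needs neither sorted nor adjacent records. The array names (`rows`, `cols`, `val`, `seen_row`, `seen_col`) and the extra `2|D|40,12` record are invented for illustration.

```shell
out=$(awk -F '|' '
{
    # Remember each row key and each column label the first time it appears.
    if (!($1 in seen_row)) { seen_row[$1] = 1; rows[++nrows] = $1 }
    if (!($2 in seen_col)) { seen_col[$2] = 1; cols[++ncols] = $2 }
    val[$1, $2] = $3
}
END {
    for (i = 1; i <= nrows; i++) {
        line = rows[i]
        for (j = 1; j <= ncols; j++) {
            # Fill in 0 where this row has no value for the column.
            cell = ((rows[i], cols[j]) in val) ? val[rows[i], cols[j]] : 0
            line = line "|" cols[j] "|" cell
        }
        print line
    }
}' <<'EOF'
1|A|17,94
1|B|22,59
1|C|56,93
2|A|63,71
2|C|23,92
2|D|40,12
5|B|19,49
5|C|67,58
EOF
)
printf '%s\n' "$out"
```

Rows and columns are printed in first-seen order, so sorting the input beforehand is still useful when a particular column order matters. For very large files the streaming versions above use less memory.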

Last edited by elixir_sinari; 10-12-2012 at 07:42 AM..