Converting to matrix-like file using AWK


 
# 8  
Old 10-12-2012
That's perfect.
Just one more thing before closing this thread: can someone explain how you chose the values 65 and 90?
# 9  
Old 10-12-2012
65 is the ASCII code for A and 90 for Z.
So, 65-90 corresponds to A-Z.
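As a quick check of that mapping, here is a minimal sketch (not taken from the earlier posts in this thread) that asks awk to print the characters for codes 65 through 90. The variable name letters is only for illustration, and the sketch assumes an ASCII-compatible locale.

```shell
# Print the character for each code from 65 to 90.
# awk's printf "%c" with a numeric argument prints the character with that code.
letters=$(awk 'BEGIN { for (i = 65; i <= 90; i++) printf("%c", i); printf("\n") }')
echo "$letters"
# prints ABCDEFGHIJKLMNOPQRSTUVWXYZ
```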
# 10  
Old 10-12-2012
Sorry, I didn't give a clear example.
The 2nd column can hold alphabetic values [A-Za-z], numeric values [0-9], or a mix of both (alphanumeric).
The 2nd column has a variable length, e.g. Ag, Gde, vi3, etc.
However, the list of 2nd-column values is limited and known before processing the file.
# 11  
Old 10-12-2012
Here is one way to do it. This script doesn't require sorted input. The required format for the file specifying the names of the columns in the output matrix is described in the comments:
Code:
#!/bin/ksh
# tester -- use awk script to produce matrix of processed input records
#
# Usage: tester [ input [ choices ]]
#       This utility will read a list of entries from the file named by
#       "input" and print those entries as a vertical-bar character ('|') 
#       separated matrix.  The order of rows in the matrix is determined by the
#       first field in the "input" file.  The order of columns in the matrix is 
#       determined by the order of entries in "choices".  Entries in the matrix
#       not found in "input" will be displayed as "0".
#
# Operands:
#       choices Name of file containing a list of expected values for
#               "column-ID" values in the file named by input.  Each line is
#               assumed to be a column-ID.  The order of entries in this file
#               determines the order in which columns in the output matrix
#               appear.  If this operand is not present or if it is an empty
#               string, a file named "choices" will be used by default.
#       input   Name of file containing matrix entry input.  Each entry is
#               assumed to be in the form:
#                       row-ID"|"column-ID"|"data
#               The order of entries in this file does not matter, except that
#               "row-ID" values in the output will be in the same order as the
#               first entry for each different "row-ID" value found in this
#               file.  If this operand is not present or if it is an empty
#               string, a file named "in" will be used by default.
#       All whitespace characters in entries in both files are significant.
#
# Exit Codes:
#       0 - successful completion
#       1 - one or more entries in "input" contained a "column-ID" value not
#           found in "choices".
awk 'BEGIN{     FS = SUBSEP = "|"}
FNR==NR{# Get choices for 2nd file 2nd column input from 1st file.
        s2[++s2c] = $1
        cfn = FILENAME
        next
}
{       # Add an entry from the 2nd file...
        val[$1,$2] = $3
        if(!($1 in list)) {
                # This is a new field 1 value.  Store cross reference recording
                # the values and the order in which they first appear in the
                # 2nd file.
                s1[++s1c] = $1
                list[$1] = s1c
        }
}
END{    for(i = 1; i <= s1c; i++) {
                printf("%s", s1[i])
                for(j = 1; j <= s2c; j++) {
                        # Print the saved values and delete them.
                        printf("%s%s%s%s", FS, s2[j], FS,
                                val[s1[i],s2[j]] == "" ? 0 : val[s1[i],s2[j]])
                        delete val[s1[i],s2[j]]
                }
                printf("\n")
        }
        # Any entries left in val[i,j] at this point come from lines in the 2nd
        # file with a 2nd field that was not listed in the 1st file...
        for(i in val) {
                errors++
                printf("*** Input: \"%s|%s\" has 2nd field not found in %s\n",
                        i, val[i], cfn)
        }
        exit errors > 0
}' "${2:-choices}" "${1:-in}"
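To see the output format on a small case, here is a hedged sketch with made-up sample data. The row IDs r1 and r2, the data values, and the temporary directory are assumptions for illustration only. The awk program is a condensed version of the same two-file technique, without the check for unknown column-IDs, so it is not a replacement for the script above.

```shell
# Hypothetical sample data; the names and values are invented for this example.
tmp=$(mktemp -d)
printf '%s\n' Ag Gde vi3 > "$tmp/choices"
printf '%s\n' 'r1|Gde|5' 'r2|Ag|7' 'r1|vi3|2' > "$tmp/in"

# Condensed form of the two-file approach (no error reporting).
matrix=$(awk 'BEGIN { FS = SUBSEP = "|" }
FNR == NR {                     # 1st file: remember the column order
        col[++nc] = $1
        next
}
{       val[$1, $2] = $3        # 2nd file: store each cell value
        if (!($1 in seen)) {    # remember rows in order of first appearance
                seen[$1] = 1
                row[++nr] = $1
        }
}
END {   for (i = 1; i <= nr; i++) {
                line = row[i]
                for (j = 1; j <= nc; j++) {
                        v = val[row[i], col[j]]
                        line = line FS col[j] FS (v == "" ? 0 : v)
                }
                print line
        }
}' "$tmp/choices" "$tmp/in")
printf '%s\n' "$matrix"
rm -rf "$tmp"
# prints:
# r1|Ag|0|Gde|5|vi3|2
# r2|Ag|7|Gde|0|vi3|0
```

Note that the input lines are not sorted by row-ID, and that cells with no matching input entry come out as 0.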

# 12  
Old 10-13-2012
Very impressive. I always thought awk was powerful, but I didn't know how true that was.
I tested it with the example and also with another file.
It worked perfectly.
Thanks, Don Cragun.