Awk: conversion of matrix formats
Post 302780849 by Don Cragun on Friday 15th of March 2013 07:33:36 AM
Quote:
Originally Posted by dietmar13
names are of any length
output should be tab delimited
there are up to millions names
(I should have clarified this)
Hi dietmar,
If any names are more than 7 characters long (assuming tab stops set at every 8 column positions), your output headings won't line up with the data values. If this is a concern to you, you could change the script to print the top headings vertically instead of horizontally and adjust the printing of the row headings to make the 1st column in your output be the width of the longest name.
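One way the layout described above could look is sketched here. This is only an illustration, not the script from this thread: the function name pad_matrix, the helper add(), and the choice to list names in order of first appearance are all assumptions, and it still expects three whitespace-separated fields per input line.

```shell
#!/bin/sh
# Sketch only: pad the row headings to the width of the longest name and
# print the column headings vertically, one character per output line.
# pad_matrix and add() are hypothetical names, not part of the thread.
pad_matrix() {
    awk '
    function add(n) {   # remember each name once, in order of first appearance
        if (!(n in seen)) {
            seen[n]
            name[++cnt] = n
            if (length(n) > w) w = length(n)
        }
    }
    NF >= 3 { add($1); add($3); PP[$1,$3] = PP[$3,$1] = 1 }
    END {
        fmt = "%-" w "s"        # left-justify in a field as wide as the longest name
        # heading row r carries character r of every name (blank if the name is shorter)
        for (r = 1; r <= w; r++) {
            printf fmt, ""
            for (c = 1; c <= cnt; c++) {
                ch = substr(name[c], r, 1)
                printf "\t%s", (ch == "" ? " " : ch)
            }
            printf "\n"
        }
        for (i = 1; i <= cnt; i++) {
            printf fmt, name[i]
            for (j = 1; j <= cnt; j++)
                printf "\t%d", ((name[i], name[j]) in PP) ? 1 : 0
            printf "\n"
        }
    }' "$@"
}
# Usage: pad_matrix input.txt > input.adj
```

The output stays tab delimited; only the first column is padded, so every row starts its data at the same tab stop.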

Furthermore, with millions of rows and columns, each output line will run to millions of bytes, so the output produced will not be a text file as POSIX defines one (lines in a text file may not exceed LINE_MAX bytes). That limits which utilities you can use to post-process your output.
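To put a rough number on the line-length concern, here is a small sketch. It assumes each matrix cell is a tab plus a single digit, as in the scripts in this thread; row_bytes and fits_line_max are made-up helper names.

```shell
#!/bin/sh
# Rough size of one matrix row in bytes: the row heading, then a tab and
# one digit per column, then the newline.
row_bytes() {   # $1 = number of names, $2 = length of the row heading
    echo $(( $2 + 2 * $1 + 1 ))
}

# LINE_MAX is the POSIX limit on the length of a line in a text file.
# 2048 is the smallest value POSIX permits; it is used here only as a
# fallback in case getconf is unavailable.
limit=$(getconf LINE_MAX 2>/dev/null || echo 2048)

fits_line_max() {   # succeeds when the estimated row fits within the limit
    [ "$(row_bytes "$1" "$2")" -le "$limit" ]
}
```

With these assumptions, row_bytes 5 1 gives 12 (the size of each data row in a five-name matrix with one-character names), while row_bytes 1000000 7 gives 2000008, about 2 MB per line.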
Quote:
Originally Posted by dietmar13
... ... ...
RudiC has already commented on the part removed above.
Quote:
Originally Posted by dietmar13

thank you very much.

dietmar

---------- Post updated at 02:39 AM ---------- Previous update was at 01:34 AM ----------

now the script works with one exception:

Code:
#!/bin/bash

fn=$1
fname=${fn%.*}
echo $fname

awk 'BEGIN {FS="\t"};
    NF >= 3    
    {HD[$1]++; HD[$3]++; PP[$1,$3] = 1; PP[$3,$1] = 1 }
    END    {printf "\t"
        for (i in HD) { printf "%s\t" ,i } printf "\n"
        for (i in HD) {printf "%s\t", i ;
            for (j in HD) { printf "%s\t", PP[i,j]?PP[i,j]:"0" } ;
            printf "\n" } }' $fn > $fname.adj

BUT: I get the complete input file in front of my matrix output file, and I don't see why this happens...
Assuming that there are no empty or blank lines in your input files, that column and row headings are less than 8 characters long (or you don't care about column alignment), and that you don't want the extra tab at the end of each line that your script currently produces, you could also try this slightly simplified script:
Code:
#!/bin/bash
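fn=$1
fname=${fn%.*}          # input file name without its last suffix
echo "$fname"

# HD holds every name seen in field 1 or 3; PP marks each pair in both directions.
awk '{  HD[$1]; HD[$3]; PP[$1,$3] = PP[$3,$1] = 1}
END {   for (i in HD) printf "\t%s", i
        printf "\n"
        for (i in HD) {
                printf "%s", i
                # "in" tests membership without creating an empty PP element for every pair
                for (j in HD) printf "\t%d", ((i,j) in PP) ? 1 : 0
                printf "\n"
        }
}' "$fn" > "$fname.adj"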

which puts:
Code:
	A	B	C	D	E
A	0	1	1	0	0
B	1	0	0	1	0
C	1	0	0	0	0
D	0	1	0	0	1
E	0	0	0	1	0

in your output file when the file named by $1 contains the input given in the 1st message in this thread. (Again as RudiC stated, the order of rows and columns may vary, but the row headings and the column headings should be in the same order.)
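On the question in the quoted post about the input appearing in front of the matrix: in awk, a pattern on a line by itself is a complete rule whose default action is to print the record, and an action that starts on the following line is a separate rule that runs for every record. That would explain the symptom, since the quoted script has NF >= 3 on one line and its action on the next. A minimal sketch with made-up input (the variable names are only for illustration):

```shell
#!/bin/sh
# Two input lines: one with three fields, one with a single field.
input='A x B
short'

# Pattern and action on separate lines: the pattern alone prints matching
# records, and the action on the next line runs for every record.
split_out=$(printf '%s\n' "$input" | awk 'NF >= 3
{ n++ }
END { print "count=" n }')

# Opening brace on the same line as the pattern: the action runs only for
# matching records, and nothing is printed by default.
joined_out=$(printf '%s\n' "$input" | awk 'NF >= 3 { n++ }
END { print "count=" n }')

printf 'split:\n%s\njoined:\n%s\n' "$split_out" "$joined_out"
```

The first form echoes the three-field line and counts both lines; the second prints only the count of matching lines.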

If you wanted to run this on a Solaris/SunOS system, you would need to use /usr/xpg4/bin/awk or nawk instead of awk.
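If the same script has to run on both kinds of system, one possible approach is to pick the awk at run time. This is just a sketch; AWK is a hypothetical variable name, and the script would then call "$AWK" wherever it now calls awk.

```shell
#!/bin/sh
# Prefer the XPG4 awk where it exists (Solaris/SunOS); otherwise use
# whichever awk is found first in PATH.
if [ -x /usr/xpg4/bin/awk ]; then
    AWK=/usr/xpg4/bin/awk
else
    AWK=awk
fi

# Quick check that the chosen awk runs at all.
"$AWK" 'BEGIN { exit 0 }'
```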

Last edited by Don Cragun; 03-15-2013 at 08:35 AM.. Reason: s/empty/empty or blank/
 
