MATRIX to CSV Post: 302823637

Post 302823637 by DGPickett on Wednesday 19th of June 2013 03:02:06 PM

The first thing that occurs to me is that the input has a columns-should-be-rows flavor, so turn it into proper tuples by making one row/line/tuple for each related field. Then you can treat it like a SQL RDBMS table: cut, sort and uniq -c, or while read, the various columns to get ranks and statistics. For instance, assuming there are always 2 related fields and they are peers, here is a partial solution (forgot VALUE, discarded TOTFrequencyrelations f2; see if you can fix it):

Code:

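(
  # cut emits fields in input order, so -f 2,3,1 yields fields 1, 2, 3
  cut -d, -f 2,3,1 in_file
  cut -d, -f 2,4,1 in_file
 ) | (
  sort
  echo ZZZEND,ZZZEND     # dummy trailer flushes the last group
 ) | (
  IFS=${IFS}, lusr= lrusr= uct=0
  while read usr rusr pnz
   do

    if [ "$lusr-$lrusr" != "$usr-$rusr" ]
     then

      # key changed: emit the finished group, if there is one
      if [ "$lrusr" != "" ]
       then
        echo "$uct,$ruct,$lusr,$lrusr,$pnzp,$pnzz,$pnzn"
       fi

      if [ "$usr" = "ZZZEND" ]
       then
        break
       fi

      # a new user restarts the per-user running count
      if [ "$lusr" != "$usr" ]
       then
        uct=0
       fi

      ruct=0 pnzz=0 pnzp=0 pnzn=0 lusr=$usr lrusr=$rusr
     fi

    # tally the current row, including the first row of each group;
    # uct is the user's row count so far, not the user's final total
    (( uct++ ))
    (( ruct++ ))

    case "$pnz" in
     (-1)
       (( pnzn++ ))
      ;;
     (0)
       (( pnzz++ ))
      ;;
     (*)
       (( pnzp++ ))
      ;;
    esac
   done
 ) | sort -nr -t, | cut -d, -f 3-7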

This cuts out the fields for related1 and then for related2 into a common stream and feeds it to sort, producing a sorted stream of tuples. The while read loop keeps a per-user count and a per-user+related count for sort ordering, and tallies the positive (+1, p), negative (-1, n) and zero (0, z) values, writing out the counts whenever the key changes. A dummy trailer record makes the loop write out the last set of values. After the final reverse numeric sort, the two leading count fields are cut off.
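If all you need is a count per distinct tuple, the cut, sort and uniq -c route mentioned above may be enough. Here is a minimal sketch of it, not a drop-in fix: the layout user,related1,related2,value and the sample rows are made up for illustration, so adjust the field numbers to your real file.

```shell
# Hypothetical sample data; layout assumed to be user,related1,related2,value
sample='alice,bob,carol,1
alice,bob,dave,-1
erin,bob,carol,0
alice,carol,bob,1'

# Unpivot: emit one user,related,value tuple per related column, then sort.
# LC_ALL=C keeps the sort order independent of the locale.
tuples=$(
  {
    printf '%s\n' "$sample" | cut -d, -f 1,2,4
    printf '%s\n' "$sample" | cut -d, -f 1,3,4
  } | LC_ALL=C sort
)

# uniq -c prefixes each distinct tuple with its count;
# awk moves that count to the end as a fourth CSV field.
counts=$(printf '%s\n' "$tuples" | uniq -c | awk '{ print $2 "," $1 }')
printf '%s\n' "$counts"
```

The awk step relies on the tuples containing no spaces; if your fields can contain blanks, the count needs to be split off some other way.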

Last edited by DGPickett; 06-19-2013 at 05:23 PM.
