MATRIX to CSV


 
# 8  
Old 06-20-2013
Code:
$
$ cat input
VALUE,USER1,relatedUSER1,relatedUSER2
-1,userA,userB,userC
1,userN,userD,userB
0,userF,userH,userG
0,userT,userH,userB
1,userN,userB,userA
-1,userA,userF,userC
0,userF,userH,userU
1,userB,userJ,userU
-1,userJ,userD,userA
-1,userA,userB,userC
-1,userN,userD,userB
0,userN,userH,userG
$
$
$ perl -lne 'next if $. == 1;
             ($n, $u, $x, $y) = split/,/;
             $mtx{"$u,$x"}->[0]++;
             $mtx{"$u,$y"}->[0]++;
             if ($n > 0)    { $mtx{"$u,$x"}->[1]++; $mtx{"$u,$y"}->[1]++ }
             elsif ($n < 0) { $mtx{"$u,$x"}->[3]++; $mtx{"$u,$y"}->[3]++ }
             else           { $mtx{"$u,$x"}->[2]++; $mtx{"$u,$y"}->[2]++ }
             END {
               print "SUPERUSER,relatedUSER,TOTFrequencyrelations,(1)Frequency,(0)Frequency,(-1)Frequency,VALUE";
               for $k (sort keys %mtx) {
                 $pos = $mtx{$k}->[1] // 0;
                 $zro = $mtx{$k}->[2] // 0;
                 $ngt = $mtx{$k}->[3] // 0;
                 $val = ($pos + $ngt) == 0 ? 0 : ($pos - $ngt)/($pos + $ngt);
                 print "$k,$mtx{$k}->[0],$pos,$zro,$ngt,$val";
               }
             }
            ' input
SUPERUSER,relatedUSER,TOTFrequencyrelations,(1)Frequency,(0)Frequency,(-1)Frequency,VALUE
userA,userB,2,0,0,2,-1
userA,userC,3,0,0,3,-1
userA,userF,1,0,0,1,-1
userB,userJ,1,1,0,0,1
userB,userU,1,1,0,0,1
userF,userG,1,0,1,0,0
userF,userH,2,0,2,0,0
userF,userU,1,0,1,0,0
userJ,userA,1,0,0,1,-1
userJ,userD,1,0,0,1,-1
userN,userA,1,1,0,0,1
userN,userB,3,2,0,1,0.333333333333333
userN,userD,2,1,0,1,0
userN,userG,1,0,1,0,0
userN,userH,1,0,1,0,0
userT,userB,1,0,1,0,0
userT,userH,1,0,1,0,0
$
$

# 9  
Old 06-21-2013
Hello guys,

Here is the output of Yoda's code:

Code:
SUPERUSER,relatedUSER,TOTFrequencyrelations,(1)Frequency,(0)Frequency,(-1)Frequency,VALUE
userT,userH,1,0,1,0,0
userT,userB,1,0,1,0,0
userU,userB,1,1,0,0,1
userU,userF,1,0,1,0,0
userU,userH,1,0,1,0,0
userU,userJ,1,1,0,0,1
userA,userJ,1,0,0,1,-1
userA,userN,1,1,0,0,1
userA,userB,3,1,0,2,-0,3
userA,userC,3,0,0,3,-1
userA,userD,1,0,0,1,-1
userA,userF,1,0,0,1,-1
userB,userA,3,1,0,2,-0,3
userB,userC,2,0,0,2,-1
userB,userD,2,1,0,1,0
userB,userH,1,0,1,0,0
userB,userJ,1,1,0,0,1
userB,userN,3,2,0,1,0,3
userB,userT,1,0,1,0,0
userB,userU,1,1,0,0,1
userC,userA,3,0,0,3,-1
userC,userB,2,0,0,2,-1
userC,userF,1,0,0,1,-1
userD,userA,1,0,0,1,-1
userD,userB,2,1,0,1,0
userD,userJ,1,0,0,1,-1
userD,userN,2,1,0,1,0
userF,userA,1,0,0,1,-1
userF,userC,1,0,0,1,-1
userF,userG,1,0,1,0,0
userF,userH,2,0,2,0,0
userF,userU,1,0,1,0,0
userG,userF,1,0,1,0,0
userG,userH,2,0,2,0,0
userG,userN,1,0,1,0,0
userH,userB,1,0,1,0,0
userH,userF,2,0,2,0,0
userH,userG,2,0,2,0,0
userH,userN,1,0,1,0,0
userH,userT,1,0,1,0,0
userH,userU,1,0,1,0,0
userJ,userA,1,0,0,1,-1
userJ,userB,1,1,0,0,1
userJ,userD,1,0,0,1,-1
userJ,userU,1,1,0,0,1
userN,userA,1,1,0,0,1
userN,userB,3,2,0,1,0,3
userN,userD,2,1,0,1,0
userN,userG,1,0,1,0,0
userN,userH,1,0,1,0,0

DGPickett, you are right, it doesn't fit the model.

Many thanks for your attention.

---------- Post updated at 02:37 AM ---------- Previous update was at 02:28 AM ----------

Hi durden_tyler,

Here is the output of your code:


Code:
String found where operator expected at script.pl line 19, near "'"
  (Might be a runaway multi-line '' string starting on line 2)
	(Missing semicolon on previous line?)
syntax error at script.pl line 19, near "'"
Execution of script.pl aborted due to compilation errors.

Many thanks for your help!
# 10  
Old 06-21-2013
Quote:
Originally Posted by kraterions
...
Hi durden_tyler,

here the output of your code:

Code:
String found where operator expected at script.pl line 19, near "'"
  (Might be a runaway multi-line '' string starting on line 2)
    (Missing semicolon on previous line?)
syntax error at script.pl line 19, near "'"
Execution of script.pl aborted due to compilation errors.

many thanks for your help!

Hi kraterions,

My post does not involve any file called "script.pl", but yours does.
I ran a Perl one-liner at the shell prompt.
You put that one-liner, including the "perl -lne '...'" command line itself, inside a Perl script called "script.pl" and tried to execute that.
So, essentially, you are trying to call the perl interpreter from within a Perl program.
Perl then tries to parse the shell command line as Perl code, which is why it fails with a syntax error.

Thanks,
tyler_durden
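
To make the difference concrete, here is a rough sketch of what the -n and -l switches add around the code of a one-liner. This is an illustration only, not the script from this thread: run_like_lne is a hypothetical helper name, and the three sample lines are made up in the same format as the input file.

```perl
use strict;
use warnings;

# Hypothetical helper: applies $code to each line of $fh,
# roughly the way "perl -lne CODE" would.
sub run_like_lne {
    my ($fh, $code) = @_;
    local $\ = "\n";                  # -l: print() appends a newline
    while (my $line = <$fh>) {        # -n: implicit loop over the input lines
        chomp $line;                  # -l: strip the trailing newline
        $code->($line, $.);           # $. is the current input line number
    }
}

# Demo on an in-memory file (sample data, assumed for illustration).
my $data = "VALUE,USER1\n-1,userA\n1,userN\n";
open(my $fh, '<', \$data) or die "Can't open in-memory file: $!";

my @seen;
run_like_lne($fh, sub {
    my ($line, $lineno) = @_;
    return if $lineno == 1;           # same idea as: next if $. == 1
    push @seen, $line;
    print $line;
});
close($fh);
```

In a stand-alone script there are no switches doing this work, so the loop, the chomp and the newline on print have to be written out explicitly.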
# 11  
Old 06-21-2013
Hi durden_tyler,

OK, perfect, your code works fine.

Do you know how to use it as a script, like this?

Code:
perl script.pl inputfile.csv > outputfile.csv

Thanks for your help!
# 12  
Old 06-21-2013
If you trim out some of the lines in Yoda's code, it should be fine. What was the answer on divide by zero for all zero interactions?
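
For reference, the Perl code posted earlier in this thread avoids the division by zero by returning 0 when a pair has no positive and no negative relations. A minimal sketch of just that calculation follows; relation_value is a hypothetical helper name, and the formula is the one from the one-liner.

```perl
use strict;
use warnings;

# Hypothetical helper: VALUE for one user pair, given the counts of
# positive (+1) and negative (-1) relations. Neutral (0) relations do
# not enter the formula.
sub relation_value {
    my ($pos, $neg) = @_;
    my $total = $pos + $neg;
    return 0 if $total == 0;          # only neutral relations: avoid dividing by zero
    return ($pos - $neg) / $total;
}

print relation_value(2, 1), "\n";     # e.g. userN,userB: 2 positive, 1 negative
print relation_value(0, 0), "\n";     # e.g. userF,userH: neutral relations only
print relation_value(0, 3), "\n";     # e.g. userA,userC: 3 negative
```

Whether 0 is the right answer for an all-neutral pair is a modelling decision; the sketch only shows how the posted code treats it.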
# 13  
Old 06-21-2013
Hi DGPickett,

Yoda's code doesn't work correctly, as you can see in the output sample.

I guess that durden_tyler's code fits my needs.

Thanks for your attention.
# 14  
Old 06-22-2013
Code:
$
$ cat inputfile.csv
VALUE,USER1,relatedUSER1,relatedUSER2
-1,userA,userB,userC
1,userN,userD,userB
0,userF,userH,userG
0,userT,userH,userB
1,userN,userB,userA
-1,userA,userF,userC
0,userF,userH,userU
1,userB,userJ,userU
-1,userJ,userD,userA
-1,userA,userB,userC
-1,userN,userD,userB
0,userN,userH,userG
$
$ cat script.pl
#!/usr/bin/perl
# ----------------------------------------------------------------------------------------------------------
# Note that the hash %mtx has keys that look like this: "X,Y", where X is USER1 in the input file and Y
# is either relatedUSER1 or relatedUSER2. Each value is a reference to an array that looks like this:
# [ n1, n2, n3, n4 ]
# Here, n1 = relationship count, n2 = +ve relation count, n3 = neutral relation count and
# n4 = -ve relation count
# ----------------------------------------------------------------------------------------------------------
use strict;
use warnings;
use 5.010;                                                        # the // (defined-or) operator needs Perl 5.10+
my %mtx;                                                          # a hash for storing the matrix information
# the header line
my $header = "SUPERUSER,relatedUSER,TOTFrequencyrelations,(1)Frequency,(0)Frequency,(-1)Frequency,VALUE";
my $file = $ARGV[0] // die "Usage: perl script.pl inputfile.csv\n"; # first command-line arg is the file name
open(my $fh, "<", $file) or die "Can't open $file: $!";           # try opening it, die if unsuccessful
while (<$fh>) {                                                   # loop through file
  next if $. == 1;                                                # skip the first line
  chomp;                                                          # remove the EOL character
  my ($n, $u, $x, $y) = split /,/;                                # tokenize the line; assign local variables
  $mtx{"$u,$x"}->[0]++;                                           # tokens (2,3) and (2,4) are the keys
  $mtx{"$u,$y"}->[0]++;                                           # increment their relation counts
  if ($n > 0)    { $mtx{"$u,$x"}->[1]++; $mtx{"$u,$y"}->[1]++ }   # increment the counts of +ve/0/-ve relations
  elsif ($n < 0) { $mtx{"$u,$x"}->[3]++; $mtx{"$u,$y"}->[3]++ }   # To get a better idea of what the hash %mtx
  else           { $mtx{"$u,$x"}->[2]++; $mtx{"$u,$y"}->[2]++ }   # looks like, try Data::Dumper after file close
}                                                                 # done looping
close($fh) or die "Can't close $file: $!";                        # close the file or die if unsuccessful
print $header, "\n";                                              # start with printing the header first
for my $k (sort keys %mtx) {                                      # sort the hash keys and loop through them
  my $pos = $mtx{$k}->[1] // 0;                                   # if a relation count is undefined
  my $zro = $mtx{$k}->[2] // 0;                                   # or non-existent
  my $ngt = $mtx{$k}->[3] // 0;                                   # set it to 0
  my $val = ($pos + $ngt) == 0 ? 0 : ($pos - $ngt)/($pos + $ngt); # derive the value at the farthest position
  print "$k,$mtx{$k}->[0],$pos,$zro,$ngt,$val\n";                 # and print everything
}                                                                 # we're done
$
$ perl script.pl inputfile.csv
SUPERUSER,relatedUSER,TOTFrequencyrelations,(1)Frequency,(0)Frequency,(-1)Frequency,VALUE
userA,userB,2,0,0,2,-1
userA,userC,3,0,0,3,-1
userA,userF,1,0,0,1,-1
userB,userJ,1,1,0,0,1
userB,userU,1,1,0,0,1
userF,userG,1,0,1,0,0
userF,userH,2,0,2,0,0
userF,userU,1,0,1,0,0
userJ,userA,1,0,0,1,-1
userJ,userD,1,0,0,1,-1
userN,userA,1,1,0,0,1
userN,userB,3,2,0,1,0.333333333333333
userN,userD,2,1,0,1,0
userN,userG,1,0,1,0,0
userN,userH,1,0,1,0,0
userT,userB,1,0,1,0,0
userT,userH,1,0,1,0,0
$
$
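
The comments in the script suggest looking at %mtx with Data::Dumper. A small sketch of that idea is below. It builds the hash from just the first two data lines of inputfile.csv instead of reading the file, and the Sortkeys and Indent settings are my own choice to make the dump easier to read.

```perl
use strict;
use warnings;
use Data::Dumper;

$Data::Dumper::Sortkeys = 1;          # print hash keys in sorted order
$Data::Dumper::Indent   = 1;          # compact indentation

my %mtx;
# First two data lines of inputfile.csv, hard-coded for the demo.
my @rows = ("-1,userA,userB,userC", "1,userN,userD,userB");

for my $row (@rows) {
    my ($n, $u, $x, $y) = split /,/, $row;
    # Slot 1 = positive, slot 2 = neutral, slot 3 = negative, as in the script.
    my $slot = $n > 0 ? 1 : $n < 0 ? 3 : 2;
    for my $key ("$u,$x", "$u,$y") {
        $mtx{$key}->[0]++;            # total relation count
        $mtx{$key}->[$slot]++;        # count for this kind of relation
    }
}

print Dumper(\%mtx);
```

Slots that were never incremented show up as undef in the dump, which is why the script falls back to 0 with the // operator before printing.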
