Venn Data Maker


# 15  
Quote:
Originally Posted by jacobs.smith
You are a legend R.Singh. Thank you!
Thank you jacobs.smith, it is all the Almighty's grace on me. Glad it helped you. Did you test all permutations and combinations here? I haven't done much testing on it, so please confirm.

Also, thank you for asking this question. Please keep asking questions (and keep trying by yourself too); where there is a WILL, there is a PATH.

Thanks,
R. Singh

Last edited by RavinderSingh13; 08-19-2016 at 05:52 PM..
# 16  
Quote:
Originally Posted by jacobs.smith
.
.
.
Also - the number of lines in the intersectionlist.txt should be equal to (2^(number of sets))-1
.
.
.
So, with 7 sets there should be 127 lines, no? And the sum of individual set counts should be equal to the No. of lines?

Should g2,1,1,0,1,1,1,1,1 from RavinderSingh13's example be in Set1245678 or in Set12, Set14, Set15, ..., Set78?
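
To make the line count concrete, here is a minimal sketch of the 2^n - 1 arithmetic (every non-empty combination of n sets gets one line). The function name combos is only an illustration, not something from the thread.

```shell
# Number of non-empty combinations of n sets: 2^n - 1.
# "combos" is a hypothetical helper name used only for this sketch.
combos() {
    awk -v n="$1" 'BEGIN { print 2^n - 1 }'
}

combos 6   # 63
combos 7   # 127
combos 8   # 255
```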

Last edited by RudiC; 08-19-2016 at 05:47 PM..
# 17  
Quote:
Originally Posted by RudiC
So, with 7 sets there should be 127 lines, no? And the sum of individual set counts should be equal to the No. of lines?

Should g2,1,1,0,1,1,1,1,1 from RavinderSingh13's example be in Set1245678 or in Set12, Set14, Set15, ..., Set78?
Hi R.Singh,

I checked it with the input file.

But the number of lines in output.txt does not reach 127.

I guess it is printing only the combinations where a common or unique set actually occurs.

However, I would like to see the values for all combinations.

Thanks
# 18  
OK, try this:
Code:
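awk '
# Header line: echo it, remember the number of sets, zero every combination counter.
NR==1   {print "Name," $0
         CC = NF
         for (i=1; i<2^CC; i++) Set[i] = 0
         next
        }
# Data lines: column i lists the members of set i; T collects the distinct names.
        {for (i=1; i<=CC; i++)  {T[$i]
                                 R[$i,i] = 1
                                }
        }
END     {delete T[""]                   # drop the empty name produced by short columns
         for (t in T)   {printf "%s", t
                         for (i=1; i<=CC; i++)  printf ",%s", R[t,i]+0
                         printf RS
                         TMP = 0        # membership bitmask: set i contributes 2^(i-1)
                         for (i=1; i<=CC; i++)  TMP = TMP + 2^(i-1)*R[t,i]
                         Set[TMP]++
                        }
         # Decode each mask back into its set numbers and print the count.
         for (i=1; i<2^CC; i++) {printf "Set"; for (j=0; j<CC; j++) if (int(i/2^j)%2) printf "%d", j+1; printf "=%d%s", Set[i], RS }
        }
' FS=,  file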
Name,Set1,Set2,Set3,Set4,S5,S6
g1,1,1,1,1,1,1
g2,1,1,0,0,1,1
g3,0,0,1,0,1,1
g4,1,0,0,1,0,0
g5,1,1,1,1,1,1
g6,1,1,0,0,1,1
g7,0,1,0,1,1,0
g8,0,0,1,1,0,0
Set1=0
Set2=0
Set12=0
Set3=0
Set13=0
Set23=0
Set123=0
Set4=0
Set14=1
Set24=0
Set124=0
Set34=1
.
.
.
Set145=0
Set245=1
Set1245=0
.
.
.
Set256=0
Set1256=2
Set356=1
Set1356=0
.
.
.
Set23456=0
Set123456=2

Lines: 63 ( = 2^6 -1 )
Sum (set-values): 8 (8 different genes)

A bit complicated, as standard awk provides neither bitwise operations nor a binary print format.
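
The decoding trick used in the last loop, shown on its own as a minimal sketch: bit j of a mask is tested with int(mask/2^j)%2, which stands in for the shift-and-mask that standard awk lacks. The shell wrapper mask_to_label and its two arguments (mask, number of sets) are made up for illustration. GNU awk does have and() and rshift() built in, if portability is not a concern.

```shell
# Turn a membership bitmask into a label such as Set1256.
# "mask_to_label" is a hypothetical helper: $1 = mask, $2 = number of sets.
mask_to_label() {
    awk -v mask="$1" -v n="$2" 'BEGIN {
        label = "Set"
        for (j = 0; j < n; j++)
            if (int(mask / 2^j) % 2)      # is bit j set?
                label = label (j + 1)     # bit j stands for set j+1
        print label
    }'
}

mask_to_label 51 6   # 51 = 1 + 2 + 16 + 32 -> Set1256
mask_to_label 9 6    # 9  = 1 + 8           -> Set14
```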

Last edited by RudiC; 08-19-2016 at 08:37 PM..
# 19  
Another small refinement:
Code:
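awk '
NR==1   {print "Name," $0
         CC = NF
         for (i=1; i<2^CC; i++) Set[i] = 0
         next
        }

        {for (i=1; i<=CC; i++)  {T[$i]
                                 R[$i,i] = 1
                                }
        }

END     {delete T[""]
         for (t in T)   {TMP = 0
                         # print the membership row and build the bitmask in one pass;
                         # _ is an unset variable, i.e. the empty string
                         for (i=1; i<=CC; i++)  {printf "%s,%s%s", (i==1?t:_), R[t,i]+0, (i==CC?RS:_)
                                                 TMP = TMP + 2^(i-1)*R[t,i]
                                                }
                         Set[TMP]++
                        }
         for (i=1; i<2^CC; i++) {TMP = 0
                                 # rebuild the label as a decimal number, e.g. mask 5 -> 13 (sets 1 and 3);
                                 # one digit per set, so this assumes at most 9 sets
                                 for (j=0; j<CC; j++) if (int(i/2^j)%2) TMP = TMP * 10 + j+1
                                 printf "Set%d_%s=%d" RS, TMP, (TMP<10?"unique":"common"), Set[i] |  "sort -k1.4,1n"
                                }
        }
' FS=, file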
Name,Set1,Set2,Set3,S4,S5
g1,1,1,1,1,1
g2,1,1,0,1,0
g3,0,0,1,1,0
g4,1,0,0,1,1
g5,0,1,1,1,1
g6,1,0,0,1,1
g7,0,1,0,1,1
g8,0,0,1,0,1
Set1_unique=0
Set2_unique=0
Set3_unique=0
Set4_unique=0
Set5_unique=0
Set12_common=0
Set13_common=0
Set14_common=0
Set15_common=0
Set23_common=0
Set24_common=0
Set25_common=0
Set34_common=1
Set35_common=1
Set45_common=0
Set123_common=0
Set124_common=1
Set125_common=0
Set134_common=0
Set135_common=0
Set145_common=2
Set234_common=0
Set235_common=0
Set245_common=1
Set345_common=0
Set1234_common=0
Set1235_common=0
Set1245_common=0
Set1345_common=0
Set2345_common=1
Set12345_common=1
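
A quick sanity check in the spirit of the totals quoted for the earlier run: count the Set lines and add up their values. With the 5-set output above that should come to 31 lines (2^5 - 1) and a sum of 8, one per gene. check_sets is a hypothetical helper name, and it assumes every result line starts with Set and ends in =count.

```shell
# Count the result lines and sum their values.
# "check_sets" is a hypothetical helper; it reads lines such as Set145_common=2 on stdin
# and ignores everything else (header and per-gene rows).
check_sets() {
    awk -F= '/^Set[0-9]+/ { lines++; sum += $2 }
             END          { printf "lines=%d sum=%d\n", lines, sum }'
}

# Three sample lines taken from the output above:
printf '%s\n' Set1_unique=0 Set34_common=1 Set145_common=2 | check_sets
# lines=3 sum=3
```

Piping the whole awk output through check_sets gives both numbers at once.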
