Venn Data Maker


# 15  
Old 08-19-2016
Quote:
Originally Posted by jacobs.smith
You are a legend R.Singh. Thank you!
Thank you jacobs.smith, it is all the Almighty's grace on me. Glad it helped you. Did you test all permutations and combinations here? I haven't done much testing on it, so please confirm.

Also, thank you for asking this question. Please keep asking questions (while trying by yourself too); where there is a WILL, there is a PATH.

Thanks,
R. Singh

Last edited by RavinderSingh13; 08-19-2016 at 03:52 PM..
# 16  
Old 08-19-2016
Quote:
Originally Posted by jacobs.smith
.
.
.
Also - the number of lines in the intersectionlist.txt should be equal to (2^(number of sets))-1
.
.
.
So, with 7 sets there should be 127 lines, no? And the sum of individual set counts should be equal to the No. of lines?

Should g2,1,1,0,1,1,1,1,1 from RavinderSingh13's example be in Set1245678 or in Set12, Set14, Set15, ..., Set78?
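
For reference, the line count in question is the number of non-empty combinations of n sets, 2^n - 1. A minimal sketch of that arithmetic follows; the helper name `region_count` is made up for illustration and appears nowhere else in this thread:

```shell
# Hypothetical helper: number of non-empty combinations of n sets (2^n - 1).
region_count() {
    awk -v n="$1" 'BEGIN { print 2^n - 1 }'
}

region_count 6
region_count 7
```

If each item is counted only in the one combination that matches its full membership pattern, the per-combination counts add up to the number of distinct items.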

Last edited by RudiC; 08-19-2016 at 03:47 PM..
# 17  
Old 08-19-2016
Quote:
Originally Posted by RudiC
So, with 7 sets there should be 127 lines, no? And the sum of individual set counts should be equal to the No. of lines?

Should g2,1,1,0,1,1,1,1,1 from RavinderSingh13's example be in Set1245678 or in Set12, Set14, Set15, ..., Set78?
Hi R.Singh,

I checked it with the input file.

But the number of lines in output.txt doesn't reach 127.

I guess it is printing only the values where there is a common or unique set.

However, I would like to see all combination values.

Thanks
# 18  
Old 08-19-2016
OK, try this:
Code:
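awk '
# Header line: print it behind a "Name" column and zero all 2^CC-1 combination counters.
NR==1   {print "Name," $0
         CC = NF
         for (i=1; i<2^CC; i++) Set[i] = 0
         next
        }
# Data lines: column i holds a member of set i; remember each name and its membership.
        {for (i=1; i<=CC; i++)  {T[$i]
                                 R[$i,i] = 1
                                }
        }
END     {delete T[""]                           # drop the entry created by empty cells
         for (t in T)   {printf "%s", t
                         for (i=1; i<=CC; i++)  printf ",%s", R[t,i]+0
                         printf RS
                         TMP = 0
                         for (i=1; i<=CC; i++)  TMP = TMP + 2^(i-1)*R[t,i]     # membership row as a bit mask
                         Set[TMP]++
                        }
         # One line per non-empty combination; int(i/2^j)%2 tests bit j of i.
         for (i=1; i<2^CC; i++) {printf "Set"; for (j=0; j<CC; j++) if (int(i/2^j)%2) printf "%d", j+1; printf "=%d%s", Set[i], RS }
        }
' FS=,  file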
Name,Set1,Set2,Set3,Set4,S5,S6
g1,1,1,1,1,1,1
g2,1,1,0,0,1,1
g3,0,0,1,0,1,1
g4,1,0,0,1,0,0
g5,1,1,1,1,1,1
g6,1,1,0,0,1,1
g7,0,1,0,1,1,0
g8,0,0,1,1,0,0
Set1=0
Set2=0
Set12=0
Set3=0
Set13=0
Set23=0
Set123=0
Set4=0
Set14=1
Set24=0
Set124=0
Set34=1
.
.
.
Set145=0
Set245=1
Set1245=0
.
.
.
Set256=0
Set1256=2
Set356=1
Set1356=0
.
.
.
Set23456=0
Set123456=2

Lines: 63 ( = 2^6 -1 )
Sum (set-values): 8 (8 different genes)

A bit complicated, as standard awk provides neither bitwise operations nor a binary print format.
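
The arithmetic stands in for bit operations: a membership row is packed into one number by adding 2^(i-1) for every set the item belongs to, and unpacked again with integer division and modulo. Below is a minimal standalone sketch of that round trip; the shell function name `mask_label` and its variable names are illustrative only and not part of the script above:

```shell
# Hypothetical helper: turn a comma-separated 0/1 membership row
# into its bit mask and the matching "SetNNN" label.
mask_label() {
    awk -v row="$1" '
    BEGIN {
        n = split(row, flag, ",")
        # Pack: set i contributes 2^(i-1) when its flag is 1.
        mask = 0
        for (i = 1; i <= n; i++) mask += 2^(i-1) * flag[i]
        # Unpack: int(mask / 2^j) % 2 tests bit j without a bitwise operator.
        label = "Set"
        for (j = 0; j < n; j++) if (int(mask / 2^j) % 2) label = label (j + 1)
        print mask, label
    }'
}

mask_label 1,1,0,0,1,1
```

For the row `1,1,0,0,1,1` (g2 in the sample output) this prints `51 Set1256`, since 1 + 2 + 16 + 32 = 51.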

Last edited by RudiC; 08-19-2016 at 06:37 PM..
# 19  
Old 08-21-2016
Another small refinement:
Code:
awk '
NR==1   {print "Name," $0
         CC = NF
         for (i=1; i<2^CC; i++) Set[i] = 0
         next
        }

        {for (i=1; i<=CC; i++)  {T[$i]
                                 R[$i,i] = 1
                                }
        }

END     {delete T[""]
         for (t in T)   {TMP = 0
                         for (i=1; i<=CC; i++)  {printf "%s,%s%s", i==1?t:_, R[t,i]+0, i==CC?RS:_
                                                 TMP = TMP + 2^(i-1)*R[t,i]
                                                }
                         Set[TMP]++
                        }
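         for (i=1; i<2^CC; i++) {TMP = 0
                                 for (j=0; j<CC; j++) if (int(i/2^j)%2) TMP = TMP * 10 + j+1    # label digits, e.g. 145; unambiguous for up to 9 sets only
                                 # A single digit means one set only ("unique"); sort numerically on the digits after "Set".
                                 printf "Set%d_%s=%d" RS, TMP, (TMP<10?"unique":"common"), Set[i] |  "sort -k1.4,1n"
                                }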
        }
' FS=, file
Name,Set1,Set2,Set3,S4,S5
g1,1,1,1,1,1
g2,1,1,0,1,0
g3,0,0,1,1,0
g4,1,0,0,1,1
g5,0,1,1,1,1
g6,1,0,0,1,1
g7,0,1,0,1,1
g8,0,0,1,0,1
Set1_unique=0
Set2_unique=0
Set3_unique=0
Set4_unique=0
Set5_unique=0
Set12_common=0
Set13_common=0
Set14_common=0
Set15_common=0
Set23_common=0
Set24_common=0
Set25_common=0
Set34_common=1
Set35_common=1
Set45_common=0
Set123_common=0
Set124_common=1
Set125_common=0
Set134_common=0
Set135_common=0
Set145_common=2
Set234_common=0
Set235_common=0
Set245_common=1
Set345_common=0
Set1234_common=0
Set1235_common=0
Set1245_common=0
Set1345_common=0
Set2345_common=1
Set12345_common=1
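
The two checks raised earlier in the thread (2^n - 1 lines, and counts adding up to the number of distinct names) can be run against this kind of output. This is a rough sketch, assuming every result line starts with `Set` followed by digits and ends in `=count`; the function name `check_counts` and the three-line toy input are made up for illustration:

```shell
# Hypothetical checker: count the "Set...=N" lines on stdin and sum their values.
check_counts() {
    awk -F= '
    /^Set[0-9]+.*=/ { lines++; total += $2 }
    END { printf "lines=%d sum=%d\n", lines, total }'
}

# Toy input for two sets (2^2 - 1 = 3 lines), not taken from the thread.
printf '%s\n' 'Set1_unique=0' 'Set2_unique=0' 'Set12_common=3' | check_counts
```

Piping the script's real output through the same function should skip the header and the per-name rows, as long as no name begins with `Set` followed by a digit.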
