Venn Data Maker


# 8  
Old 08-19-2016
And the second question?
# 9  
Old 08-19-2016
Quote:
Originally Posted by RudiC
And the second question?
I am loving your questions.

Glad to learn.

Here is what I tried, but it has two disadvantages.

One - it only handles three sets, but my actual input has 7 or even more.

Two - it cannot write the first column saying unique or common.

Code:
for i in 100 010 001 110 101 011 111; do awk -F"," 'NR>1 {print $2$3$4}' 1 | grep $i | wc -l;done

Thanks
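
The loop in this post hard-codes the seven patterns that three sets can produce. A minimal sketch of a variant that avoids the hard-coded list follows. It assumes a comma-separated 0/1 matrix with a header line and the name in column 1; the function name `pattern_counts` and the file name `matrix.csv` are hypothetical.

```shell
# Hypothetical helper: count how many rows share each 0/1 membership pattern.
# Assumes a comma-separated file with a header line and the name in column 1.
pattern_counts() {
    awk -F, 'NR > 1 {
        p = ""
        for (i = 2; i <= NF; i++) p = p $i   # concatenate the flags, e.g. "110"
        print p
    }' "$1" | sort | uniq -c                 # one line per pattern that occurs
}

# Usage (file name is an assumption):
# pattern_counts matrix.csv
```

Because `sort | uniq -c` only reports patterns that actually occur, the number of columns does not need to be known in advance. Patterns are still raw bit strings here, not labelled as unique or common.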
# 10  
Old 08-19-2016
How about
Code:
awk '
NR==1   {print "Name", $0
         next
        }
        {for (i=1; i<=3; i++)   {T[$i]
                                 R[$i,i] = 1
                                }
        }
END     {delete T[""]
         for (t in T)   {print t, R[t,1]+0, R[t,2]+0, R[t,3]+0
                         TMP = R[t,1] * 100 + R[t,2] * 10 + R[t,3]
                         if (TMP == 111) Set123++
                         if (TMP == 110) Set12++
                         if (TMP == 101) Set13++
                         if (TMP == 11)  Set23++
                         if (TMP == 100) Set1++
                         if (TMP == 10)  Set2++
                         if (TMP == 1)   Set3++
                        }
         print "Set1_unique="   0+Set1
         print "Set2_unique="   0+Set2
         print "Set3_unique="   0+Set3
         print "Set12_common="  0+Set12
         print "Set13_common="  0+Set13
         print "Set23_common="  0+Set23
         print "Set123_common=" 0+Set123
        }
' FS=, OFS=, file
Name,Set1,Set2,Set3
g1,1,1,1
g2,1,1,0
g3,0,0,1
g4,1,0,0
g5,0,1,1
g6,1,0,0
g7,0,1,0
g8,0,0,1
Set1_unique=2
Set2_unique=1
Set3_unique=2
Set12_common=1
Set13_common=0
Set23_common=1
Set123_common=1

# 11  
Old 08-19-2016
Thanks RudiC.

But I would like to make it dynamic.

The post example has 3 sets. But my actual input file has numerous sets.

Can you please share any comments on how I can edit your solution?
# 12  
Old 08-19-2016
That's not that easy, as the number of combinations grows dramatically with increasing set count.
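
To put a number on that growth: n sets allow 2^n - 1 non-empty combinations, so 7 sets already give 127. A file with m rows can contain at most m distinct patterns, though, so one possible workaround is to tally only the patterns that occur. The sketch below assumes the 0/1 matrix layout shown above (header line, name in column 1). The function name `region_counts` and the "." separator between set numbers are assumptions of this sketch, not part of any posted solution.

```shell
# Hypothetical helper: tally every membership pattern that actually occurs.
# Assumes a comma-separated 0/1 matrix with a header line and the name in column 1.
region_counts() {
    awk -F, '
    NR == 1 { next }                          # skip the header line
    {
        label = ""; n = 0
        for (i = 2; i <= NF; i++)
            if ($i == 1)                      # row is a member of set i-1
                label = label (n++ ? "." : "") (i - 1)   # "." keeps 1.11 distinct from 11.1
        if (n == 0) next                      # row belongs to no set
        count["Set" label (n == 1 ? "_unique" : "_common")]++
    }
    END { for (k in count) print k "=" count[k] }
    ' "$1" | sort
}

# Usage (file name is an assumption):
# region_counts matrix.csv
```

Combinations that never occur are simply not printed, instead of being reported with a count of 0.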
# 13  
Old 08-19-2016
Quote:
Originally Posted by jacobs.smith
Thanks RudiC.
But I would like to make it dynamic.
The post example has 3 sets. But my actual input file has numerous sets.
Can you please share any comments on how I can edit your solution?
Hello jacobs.smith,

Let's say we have the following Input_file (the kind of file your first requirement produces; I have extended it to eight sets to test more cases).
Code:
cat Input_file
Name,Set1,Set2,Set3,Set4,Set5,Set6,Set7,Set8
g5,0,1,1,1,0,1,1,0
g6,1,0,0,0,0,0,0,0
g7,0,1,0,0,0,0,0,1
g8,0,0,1,1,1,1,0,1
g1,1,1,1,0,1,1,1,0
g2,1,1,0,1,1,1,1,1
g3,0,0,1,0,0,0,1,1
g4,1,0,0,1,0,0,1,1

Then the following code may help you with it.
Code:
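awk -F, '
NR == 1 { next }                        # skip the header line

{
    # Pairwise overlap: count rows that are flagged in both set i-1 and set j-1.
    # The key joins the two set numbers directly, so it is ambiguous beyond nine sets.
    for (i = 2; i <= NF; i++)
        for (j = i + 1; j <= NF; j++)
            if ($i == $j && $i != 0)
                S["Set" (i-1) (j-1) "_common"]++

    # Unique: the row is flagged in exactly one set.
    E = 0
    for (q = 2; q <= NF; q++)
        if ($q == 1) {
            num = q - 1
            E++
        }
    if (E == 1)
        Y["Set" num "_unique"]++
}

END {
    for (i in S) print i OFS S[i]
    for (u in Y) print u OFS Y[u]
}
' Input_file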

Then output will be as follows:
Code:
Set28_common 2
Set27_common 3
Set18_common 2
Set45_common 2
Set35_common 2
Set36_common 3
Set34_common 2
Set78_common 3
Set25_common 2
Set26_common 3
Set24_common 2
Set17_common 3
Set16_common 2
Set15_common 2
Set68_common 2
Set58_common 2
Set23_common 2
Set14_common 2
Set13_common 1
Set12_common 2
Set67_common 3
Set57_common 2
Set56_common 3
Set48_common 3
Set47_common 3
Set46_common 3
Set38_common 2
Set37_common 3
Set1_unique 1

Now you can combine both steps into a single command, which you can run on the original Input_file (posted in post #1):
Code:
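awk -F, '
NR == 1 {                               # header: prepend the Name column, remember the set count
    print "Name," $0
    R = NF
}
NR > 1 {
    for (i = 1; i <= NF; i++) {
        A[$i, i]++                      # item $i occurs in set (column) i
        if ($i) C[$i]                   # collect each distinct non-empty item
    }
}
END {
    # Emit one 0/1 row per item: the name, then one flag per set.
    for (i in C) {
        Q = i
        for (j = 1; j <= R; j++)
            Q = Q FS (A[i, j] >= 1 ? 1 : 0)
        print Q
    }
}
' Input_file |
awk -F, '
NR == 1 { next }                        # skip the header line

{
    # Pairwise overlap: count rows that are flagged in both set i-1 and set j-1.
    for (i = 2; i <= NF; i++)
        for (j = i + 1; j <= NF; j++)
            if ($i == $j && $i != 0)
                S["Set" (i-1) (j-1) "_common"]++

    # Unique: the row is flagged in exactly one set.
    E = 0
    for (q = 2; q <= NF; q++)
        if ($q == 1) {
            num = q - 1
            E++
        }
    if (E == 1)
        Y["Set" num "_unique"]++
}

END {
    for (i in S) print i OFS S[i]
    for (u in Y) print u OFS Y[u]
}
'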

The output will be as follows (for the Input_file from your post #1).
Code:
Set23_common 2
Set13_common 1
Set12_common 2
Set1_unique 2
Set3_unique 2
Set2_unique 1

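Note that each pair count includes items that also belong to further sets: Set12_common is 2 here (g1 and g2), while post #10 reports 1 because it counts g1 only under Set123_common.

Please let me know if this helps; I will be glad if it does.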

Thanks,
R. Singh
# 14  
Old 08-19-2016
You are a legend R.Singh. Thank you!