Matrix with Percentage


 
# 1  
Old 06-13-2016

Hi all,

I have the example below:
Code:
INPUT 1 (i/p 1)|INPUT 2 (i/p 2)|OUTPUT (o/p)
Bharat Bazar|Bharat Bazar|True Positive
Binny's Sales|<BLANK>|False Negative
<BLANK>|Binny's|False Positive
<BLANK>|<BLANK>|True Negative
Bharat bazar|Bharat|True Positive
binny's|binny|True Positive

The o/p should be one of the four values, based on the scenarios in the first two columns, which are self-explanatory.

If a substring of column 1 is present in column 2, then it should be True Positive.

After all of these, it should give me the percentage of each scenario based on the total number of records.

My first two columns, i.e. i/p1 and i/p2, are the inputs, and the o/p is the result I'm generating based on i/p1 and i/p2.

i/p1 can be any string or blank; similarly, i/p2 can be a string, a substring of i/p1, or blank.

The o/p column should be generated by us, based on i/p1 and i/p2.

<BLANK> is just a blank space; I used it for meaningful representation. | is the delimiter used.

Last edited by nikhil jain; 06-13-2016 at 10:25 AM..
# 2  
Old 06-13-2016
Is this a homework assignment? Homework and coursework questions can only be posted in this forum in a particular format described in special homework rules.

If this is homework, please repost it in the Homework & Coursework forum in the required format.

If not, please show us a sample of your input. Does it come from a file, is data manually entered, or what is the source? What is the format of the input data? Is it a CSV file using the pipe symbol as the field separator or is it some other format? Is there a heading line in the input?

Exactly what output are you trying to produce from your sample intermediate output (with the percentages you said you want, but did not describe)?

What is your definition of a "blank space"? Is it an empty string? Is it a single <space> character? Is it a single character in the current locale's blank character class? Is it a string of one or more <space> characters? Is it a string of one or more characters in the current locale's blank character class? Is it a string of zero or more <space> characters? Is it a string of zero or more characters in the current locale's blank character class?
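Each of those definitions corresponds to a different awk test. A small sketch, assuming the field has already been extracted into a variable; the function name blank_kind is made up for illustration, and [ \t] stands in for the blank character class as it is defined in the POSIX locale (space and tab):

```shell
# Report which definition of "blank" a given field value satisfies.
# blank_kind is a hypothetical helper, not something from this thread.
blank_kind() {
  awk -v f="$1" 'BEGIN {
    if (f == "")               print "empty string"
    else if (f == " ")         print "single space"
    else if (f ~ /^ +$/)       print "one or more spaces"
    else if (f ~ /^[ \t]+$/)   print "one or more blank-class characters"
    else                       print "not blank"
  }'
}

blank_kind " "    # the single-space case
```

Which branch a field should land in decides whether a comparison such as $2 == "" is the right test for it.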

Please explain the logic used to determine when the output should be TRUE or FALSE and the logic used to determine when the output should be Positive or Negative.

If blank space is an empty string or a single <space>, why shouldn't the line in your intermediate output:
Code:
<BLANK>|<BLANK>|True Negative

be:
Code:
<BLANK>|<BLANK>|True Positive

instead? Shouldn't identical strings be Positive?

If blank space is a single <space> character, why shouldn't the line in your intermediate output:
Code:
Binny's Sales|<BLANK>|False Negative

be:
Code:
Binny's Sales|<BLANK>|True Positive

instead? Shouldn't the <space> in the second field be considered a substring of the 1st field (matching the space in the middle of the 1st field)?

What operating system and shell are you using?

What have you tried to solve this problem on your own?

Last edited by Don Cragun; 06-13-2016 at 05:11 PM.. Reason: Fix typos: missing "be:" between code segments.
# 3  
Old 06-14-2016
Don,

It's not a homework assignment.

It is a requirement my company has.

Let me be clearer about my i/p and o/p.

This is the i/p to my script:

Code:
INPUT 1 (i/p 1)|INPUT 2 (i/p 2)
Bharat Bazar|Bharat Bazar
Binny's Sales| 
 |Binny's
 | 
Bharat bazar|Bharat
binny's|binny

I'm expecting my o/p to be as shown below

Code:
INPUT 1 (i/p 1)|INPUT 2 (i/p 2)|OUTPUT (o/p)
Bharat Bazar|Bharat Bazar|True Positive
Binny's Sales| |False Negative
 |Binny's|False Positive
 | |True Negative
Bharat bazar|Bharat|True Positive
binny's|binny|True Positive


As mentioned earlier, <BLANK> is just an empty space that I used initially. I have now replaced it with a single space.

I have written the code below for the percentage calculations, but I'm stuck on the logic for col1 and col2 in the scenarios below:

1. if col1 string is similar col2 string or even the substring then it shall be TRUE POSITIVE
2. if col1 string is present and col2 string is not then it shall be FALSE NEGATIVE
3. if col1 string is not present and col2 string is present then it shall be FALSE POSITIVE
4. if col1 string and col2 string are both not present then it shall be TRUE NEGATIVE

My 3rd o/p column shall be generated based on the i/p of col1 and col2 as shown above.

My percentage calculation code is as shown below

Code:
 for (i=1 ; i<cc; i++) {
      p=(c[i]/NR)*100;
      r=(i == 1) ? "" p : r OFS p;
   }
   print r

# 4  
Old 06-14-2016
I said:
Quote:
Exactly what output are you trying to produce from your sample intermediate output (with the percentages you said you want, but did not describe)?
but you decided not to show us what output you want for percentages. The code that you say produces the percentages you want is not at all helpful since the code depends on three variables (cc, the array c, and the variable r) none of which have defined values that you have shown us. If this is awk code and these variables haven't been otherwise defined, this code goes through the for loop zero times and prints an empty line. I assume that you do not want an empty line, but I still have no idea what output you do want (other than that your percentages are to be calculated to two decimal places and that you want to include the line containing the headings in your input file in your calculations).
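To illustrate the point about undefined variables, here is a minimal sketch that runs the posted loop with cc, c, and r never assigned. Wrapping it in a BEGIN block and in a function called run_fragment are assumptions made only so that it can be run on its own; the post does not say where the fragment sits in the script.

```shell
# Run the posted percentage loop with cc, c, and r left unset.
# run_fragment is a hypothetical wrapper used only for this demonstration.
run_fragment() {
  awk 'BEGIN {
    for (i = 1; i < cc; i++) {   # cc is unset and compares as 0, so the body never runs
      p = (c[i] / NR) * 100
      r = (i == 1) ? "" p : r OFS p
    }
    print r                      # r is unset, so this prints an empty line
  }'
}

run_fragment | wc -l   # counts the lines the fragment printed
```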

As I said before, there is no such thing as an empty space. From your latest sample input, we see that your use of the term empty space should be treated as a synonym for single space character, but if that is the case, you still need to explain why a single space in field two is not a subset of field 1 in the line:
Code:
Binny's Sales|

where the 8th character in field 1 is a single space character and field 2 is a single space character.
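This can be checked directly with awk's index() function, which returns the 1-based position of the second string inside the first, or 0 when it is absent. A quick sketch; the function name space_pos is made up for illustration:

```shell
# Print the position of the first space character in the given string.
# space_pos is a hypothetical helper, not something from this thread.
space_pos() {
  awk -v s="$1" 'BEGIN { print index(s, " ") }'
}

space_pos "Binny's Sales"   # prints 8
```

Any nonzero result means a plain substring test treats the single-space field as a match.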

And you still have not defined what "string is similar" means in your statement:
Quote:
1. if col1 string is similar col2 string or even the substring then it shall be TRUE POSITIVE
Please show us the EXACT output you are hoping to get from your script (including the percentages).

Please explain how we determine whether or not two strings are similar.

Please explain why a field that is a single space character is not a subset of a field containing a space character.

And, please explain why two non-empty strings (each containing a single space) that are identical are classified under your rule #4 instead of being classified under your rule #1.
# 5  
Old 06-16-2016
Don,

Sorry for the confusion created.

Code:
INPUT 1 (i/p 1)|INPUT 2 (i/p 2)|OUTPUT (o/p)
Bharat Bazar|Bharat Bazar|True Positive
Binny's Sales| |False Negative
 |Binny's|False Positive
 | |True Negative
Bharat bazar|Bharat|True Positive
binny's|binny|True Positive

If any string from col1 is present in the col2 string, then it is True Positive.

As in the example above, bharat is present in bharat bazar, hence it is True Positive; similarly, binny is present in binny's, hence True Positive.

Please ignore my percentage program for the time being. If possible, could you please incorporate the percentage calculation into the same script?
# 6  
Old 06-16-2016
I urgently encourage you to think twice before posting your original problems and to formulate them with great care, supplying meaningful details. BTW, you have been asked to do this before, and answered:
Quote:
Thanks a lot , sorry to trouble you folks with my wrong input. will make sure not to repeat again.
Step back for a second, forget all the context that you seem to imply, and look at your post#1 specification like the forum members do, not having any background on your problem. Do you think anybody could precisely understand what you request?
# 7  
Old 06-16-2016
Rudi,

I'm sorry for this. Can you please help?

My code is shown below:

Code:
awk -F "|" '
BEGIN{IGNORECASE=1}
{ if ($2 == $3 && $2 != "" && $3 != "" ) { print $2 "|" $3 "|TRUE POSITIVE"; }
   else if ($3 == "" && $2 != "" && $2 != $3 ){print $2"|"$3"|FALSE NEGATIVE";}
   else if ($2 == "" && $3 != "" ){print $2"|""|FALSE POSITIVE";}
   else if ($2 == "" && $3 == "") { print $2 "|" $3"|TRUE NEGATIVE"; }
   else {print $2 "|" $3 "|TRUE POSITIVE";}
}' $1

When I execute the script above, I get the o/p below, which is incorrect.

It is incorrect because state is not a substring of country at all. If the two strings are completely different, it should be False Positive.

Code:
INPUT 1 (i/p 1)|INPUT 2 (i/p 2)|TRUE POSITIVE
Bharat Bazar|Bharat Bazar|TRUE POSITIVE
Binny's Sales| |TRUE POSITIVE
|Binny's|FALSE POSITIVE
||TRUE NEGATIVE
Bharat bazar|Bharat|TRUE POSITIVE
binny's|binny|TRUE POSITIVE
state|country|TRUE POSITIVE



Let me know if any other clarifications are required.

Last edited by nikhil jain; 06-16-2016 at 05:30 AM..
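
For readers landing on this thread, here is a minimal sketch of one way the four rules from post #3 and the percentage step could be combined in awk. It rests on several assumptions that the thread never settles: the input has exactly the two pipe-separated columns shown in post #3 (so the fields are $1 and $2, not $2 and $3 as in the script above), the first line is a heading and is excluded from the totals, a field holding only spaces or tabs counts as blank, matching ignores case, a match means either field contains the other, and two non-blank fields with no such match are False Positive. The function name classify is made up for illustration. tolower() is used instead of IGNORECASE because IGNORECASE is a gawk extension. Trimming the fields also addresses the Binny's Sales line in the output above: a field holding a single space is not equal to "", so the script in post #7 fell through to its final else branch.

```shell
# classify: a sketch only; see the assumptions listed in the text above.
classify() {
  awk -F'|' '
  # Strip leading and trailing spaces and tabs from a copy of s.
  function trim(s) { gsub(/^[ \t]+|[ \t]+$/, "", s); return s }

  # Assumption: line 1 is a heading; extend it and keep it out of the counts.
  NR == 1 { print $0 "|OUTPUT (o/p)"; next }

  {
    a = tolower(trim($1))
    b = tolower(trim($2))
    if (a == "" && b == "")              r = "True Negative"
    else if (b == "")                    r = "False Negative"
    else if (a == "")                    r = "False Positive"
    else if (index(a, b) || index(b, a)) r = "True Positive"
    else                                 r = "False Positive"  # no overlap at all
    print $1 "|" $2 "|" r
    n[r]++
    total++
  }

  END {
    split("True Positive|False Negative|False Positive|True Negative", k, "|")
    for (i = 1; i <= 4; i++)
      printf "%s: %.2f%%\n", k[i], (total ? 100 * n[k[i]] / total : 0)
  }' "$@"
}

# Example run on the sample input from post #3.
classify <<'EOF'
INPUT 1 (i/p 1)|INPUT 2 (i/p 2)
Bharat Bazar|Bharat Bazar
Binny's Sales| 
 |Binny's
 | 
Bharat bazar|Bharat
binny's|binny
EOF
```

On that sample the six data rows split into three True Positive and one each of the other outcomes, so the last four lines report 50.00%, 16.67%, 16.67% and 16.67%. If the heading should be counted as a record, or if blank means something other than spaces and tabs, the NR == 1 rule and trim() are the two places to change.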