Grep document according to values

# 1  
Old 02-19-2014

Hi,
I have the following data, which has three tab-separated columns and looks something like this:
Code:
inscription	1	1
ionosphere	0	0
magnate	0	1
majesty	1	0
meritocracy	0	0
monarchy	0	0
monkey	1	0
notepaper	1	1

The first column of the data is an ID, the second column of the data is a prediction score and the third column is the actual score.

I want to build a confusion matrix from this data. A row with 1 in column 2 and 1 in column 3 counts as "TP" (true positive); a row with 0 in both columns counts as "TN" (true negative); a row with 1 in column 2 and 0 in column 3 counts as "FP" (false positive); and a row with 0 in column 2 and 1 in column 3 counts as "FN" (false negative).

For the data above, the result would be as follows:
Code:
TP 2
TN 3
FN 1
FP 2

Is there a grep that can help me to achieve this result?
Thank you very much!
# 2  
Old 02-19-2014
Quote:
Originally Posted by owwow14
Is there a grep that can help me to achieve this result?
No, there isn't: grep is for filtering lines according to some rules, usually a regexp. What grep can do is: return all lines which exhibit a certain pattern. What grep cannot do: summarize content.

Fortunately there are other means of text processing that can deliver what you want. In the following script, replace <t> with a literal tab. Note that the script is bare-bones: no effort is spent on runtime safety, error detection, etc.

Code:
#! /bin/ksh

typeset -i iTP=0
typeset -i iTN=0
typeset -i iFP=0
typeset -i iFN=0
typeset    chTitle=""
typeset -i iPred=0
typeset -i iReal=0

typeset    fIn="/path/to/your/input.file"

while IFS='<t>' read chTitle iPred iReal ; do
     if [ $iPred -eq 0 ] ; then
          if [ $iReal -eq 0 ] ; then
               (( iTN += 1 ))
          else
               (( iFN += 1 ))
          fi
     else
          if [ $iReal -eq 0 ] ; then
               (( iFP += 1 ))
          else
               (( iTP += 1 ))
          fi
     fi
done < "$fIn"

print - "True Positives : $iTP"
print - "True Negatives : $iTN"
print - "False Positives: $iFP"
print - "False Negatives: $iFN"

exit 0
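
For readers without ksh, here is a minimal sketch of the same counting loop in plain POSIX sh. It builds the tab with printf, so nothing has to be replaced by hand. The function name count_matrix and the lowercase counter names are hypothetical, and unlike the script above it silently skips rows whose two scores are not exactly 0 or 1.

```shell
#!/bin/sh
# Sketch only: POSIX sh variant of the counting loop.
# count_matrix is a hypothetical name; it takes the input file as $1.

count_matrix() {
    tab=$(printf '\t')          # a literal tab, no manual editing needed
    tp=0
    tn=0
    fp=0
    fn=0
    while IFS="$tab" read -r title pred real ; do
        case "$pred$real" in
            11) tp=$((tp + 1)) ;;   # predicted 1, actual 1
            00) tn=$((tn + 1)) ;;   # predicted 0, actual 0
            10) fp=$((fp + 1)) ;;   # predicted 1, actual 0
            01) fn=$((fn + 1)) ;;   # predicted 0, actual 1
        esac                        # anything else is ignored
    done < "$1"
    printf 'TP %d\nTN %d\nFN %d\nFP %d\n' "$tp" "$tn" "$fn" "$fp"
}
```

It would be called as count_matrix /path/to/your/input.file.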

I hope this helps.

bakunin
# 3  
Old 02-19-2014
If you do not mind running grep four times, this might help. Each pattern requires at least one blank between the two score columns:
Code:
TP=`grep -c "1[[:blank:]]\{1,\}1$" infile`
TN=`grep -c "0[[:blank:]]\{1,\}0$" infile`
FN=`grep -c "0[[:blank:]]\{1,\}1$" infile`
FP=`grep -c "1[[:blank:]]\{1,\}0$" infile`
echo "TP $TP";echo "TN $TN";echo "FN $FN";echo "FP $FP"
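
The same idea can be sketched as a loop with one grep -c per cell of the matrix. This is only an illustration: the function name grep_matrix is hypothetical, and the pattern assumes strictly three columns, so it asks for a blank before column 2 and at least one blank between columns 2 and 3.

```shell
#!/bin/sh
# Sketch only: one grep -c per cell of the confusion matrix.
# grep_matrix is a hypothetical name; it takes the input file as $1.

grep_matrix() {
    # Each entry is label:prediction:actual
    for pair in TP:1:1 TN:0:0 FN:0:1 FP:1:0 ; do
        label=${pair%%:*}       # text before the first colon
        rest=${pair#*:}         # text after the first colon
        pred=${rest%:*}         # prediction score to match
        real=${rest#*:}         # actual score to match
        # grep -c prints 0 (and exits non-zero) when nothing matches
        count=$(grep -c "[[:blank:]]${pred}[[:blank:]]\{1,\}${real}\$" "$1")
        echo "$label $count"
    done
}
```

It would be called as grep_matrix infile and still reads the file four times, once per label.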

# 4  
Old 02-19-2014
An awk approach:
Code:
awk -F'\t' '
        {
                TP += ( $2 == 1 && $3 == 1 ) ? 1 : 0
                TN += ( $2 == 0 && $3 == 0 ) ? 1 : 0
                FP += ( $2 == 1 && $3 == 0 ) ? 1 : 0
                FN += ( $2 == 0 && $3 == 1 ) ? 1 : 0
        }
        END {
                print "TP", TP
                print "TN", TN
                print "FP", FP
                print "FN", FN
        }
' file

# 5  
Old 02-23-2014
Another way...
Code:
awk '{a[$2,$3]++} END { printf "TP:%d\nTN:%d\nFP:%d\nFN:%d\n",a[1,1],a[0,0],a[1,0],a[0,1] }' infile
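
The one-liner works because awk joins the two subscripts in a[$2,$3] into a single key (using SUBSEP), so each (prediction, actual) pair gets its own counter in one pass over the file. A minimal sketch of that idea with the output labelled as in the original question follows; the wrapper name pair_matrix and the array name seen are hypothetical, and the explicit tab separator is an assumption about the input.

```shell
#!/bin/sh
# Sketch only: count each (prediction, actual) pair in a single pass.
# pair_matrix is a hypothetical name; it takes the input file as $1.

pair_matrix() {
    awk -F'\t' '
        { seen[$2, $3]++ }      # key is $2 SUBSEP $3
        END {
            # Adding 0 forces a number, so a pair that never occurred prints 0
            print "TP", seen[1, 1] + 0
            print "TN", seen[0, 0] + 0
            print "FN", seen[0, 1] + 0
            print "FP", seen[1, 0] + 0
        }
    ' "$1"
}
```

It would be called as pair_matrix infile. The keys are compared as strings, so a score written as 1.0 would not be counted under 1.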

--ahamed