Search and count a unique string


 
# 1  
Old 01-28-2014

Hi guys,
I have a tab-delimited file and here is what I need:
For each line, the string in the 5th column needs to be searched for in the 5th column of all lines, and the number of distinct 1st-column values on the matching lines (including the line itself) needs to be counted. A 1st-column value that repeats should be counted only once. Sorry if this sounds complicated. Let me clarify it with an example. Here is my file (tab delimited):
Code:
A1          1          15231          15232          ESR1
A1          1          15235          15236          ESR1
A2          1          15231          15232          ESR1
A3          1          15235          15236          BTW
A4          1          15235          15236          FKH
A5          1          15235          15236          FKH
A6          1          15235          15236          FKH

Now the counts are reported in a new column:
Code:
A1          1          15231          15232          ESR1          2
A1          1          15235          15236          ESR1          2
A2          1          15231          15232          ESR1          2
A3          1          15235          15236          BTW          1
A4          1          15235          15236          FKH          3
A5          1          15235          15236          FKH          3
A6          1          15235          15236          FKH          3

Thanks a lot in advance!

Last edited by a_bahreini; 01-28-2014 at 08:11 PM..
# 2  
Old 01-28-2014
1. Your 1st column is not unique (A1 appears in row #1 and row #2)
2. From my understanding of your requirement, shouldn't the output be:
Code:
A3          1          15235          15236          BTW          0
A4          1          15235          15236          FKH          3
A5          1          15235          15236          FKH          3
A6          1          15235          15236          FKH          3

# 3  
Old 01-28-2014
Yes, the first column is not unique, but the count needs to be based on the unique strings of the first column. For example, ESR1 appears in the first three lines. However, it should be reported as 2, since only 2 unique strings in the first column (A1 and A2) have it. The same applies to the other strings in the 5th column. Please let me know if this still doesn't make sense.
(my first post was edited)
# 4  
Old 01-28-2014
Read input file twice:
Code:
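awk -F'\t' '
        # First pass: count the distinct 1st-column values per 5th-column string
        NR == FNR {
                v = $1 FS $5
                if ( ! ( v in A ) )     # first time this column 1 / column 5 pair is seen
                        C[$5]++
                A[v]                    # remember the pair
                next
        }
        # Second pass: append the count to each line
        {
                print $0 FS C[$5]
        }
' file file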
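The two-pass approach above needs a file that can be opened twice, so it does not work on piped input. As a rough sketch (not part of the original answer), the same count can be done in a single pass by buffering the lines in memory. The function name `count_distinct_stdin` is made up for illustration, and the sketch assumes the input is tab delimited and small enough to fit in memory:

```shell
# Hypothetical helper: read tab-delimited lines on stdin and append, to each
# line, the number of distinct column-1 values sharing its column-5 string.
count_distinct_stdin() {
    awk -F'\t' '
        {
            line[NR] = $0               # buffer the line for later printing
            key[NR]  = $5               # remember its column-5 string
            k = $1 SUBSEP $5
            if (!(k in seen)) {         # new column 1 / column 5 pair
                seen[k]
                count[$5]++
            }
        }
        END {
            for (i = 1; i <= NR; i++)   # print in the original order
                print line[i] FS count[key[i]]
        }
    '
}

# Sample data from post #1, piped in instead of read from a file.
printf 'A1\t1\t15231\t15232\tESR1\nA1\t1\t15235\t15236\tESR1\nA2\t1\t15231\t15232\tESR1\nA3\t1\t15235\t15236\tBTW\nA4\t1\t15235\t15236\tFKH\nA5\t1\t15235\t15236\tFKH\nA6\t1\t15235\t15236\tFKH\n' |
    count_distinct_stdin
```

On the sample from post #1 this appends 2 to the ESR1 lines, 1 to the BTW line and 3 to the FKH lines. The trade-off is memory: every line is held until the end of input, whereas the two-pass version only keeps the counts.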

# 5  
Old 01-29-2014
Thanks Yoda, it worked

---------- Post updated 01-29-14 at 03:58 PM ---------- Previous update was 01-28-14 at 07:29 PM ----------

Hey Yoda,
Sorry again, but I have another issue now. For each line, I want the number of lines that have the same values in the 1st and 2nd columns. Let's say I have a file like this:
Code:
2	131
2	131
3	131
4	150	
4	160
x	200
x	200

I need it to be reported as follows:
Code:
2	131	2
2	131	2
3	131	1
4	150	1	
4	160	1
x	200	2
x	200	2

I would really appreciate it if you could solve this for me.
# 6  
Old 01-29-2014
Code:
awk 'NR==FNR{A[$1,$2]++;next}{print $0,A[$1,$2]}' file file
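Note that `print $0,A[$1,$2]` joins the line and the count with awk's default output separator, a single space. If the count should be separated by a tab, as in the requested output, a variant along these lines might help. It is only a sketch: the function name `count_pairs` is invented here, and it assumes the fields are separated by whitespace and that trailing whitespace on a line can be dropped:

```shell
# Hypothetical helper: append to each line the number of lines that share
# its column-1 and column-2 values, separated by a tab.
count_pairs() {
    awk -v OFS='\t' '
        NR == FNR { n[$1, $2]++; next }   # first pass: count each pair
        {
            sub(/[ \t]+$/, "")            # drop trailing whitespace, if any
            print $0, n[$1, $2]           # second pass: append the count
        }
    ' "$1" "$1"
}

# Sample data from post #5, written to a temporary file.
tmp=$(mktemp)
printf '2\t131\n2\t131\n3\t131\n4\t150\n4\t160\nx\t200\nx\t200\n' > "$tmp"
count_pairs "$tmp"
rm -f "$tmp"
```

Setting `OFS` only affects the separator that `print` puts between its arguments; `$0` itself is printed as read, apart from the trailing-whitespace cleanup.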

# 7  
Old 01-29-2014
Awesome, thanks Yoda!
 