Search and count a unique string Post: 302885895

Posted by a_bahreini in UNIX for Dummies Questions & Answers on Tuesday 28th of January 2014 06:33:19 PM

Search and count a unique string

Hi guys,
I have a tab-delimited file, and for each row I need to count how many different 1st-column values occur together with that row's 5th-column string. In other words: take the string in the 5th column, find every row that has the same string in its 5th column, and count the unique 1st-column values among those rows. The row's own 1st-column value is included, and a repeated 1st-column value is counted only once. Sorry if this sounds complicated; the example should make it clearer. Here is my file:

Code:

A1          1          15231          15232          ESR1
A1          1          15235          15236          ESR1
A2          1          15231          15232          ESR1
A3          1          15235          15236          BTW
A4          1          15235          15236          FKH
A5          1          15235          15236          FKH
A6          1          15235          15236          FKH

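The counts should be reported in a new 6th column. ESR1 gets 2 (it occurs with A1 and A2, and the repeated A1 counts once), BTW gets 1, and FKH gets 3: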

Code:

A1          1          15231          15232          ESR1          2
A1          1          15235          15236          ESR1          2
A2          1          15231          15232          ESR1          2
A3          1          15235          15236          BTW          1
A4          1          15235          15236          FKH          3
A5          1          15235          15236          FKH          3
A6          1          15235          15236          FKH          3

Thanks a lot in advance!

Last edited by a_bahreini; 01-28-2014 at 08:11 PM..
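
One possible way to get these counts is sketched below in Python. It is only an illustration under some assumptions: the input is truly tab-delimited, it fits in memory, and the helper name `add_unique_counts` and its parameters are made up for this example. It collects, for every 5th-column string, the set of 1st-column values seen with it, then appends the size of that set to each row.

```python
from collections import defaultdict


def add_unique_counts(lines, key_col=4, id_col=0):
    """Append to each row the number of distinct id_col values
    that share the row's key_col string (columns are 0-based)."""
    # Split every non-blank line into its tab-separated fields.
    rows = [line.rstrip("\n").split("\t") for line in lines if line.strip()]

    # First pass: gather the set of column-1 values per column-5 string.
    ids_per_key = defaultdict(set)
    for row in rows:
        ids_per_key[row[key_col]].add(row[id_col])

    # Second pass: emit each row with its count as an extra column.
    return [
        "\t".join(row + [str(len(ids_per_key[row[key_col]]))])
        for row in rows
    ]


# Sample data copied from the post above.
sample = [
    "A1\t1\t15231\t15232\tESR1",
    "A1\t1\t15235\t15236\tESR1",
    "A2\t1\t15231\t15232\tESR1",
    "A3\t1\t15235\t15236\tBTW",
    "A4\t1\t15235\t15236\tFKH",
    "A5\t1\t15235\t15236\tFKH",
    "A6\t1\t15235\t15236\tFKH",
]

for out_line in add_unique_counts(sample):
    print(out_line)
```

Using a set per string means a repeated 1st-column value (A1 appears twice with ESR1) is counted only once, which matches the expected output in the post. To run it on a real file, the lines could be read with `open(path)` and passed in the same way.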
