Uniq count second column
Post 302939149 by Wan Fahmi on Monday 23rd of March 2015 07:49:52 AM

Hello

How can I get an occurrence count for the second column of this file?

Code:
ERR315389.1000156       CTTGAAGAAGAATTGAAAACTGTGACGAACAACTTGAAGTCACTGGAGGCTCAGGCTGAGAAGTACTCGCAGAAGGAAGACAGATATGAGGAAGAG
ERR315389.1000281       GCGTCTGGCAACAGCTTTGCAGAAGCTGGAGGAAGCTGAGAAGGCAGCAGATGAGAGTGAGAGAGGCATGAAAGTCATTGAGAGTCGAGCCCAAAA
ERR315389.1000504       GGTCATCATTGAGAGCGACCTGGAACGTGCAGAGGAGCGGGCTGAGCTCTCAGAAGGCAAATGTGCCGAGCTTGAAGAAGAATTGAAAACTGTGAC
ERR315389.1000637       GCTGGTGTCACTGCAAAAGAAACTCAAGGGCACCGAAGATGAACTGGACAAATACTCTGAGGCTCTCAAAGATGCCCAGGAGAAGCTGGAGCTGGC
ERR315389.1000647       CGCTCCTGCTGCAGCCCCAGGGCCCCTCGCCGCCGCCACCATGGACGCCATCAAGAAGAAGATGCAGATGCTGAAGCTCGACAAGGAGAACGCCTT
ERR315389.1000762       AAAGCATTGATGACTTAGAAGACGAGCTGTACGCTCAGAAACTGAAGTACAAAGCCATCAGCGAGGAGCTGGACCACGCTCTCAACGATATGACTT
ERR315389.1000854       AGGAGATCCAACTGAAAGAGGCAAAGCACATTGCTGAAGATGCCGACCGCAAATATGAAGAGGTGGCCCGTAAGCTGGTCATCATTGAGAGCGACC
ERR315389.1001141       AAAAAGGCCACCGATGCTGAAGCCGACGTAGCTTCTCTGAACAGACGCATCCAGCTGGTTGAGGAAGAGTTGGATCGTGCCCAGGAGCGTCTGGCA
ERR315389.1001145       GCAGAAGCTGGAGGAAGCTGAGAAGGCAGCAGATGAGAGTGAGAGAGGCATGAAAGTCATTGAGAGTCGAGCCCAAAAAGATGAAGAAAAAATGGA
ERR315389.1001393       CAGCTTTGCAGAAGCTGGAGGAAGCTGAGAAGGCAGCAGATGAGAGTGAGAGAGGCATGAAAGTCATTGAGAGTCGAGCCCAAAAAGATGAAGAAA

I tried cat file1 | uniq -cf1 > file2 to count the occurrences while skipping the first column, but it did not give me the counts I expected for the second column:

Code:
     11 ERR315389.1502254       CTCCGCCCGACCGCGCGCTCGCCCCGCCGCTCCTGCTGCAGCCCCAGGGCCCCTCGCCGCCGCCACCATGGACGCCATCAAGAAGAAGATGCAGATGCTGA
     12 ERR315389.6544981       NTCCGCCCGACCGCGCGCTCGCCCCGCCGCTCCTGCTGCAGCCCCAGGGCCCCTCGCCGCCGCCACCATGGACGCCATCAAGAAGAAGATGCAGATGCTGA
     24 ERR315389.4012310       CCGACCGCGCGCTCGCCCCGCCGCTCCTGCTGCAGCCCCAGGGCCCCTCGCCGCCGCCACCATGGACGCCATCAAGAAGAAGATGCAGATGCTGAAGCTCG
     24 ERR315389.5696434       CGACCGCGCGCTCGCCCCGCCGCTCCTGCTGCAGCCCCAGGGCCCCTCGCCGCCGCCACCATGGACGCCATCAAGAAGAAGATGCAGATGCTGAAGCTCGA
     36 ERR315389.456083        CCGCGCGCTCGCCCCGCCGCTCCTGCTGCAGCCCCAGGGCCCCTCGCCGCCGCCACCATGGACGCCATCAAGAAGAAGATGCAGATGCTGAAGCTCGACAA
     12 ERR315389.894063        CGCGCGCTCGCCCCGCCGCTCCTGCTGCAGCCCCAGGGCCCCTCGCCGCCGCCACCATGGACGCCATCAAGAAGAAGATGCAGATGCTGAAGCTCGACAAG
     12 ERR315389.1554704       CTCGCGCTCGCCCCGCCGCTCCTGCTGCAGCCCCAGGGCCCCTCGCCGCCGCCACCATGGACGCCATCAAGAAGAAGATGCAGATGCTGAAGCTCGACAAG
     24 ERR315389.5277557       CGCGCGCTCGCCCCGCCGCTCCTGCTGCAGCCCCAGGGCCCCTCGCCGCCGCCACCATGGACGCCATCAAGAAGAAGATGCAGATGCTGAAGCTCGACAAG
     60 ERR315389.2681352       GCGCGCTCGCCCCGCCGCTCCTGCTGCAGCCCCAGGGCCCCTCGCCGCCGCCACCATGGACGCCATCAAGAAGAAGATGCAGATGCTGAAGCTCGACAAGG
    144 ERR315389.452044        CGCGCTCGCCCCGCCGCTCCTGCTGCAGCCCCAGGGCCCCTCGCCGCCGCCACCATGGACGCCATCAAGAAGAAGATGCAGATGCTGAAGCTCGACAAGGA

How can I get the count based on the second column only, ignoring which name appears in the first column? The first column is just an arbitrary name for the sequence in the second column.

For instance, given this raw file:

Code:
ERR315389.1451218       CGCGCTCGCCCCGCCGCTCCTGCTGCAGCCCCAGGGCCCCTCGCCGCCGCCACCATGGACGCCATCAAGAAGAAGATGCAGATGCTGAAGCTCGACAAGGA
ERR315389.1640056       CGCGCTCGCCCCGCCGCTCCTGCTGCAGCCCCAGGGCCCCTCGCCGCCGCCACCATGGACGCCATCAAGAAGAAGATGCAGATGCTGAAGCTCGACAAGGA
ERR315389.3946553       CGCGCTCGCCCCGCCGCTCCTGCTGCAGCCCCAGGGCCCCTCGCCGCCGCCACCATGGACGCCATCAAGAAGAAGATGCAGATGCTGAAGCTCGACAAGGA
ERR315389.4137809       CGCGCTCGCCCCGCCGCTCCTGCTGCAGCCCCAGGGCCCCTCGCCGCCGCCACCATGGACGCCATCAAGAAGAAGATGCAGATGCTGAAGCTCGACAAGGA
ERR315389.452044        CGCGCTCGCCCCGCCGCTCCTGCTGCAGCCCCAGGGCCCCTCGCCGCCGCCACCATGGACGCCATCAAGAAGAAGATGCAGATGCTGAAGCTCGACAAGGA
ERR315389.4597314       CGCGCTCGCCCCGCCGCTCCTGCTGCAGCCCCAGGGCCCCTCGCCGCCGCCACCATGGACGCCATCAAGAAGAAGATGCAGATGCTGAAGCTCGACAAGGA
ERR315389.4896643       CGCGCTCGCCCCGCCGCTCCTGCTGCAGCCCCAGGGCCCCTCGCCGCCGCCACCATGGACGCCATCAAGAAGAAGATGCAGATGCTGAAGCTCGACAAGGA
ERR315389.5450210       CGCGCTCGCCCCGCCGCTCCTGCTGCAGCCCCAGGGCCCCTCGCCGCCGCCACCATGGACGCCATCAAGAAGAAGATGCAGATGCTGAAGCTCGACAAGGA
ERR315389.6159786       CGCGCTCGCCCCGCCGCTCCTGCTGCAGCCCCAGGGCCCCTCGCCGCCGCCACCATGGACGCCATCAAGAAGAAGATGCAGATGCTGAAGCTCGACAAGGA
ERR315389.7443074       CGCGCTCGCCCCGCCGCTCCTGCTGCAGCCCCAGGGCCCCTCGCCGCCGCCACCATGGACGCCATCAAGAAGAAGATGCAGATGCTGAAGCTCGACAAGGA

the desired output, which counts the number of occurrences of the second column, is:

Code:
10 ERR315389.1451218       CGCGCTCGCCCCGCCGCTCCTGCTGCAGCCCCAGGGCCCCTCGCCGCCGCCACCATGGACGCCATCAAGAAGAAGATGCAGATGCTGAAGCTCGACAAGGA


Thank you

Last edited by Wan Fahmi; 03-23-2015 at 08:57 AM..
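
One possible approach, offered as a sketch rather than a definitive answer: uniq only collapses adjacent lines, and with -f1 the comparison still includes the blanks in front of the second column, so unsorted input or uneven padding can split what should be one group. An awk array keyed on the second field avoids both issues. The function name count_by_second_column is hypothetical, and the sketch assumes the columns are separated by spaces or tabs and that keeping the first name seen for each sequence is acceptable.

```shell
# Count how often each value of column 2 occurs, regardless of column 1.
# Output per group: count, first name seen for that sequence, the sequence.
# Assumption: fields are whitespace-separated; the function name is illustrative.
count_by_second_column() {
    awk '
        # First time this sequence appears: remember its name and input order.
        !($2 in count) { first[$2] = $1; order[++n] = $2 }
        # Every line: bump the counter for its sequence.
        { count[$2]++ }
        END {
            # Print groups in the order their sequence first appeared.
            for (i = 1; i <= n; i++)
                printf "%7d %s\t%s\n", count[order[i]], first[order[i]], order[i]
        }
    ' "$@"
}
```

It could be used as count_by_second_column file1 > file2. The whole file does not need to be sorted first, at the cost of holding one array entry per distinct sequence in memory.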
 
