Sponsored Content
Top Forums Shell Programming and Scripting Count common elements in a column Post 302879579 by zozoo on Friday 13th of December 2013 08:03:54 AM
Old 12-13-2013
Code:
awk '{ print $1}' <filename>| uniq -c

 

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

find common elements in 2 files (for loop)

Hi, i'm new here (and to scripting too). I was hoping for some help in comparing two files. i have a file called 'file1' with a list of names in the following format: adam jones paul higgins kelly lowe i also have another file which may contain some of the names but with a lot of... (4 Replies)
Discussion started by: ibking
4 Replies

2. UNIX for Dummies Questions & Answers

Average for repeated elements in a column

I have a file that looks like this 452 025_E3 8 025_E3 82 025_F5 135 025_F5 5 025_F5 23 025_G2 38 025_G2 71 025_G2 9 026_A12 81 026_A12 10 026_A12 some of the elements in column2 are repeated. I want an output file that will extract the... (1 Reply)
Discussion started by: FelipeAd
1 Replies

3. Shell Programming and Scripting

Filtering lines for column elements based on corresponding counts in another column

Hi, I have a file like this ACC 2 2 21 aaa AC 443 3 22 aaa GCT 76 1 33 xxx TCG 34 2 33 aaa ACGT 33 1 22 ggg TTC 99 3 44 wee CCA 33 2 33 ggg AAC 1 3 55 ddd TTG 10 1 22 ddd TTGC 98 3 22 ddd GCT 23 1 21 sds GTC 23 4 32 sds ACGT 32 2 33 vvv CGT 11 2 33 eee CCC 87 2 44... (1 Reply)
Discussion started by: polsum
1 Replies

4. Shell Programming and Scripting

Count and merge using common column

I have the following records from multiple files. 415 A G 415 A G 415 A T 415 A . 415 A . 421 G A 421 G A,C 421 G A 421 G A 421 G A,C 421 G . 427 A C 427 A ... (3 Replies)
Discussion started by: empyrean
3 Replies

5. UNIX for Dummies Questions & Answers

Merging tables: identifiying common and unique elements

Hi all, I know how to merge two tables and to remove the duplicated lines based on a field (Column 2) . My next challenge is to be able to identify in a new column those common elements between table A & B, those elements in table A not present in table B and vice versa. A simple count would be... (6 Replies)
Discussion started by: lsantome
6 Replies

6. Shell Programming and Scripting

Getting the most common column with respect another

hi all, i want to get the most comon column w.r.t another this is my file Tom|london Tom|london Tom|Paris Adam|Madrid Adam|NY the Output to get me : Tom|london Adamn|Madrid ive tried (10 Replies)
Discussion started by: teefa
10 Replies

7. Shell Programming and Scripting

Matching column and search closest elements

Hi all I have a great challenge that I am not able to resolve. Briefly, I have a file like this: ID_1 chr1 100 - ID_2 chr2 300 + and another file like this: name_1 chr1 150 no - name_2 chr1 250 yes - name_3 chr2 350 yes + name_4 chr2 280 yes + Well, for each entry in file1 I would... (2 Replies)
Discussion started by: giuliangiuseppe
2 Replies

8. Shell Programming and Scripting

Sum column values based in common identifier in 1st column.

Hi, I have a table to be imported for R as matrix or data.frame but I first need to edit it because I've got several lines with the same identifier (1st column), so I want to sum the each column (2nd -nth) of each identifier (1st column) The input is for example, after sorted: K00001 1 1 4 3... (8 Replies)
Discussion started by: sargotrons
8 Replies

9. UNIX for Beginners Questions & Answers

Awk: count unique elements in a field and sum their occurence across the entire file

Hi, Sure it's an easy one, but it drives me insane. input ("|" separated): 1|A,B,C,A 2|A,D,D 3|A,B,B I would like to count the occurence of each capital letters in $2 across the entire file, knowing that duplicates in each record count as 1. I am trying to get this output... (5 Replies)
Discussion started by: beca123456
5 Replies

10. UNIX for Beginners Questions & Answers

Add column and multiply its result to all elements of another column

Input file is as follows: 1 | 6 2 | 7 3 | 8 4 | 9 5 | 10 Output reuired (sum of the first column $1*$2) 1 | 6 | 90 2 | 7 | 105 3 | 8 | 120 4 |9 | 135 5 |10 | 150 Please enclose sample input, sample output, and code... (5 Replies)
Discussion started by: Sagar Singh
5 Replies
FASTX_QUALITY_STATS(1)						   User Commands					    FASTX_QUALITY_STATS(1)

NAME
fastx_quality_stats - FASTX Statistics DESCRIPTION
usage: fastx_quality_stats [-h] [-N] [-i INFILE] [-o OUTFILE] Part of FASTX Toolkit 0.0.13.2 by A. Gordon (gordon@cshl.edu) [-h] = This helpful help screen. [-i INFILE] = FASTQ input file. default is STDIN. [-o OUTFILE] = TEXT output file. default is STDOUT. [-N] = New output format (with more information per nucleotide/cycle). The *OLD* output TEXT file will have the following fields (one row per column): column = column number (1 to 36 for a 36-cycles read solexa file) count = number of bases found in this column. min = Lowest quality score value found in this column. max = Highest quality score value found in this column. sum = Sum of quality score values for this column. mean = Mean quality score value for this column. Q1 = 1st quartile quality score. med = Median quality score. Q3 = 3rd quartile quality score. IQR = Inter-Quartile range (Q3-Q1). lW = 'Left-Whisker' value (for boxplotting). rW = 'Right-Whisker' value (for boxplotting). A_Count = Count of 'A' nucleotides found in this column. C_Count = Count of 'C' nucleotides found in this column. G_Count = Count of 'G' nucleotides found in this column. T_Count = Count of 'T' nucleotides found in this column. N_Count = Count of 'N' nucleo- tides found in this column. max-count = max. number of bases (in all cycles) The *NEW* output format: cycle (previously called 'column') = cycle number max-count For each nucleotide in the cycle (ALL/A/C/G/T/N): count = number of bases found in this column. min = Lowest quality score value found in this column. max = Highest quality score value found in this column. sum = Sum of quality score values for this column. mean = Mean quality score value for this column. Q1 = 1st quartile quality score. med = Median quality score. Q3 = 3rd quartile quality score. IQR = Inter-Quartile range (Q3-Q1). lW = 'Left-Whisker' value (for boxplotting). rW = 'Right-Whisker' value (for boxplotting). SEE ALSO
The quality of this automatically generated manpage might be insufficient. It is suggested to visit http://hannonlab.cshl.edu/fastx_toolkit/commandline.html to get a better layout as well as an overview about connected FASTX tools. fastx_quality_stats 0.0.13.2 May 2012 FASTX_QUALITY_STATS(1)
All times are GMT -4. The time now is 10:38 AM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy