Hi all,
I have a huge CSV file with data in the following format:
Num SNPs, 549997
Total SNPs,555352
Num Samples, 157
SNP, SampleID, Allele1, Allele2
A001,AB1,A,A
A002,AB1,A,A
A003,AB1,A,A
...
...
...
I would like to write out a list of the unique SNPs (column 1). Could you... (3 Replies)
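One possible approach, as a sketch only: skip the header block, keep column 1, and de-duplicate. It assumes the file always starts with exactly 4 non-data lines, as in the sample; the last data row below is a hypothetical addition so that a repeated SNP shows up.

```shell
# Sample data modelled on the post; the real file is much larger.
snp_file=$(mktemp)
cat > "$snp_file" <<'EOF'
Num SNPs, 549997
Total SNPs,555352
Num Samples, 157
SNP, SampleID, Allele1, Allele2
A001,AB1,A,A
A002,AB1,A,A
A003,AB1,A,A
A001,AB2,A,G
EOF

# Skip the 4 header lines (assumed fixed), print column 1, then de-duplicate.
unique_snps=$(awk -F',' 'NR > 4 { print $1 }' "$snp_file" | sort -u)
printf '%s\n' "$unique_snps"

rm -f "$snp_file"
```

If sorted output is not needed, `awk -F',' 'NR > 4 && !seen[$1]++ { print $1 }'` would avoid the sort and keep the IDs in order of first appearance.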
Hi All,
I have a file with 3 columns (string string integer):
a b 1
x y 2
p k 5
y y 4
.....
.....
Question:
I want to get the unique values of column 2, sorted on column 2, together with the sum of the 3rd column over the corresponding rows, e.g. the above file should return the... (6 Replies)
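A minimal sketch of one way to do this with awk, assuming whitespace-separated fields. The expected output in the post is cut off, so the "key sum" output layout here is an assumption.

```shell
# The four complete sample rows from the post.
cols_file=$(mktemp)
cat > "$cols_file" <<'EOF'
a b 1
x y 2
p k 5
y y 4
EOF

# Accumulate column 3 per distinct column-2 value, then sort on that value.
col2_sums=$(awk '{ sum[$2] += $3 } END { for (k in sum) print k, sum[k] }' "$cols_file" | sort -k1,1)
printf '%s\n' "$col2_sums"

rm -f "$cols_file"
```

The separate `sort` is needed because awk's `for (k in sum)` visits keys in no guaranteed order.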
I have a text file where the second column is a list of numbers going from small to large. I want to extract the rows where the second column is smaller than or equal to 0.0001.
My input:
rs10082730 9e-08 12 46002702
rs2544081 1e-07 12 46015487
rs1425136 1e-06 7 35396742
rs2712590... (1 Reply)
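A sketch of a threshold filter in awk. It uses the three complete rows from the post plus one hypothetical row (marked below) that lies above the cutoff. Adding 0 to the field forces a numeric comparison, which matters for values written in exponent notation.

```shell
pval_file=$(mktemp)
cat > "$pval_file" <<'EOF'
rs10082730 9e-08 12 46002702
rs2544081 1e-07 12 46015487
rs1425136 1e-06 7 35396742
rs0000001 0.05 3 1000
EOF
# The last row above is hypothetical, added only to show a rejected line.

# Keep rows whose second column is numerically <= 0.0001.
small_p=$(awk '$2 + 0 <= 0.0001' "$pval_file")
printf '%s\n' "$small_p"

rm -f "$pval_file"
```

Since the column is already sorted ascending, `awk '$2 + 0 > 0.0001 { exit } 1'` would stop reading at the first row past the cutoff.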
I have a space delimited text file. I want to extract rows where the third column has 0 as a value and write those rows into a new space delimited text file. How do I go about doing that? Thanks! (2 Replies)
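One way this might be done, as a sketch: awk splits on whitespace by default and prints matching lines unchanged, so the output stays space delimited. The post gives no sample data, so the rows and file names below are hypothetical.

```shell
in_file=$(mktemp)
out_file=$(mktemp)
cat > "$in_file" <<'EOF'
a x 0
b y 3
c z 0
EOF

# Write only the rows whose third column equals 0 to the new file.
awk '$3 == 0' "$in_file" > "$out_file"
zero_rows=$(cat "$out_file")
printf '%s\n' "$zero_rows"

rm -f "$in_file" "$out_file"
```

Note that `$3 == 0` is a numeric test, so a value written as `0.0` would match as well.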
I have a file containing data like so:
2012-01-02 GREEN 4
2012-01-02 GREEN 6
2012-01-02 GREEN 7
2012-01-02 BLUE 4
2012-01-02 BLUE 3
2012-01-02 GREEN 4
2012-01-02 RED 4
2012-01-02 RED 8
2012-01-02 GREEN 4
2012-01-02 YELLOW 5
2012-01-02 YELLOW 2
I can't always predict what the... (4 Replies)
Hi all,
I am new to shell scripting and need your help writing a script.
I need to extract data from a .csv file where the columns are separated by ','.
The file has 5 columns having values say column 1,column 2.....column 5 as below along with their valuesm.... (3 Replies)
cat sample.csv
ID,Name,no
1,AAA,1
2,BBB,1
3,AAA,1
4,BBB,1
cut -d',' -f2 sample.csv | sort | uniq
This gives only the 2nd column values:
Name
AAA
BBB
How do I get all the columns of the CSV along with this? (1 Reply)
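A possible answer, as a sketch: let awk print the whole line the first time each column-2 value is seen. This assumes "all the columns" means the first complete row for each distinct Name; the header line is kept because its Name field is also seen for the first time.

```shell
csv_file=$(mktemp)
cat > "$csv_file" <<'EOF'
ID,Name,no
1,AAA,1
2,BBB,1
3,AAA,1
4,BBB,1
EOF

# seen[$2]++ is 0 (false) on first sight, so !seen[$2]++ is true only once per Name.
first_rows=$(awk -F',' '!seen[$2]++' "$csv_file")
printf '%s\n' "$first_rows"

rm -f "$csv_file"
```

Unlike `cut | sort | uniq`, this keeps the original row order and needs no sorting.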
Hi, I would like to ask for advice regarding my problem.
I have this kind of data:
file.txt
111111111,20
111111111,50
222222222,70
333333333,40
444444444,10
444444444,20
I need to get this
file1.txt
111111111,70
222222222,70
333333333,40
444444444,30
using this code I can... (6 Replies)
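The poster's own code is cut off, so the following is not that code, just a sketch of one common way to get file1.txt from file.txt: sum column 2 per column-1 key in awk and sort the result.

```shell
sum_in=$(mktemp)
cat > "$sum_in" <<'EOF'
111111111,20
111111111,50
222222222,70
333333333,40
444444444,10
444444444,20
EOF

# Add up column 2 for each key in column 1; sort because awk's key order is unspecified.
key_sums=$(awk -F',' '{ sum[$1] += $2 } END { for (k in sum) print k "," sum[k] }' "$sum_in" | sort)
printf '%s\n' "$key_sums"

rm -f "$sum_in"
```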
Hi All,
Does anyone have any suggestions/examples of how I could show only lines where the first field is not duplicated? If the first field is listed more than once, the line shouldn't be shown, even if the other columns make it unique.
Example file :
876,RIBDA,EC2
876,RIBDH,EX7
877,RIBDF,E28... (4 Replies)
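A sketch of a two-pass awk approach: the file is read twice, first to count each first field, then to print only lines whose first field occurred once. It uses the three rows visible in the post.

```shell
dup_file=$(mktemp)
cat > "$dup_file" <<'EOF'
876,RIBDA,EC2
876,RIBDH,EX7
877,RIBDF,E28
EOF

# Pass 1 (NR == FNR): count first fields. Pass 2: print lines whose key was seen exactly once.
singletons=$(awk -F',' 'NR == FNR { count[$1]++; next } count[$1] == 1' "$dup_file" "$dup_file")
printf '%s\n' "$singletons"

rm -f "$dup_file"
```

Reading the file twice avoids holding every line in memory and does not require the input to be sorted.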
What is an efficient way of counting the number of unique values in a 400 column by 1000 row array and outputting the counts per column, assuming the unique values in the array are:
A, B, C, D
In other words the output should look like: Value COL1 COL2 COL3
A 50 51 52... (16 Replies)
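A rough sketch of a single-pass count in awk, assuming whitespace-separated columns and the fixed value set A, B, C, D named in the post. The 4 by 3 array below is hypothetical and stands in for the 400 column by 1000 row input.

```shell
array_file=$(mktemp)
cat > "$array_file" <<'EOF'
A B C
A A C
B A D
D B C
EOF

# count[value, column] is incremented for every cell; the table is printed at the end.
col_counts=$(awk '
{
    for (i = 1; i <= NF; i++) count[$i, i]++
    if (NF > ncols) ncols = NF
}
END {
    nvals = split("A B C D", vals, " ")
    printf "Value"
    for (i = 1; i <= ncols; i++) printf " COL%d", i
    printf "\n"
    for (v = 1; v <= nvals; v++) {
        printf "%s", vals[v]
        for (i = 1; i <= ncols; i++) printf " %d", count[vals[v], i]
        printf "\n"
    }
}' "$array_file")
printf '%s\n' "$col_counts"

rm -f "$array_file"
```

Each cell is visited once, so the work grows with rows times columns, which should be modest for an input of the size described.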
Discussion started by: Geneanalyst
uniq
UNIQ(1) BSD General Commands Manual UNIQ(1)
NAME
uniq -- report or filter out repeated lines in a file
SYNOPSIS
uniq [-cdu] [-f fields] [-s chars] [input_file [output_file]]
DESCRIPTION
The uniq utility reads the standard input comparing adjacent lines, and writes a copy of each unique input line to the standard output. The
second and succeeding copies of identical adjacent input lines are not written. Repeated lines in the input will not be detected if they are
not adjacent, so it may be necessary to sort the files first.
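The adjacency rule described above can be illustrated with a small sketch. The input lines are made up for the example.

```shell
adj_file=$(mktemp)
printf 'b\na\nb\na\n' > "$adj_file"

# No two identical lines are adjacent, so uniq alone removes nothing.
unsorted_out=$(uniq "$adj_file")

# Sorting first makes the duplicates adjacent, so uniq can collapse them.
sorted_out=$(sort "$adj_file" | uniq)

printf 'uniq alone:\n%s\n' "$unsorted_out"
printf 'sort | uniq:\n%s\n' "$sorted_out"

rm -f "$adj_file"
```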
The following options are available:
-c Precede each output line with the count of the number of times the line occurred in the input, followed by a single space.
-d Don't output lines that are not repeated in the input.
-f fields
Ignore the first fields in each input line when doing comparisons. A field is a string of non-blank characters separated from
adjacent fields by blanks. Field numbers are one based, i.e. the first field is field one.
-s chars
Ignore the first chars characters in each input line when doing comparisons. If specified in conjunction with the -f option, the
first chars characters after the first fields fields will be ignored. Character numbers are one based, i.e. the first character is
character one.
-u Don't output lines that are repeated in the input.
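A short sketch of the -c, -d, -u and -f options on made-up, already sorted input. The padding before the count printed by -c differs between implementations, so the example normalises it with awk.

```shell
opt_file=$(mktemp)
printf 'apple\napple\nbanana\ncherry\ncherry\ncherry\n' > "$opt_file"

# -c: prefix each distinct line with its count (leading padding squeezed by awk).
counted=$(uniq -c "$opt_file" | awk '{ print $1, $2 }')

# -d: only lines that are repeated; -u: only lines that are not repeated.
repeated=$(uniq -d "$opt_file")
single=$(uniq -u "$opt_file")

# -f 1: skip the first field when comparing, so "1 x" and "2 x" count as the same line.
skip_first=$(printf '1 x\n2 x\n3 y\n' | uniq -f 1)

printf '%s\n---\n%s\n---\n%s\n---\n%s\n' "$counted" "$repeated" "$single" "$skip_first"

rm -f "$opt_file"
```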
If additional arguments are specified on the command line, the first such argument is used as the name of an input file, the second is used
as the name of an output file.
The uniq utility exits 0 on success, and >0 if an error occurs.
COMPATIBILITY
The historic +number and -number options have been deprecated but are still supported in this implementation.
SEE ALSO
sort(1)
STANDARDS
The uniq utility is expected to be IEEE Std 1003.2 (``POSIX.2'') compatible.
BSD January 6, 2007 BSD