UNIX for Dummies Questions & Answers: post 302724747 by kush, Thursday 1st of November 2012, 09:01:52 AM
Find duplicated values in two columns out of three

Hi!
Could you help with the following? I have data (a long list!) that looks like this (three columns, whitespace-separated):
Code:
rs3094315 0.0665173 742429
rs12562034 0.0738998 758311
rs3934834 0.396449 995669
rs9442372 0.402693 1008567
rs3737728 0.406271 1011278
rs6687776 0.435429 1020428
rs9651273 0.435896 1021403
rs4970405 0.440268 1038818

I know that the values in the first column are unique, whereas the second and third columns contain duplicates. In other words, two different "rs" IDs may correspond to the same values in the 2nd and 3rd columns. I need to find the duplicates in columns 2 and 3 and then remove each whole line that has a unique rs but duplicated values in columns 2 and 3.
Thank you in advance! kush

Last edited by Scrutinizer; 11-01-2012 at 10:13 AM.. Reason: code tags
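
One possible way to approach this, as a minimal Python sketch rather than a definitive answer. It assumes that "duplicate" means the pair (column 2, column 3) appears on more than one line, and that the values are compared as text (so 0.40 and 0.4 would count as different). The post does not say whether the first line of each duplicated group should be kept, so the hypothetical `keep_first` flag covers both readings. The function name `drop_duplicate_rows` and the row `rs0000001` are made up for illustration; the other sample rows come from the post.

```python
from collections import Counter


def drop_duplicate_rows(lines, keep_first=True):
    """Filter rows whose (column 2, column 3) pair occurs more than once.

    keep_first=True  -> keep the first row of each duplicated group
    keep_first=False -> drop every row of a duplicated group
    """
    # Split on any whitespace and skip blank lines.
    rows = [line.split() for line in lines if line.strip()]
    # Count how often each (col2, col3) pair occurs; values are compared as text.
    counts = Counter((row[1], row[2]) for row in rows)

    seen = set()
    kept = []
    for row in rows:
        key = (row[1], row[2])
        if counts[key] == 1:
            kept.append(row)   # pair is unique: always keep
        elif keep_first and key not in seen:
            kept.append(row)   # first row of a duplicated group
        seen.add(key)
    return [" ".join(row) for row in kept]


sample = [
    "rs3094315 0.0665173 742429",
    "rs12562034 0.0738998 758311",
    "rs3934834 0.396449 995669",
    "rs0000001 0.396449 995669",  # hypothetical row, added to create a duplicate
    "rs9442372 0.402693 1008567",
]

for line in drop_duplicate_rows(sample):
    print(line)
```

To run it on a real file, the lines could be read with something like `open("data.txt")` (the file name is an assumption) and passed to the function in place of `sample`. Passing `keep_first=False` removes both the `rs3934834` and the `rs0000001` row in the sample above.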
 
