Remove duplicate lines based on two columns, judging from a third one (Post: 302555275)
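The original post is not reproduced on this page, so the exact rule is unknown. As a minimal sketch, assuming whitespace-separated fields, that columns 1 and 2 together form the duplicate key, and that the line with the largest numeric value in column 3 should be kept, something like the following awk could work. The function name dedupe_keep_max and the column roles are assumptions for illustration, not the thread's accepted answer.

```shell
# Hypothetical helper: for each (column 1, column 2) pair, keep only the
# line whose column 3 is numerically largest. Column roles are assumed.
dedupe_keep_max() {
  awk '
    {
      key = $1 SUBSEP $2
      if (!(key in best) || $3 + 0 > max[key]) {
        if (!(key in best)) order[++n] = key   # remember first-seen order
        best[key] = $0                         # full line, not just the key
        max[key] = $3 + 0
      }
    }
    END { for (i = 1; i <= n; i++) print best[order[i]] }
  ' "$@"
}
```

It reads standard input or the files given as arguments. Output follows the order in which each key was first seen; on a tie in column 3 the earlier line is kept.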

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Remove lines, Sorted with Time based columns using AWK & SORT

Hi, I have a file as follows: MediaErr.log 84 Server1 Policy1 Schedule1 master1 05/08/2008 02:12:16 84 Server1 Policy1 Schedule1 master1 05/08/2008 02:22:47 84 Server1 Policy1 Schedule1 master1 05/08/2008 03:41:26 84 Server1 Policy1 ...

2. UNIX for Dummies Questions & Answers

Duplicate columns and lines

Hi all, I have a tab-delimited file and want to remove identical lines, i.e. all of lines 1, 2 and 4, because their columns are the same as the columns in other lines. Any input is appreciated. abc gi4597 9997 cgcgtgcg $%^&*()()* abc gi4597 9997 cgcgtgcg $%^&*()()* ttt ...

3. Shell Programming and Scripting

Remove duplicate columns in input file

hello, I have an input file which looks like this: 2 C:G 17 -0.14 8.75 33.35 3 G:C 16 -2.28 0.98 28.22 4 C:G 15 0.39 11.06 29.31 5 G:C 14 2.64 5.17 36.07 6 G:C 13 -0.65 2.05 21.94 7 C:G 11 138.96 21.64 14.40 9 C:G 27 -2.40 6.95 27.98 10 C:G 26 2.89 15.60 34.33 11 G:C...

4. Shell Programming and Scripting

Remove duplicate lines based on field and sort

I have a CSV file from which I would like to remove duplicate lines based on field 1, then sort. I don't care about any of the other fields, but I still want to keep their data intact. I was thinking I could do something like this, but I have no idea how to print the full line with this. Please show any method...

5. Shell Programming and Scripting

Remove duplicate based on Group

6. Shell Programming and Scripting

Remove Duplicate by considering multiple columns

Hi friends, my input: chr1 exon 35204 35266 gene_id "GOLGB1"; transcript_id "GOLGB1"; chr1 exon 42357 42473 gene_id "GOLGB1"; transcript_id "GOLGB1"; chr1 exon 45261 45404 gene_id "GOLGB1"; transcript_id "GOLGB1"; chr1 exon 50701 50778 gene_id "GOLGB1"; transcript_id "GOLGB1";...

7. Shell Programming and Scripting

Remove duplicate value based on two field $4 and $5

Hi all, I have an input file like below... CA009156;20091003;M;AWBKCA72;123;;CANADIAN WESTERN BANK;EDMONTON;;2300, 10303, JASPER AVENUE;;T5J 3X6;; CA009156;20091003;M;AWBKCA72;321;;CANADIAN WESTERN BANK;EDMONTON;;2300, 10303, JASPER AVENUE;;T5J 3X6;; CA009156;20091003;M;AWBKCA72;231;;CANADIAN...

8. Shell Programming and Scripting

How To Remove Duplicate Based on the Value?

Hi, sometimes I get duplicated values in my files: bundle_identifier= B Sometext=ABC bundle_identifier= A bundle_unit=500 Sometext123=ABCD bundle_unit=400 I need to check whether there are duplicated values or not; if yes, I need to check if the value is A or B when Bundle_Identified ,...

9. Shell Programming and Scripting

Remove columns with duplicate entries

I have a 13 GB file. It has the following columns: The 3rd column is basically correlation values. I want to delete those rows which are repeated between the columns: A B 0.04 B C 0.56 B B 1 A A 1 C D 1 C C 1 Desired output (preferably in .csv format): A,B,0.04 B,C,0.56 C,D,1...

10. Shell Programming and Scripting

Remove duplicate lines from file based on fields

Dear community, I have to remove duplicate lines from a file that contains a very large number of rows (millions?) based on the 1st and 3rd columns. The data look like this: Region 23/11/2014 09:11:36 41752 Medio 23/11/2014 03:11:38 4132 Info 23/11/2014 05:11:09 4323...
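Several of the discussions listed above ask for the same thing: drop duplicates keyed on selected fields while still printing the full line. A common awk idiom for this is sketched below, assuming whitespace-separated fields, a key made of fields 1 and 3 (as in the last discussion), and that the first occurrence of each key is the one to keep. The function name dedupe_first_seen is hypothetical.

```shell
# Hypothetical helper: print a line only the first time its
# (field 1, field 3) pair is seen; the whole line is printed unchanged.
dedupe_first_seen() {
  # seen[...]++ evaluates to 0 (false) on first sight, so !seen[...]++ is
  # true exactly once per key, and the default awk action prints the line.
  awk '!seen[$1, $3]++' "$@"
}
```

This is a single streaming pass, and only the distinct keys are held in memory, so it should cope with files of millions of rows as long as the set of keys fits in RAM. For a CSV keyed on field 1 and then sorted, the same idea would be awk -F, '!seen[$1]++' file.csv | sort, where file.csv is a placeholder name.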

RS(1)							    BSD General Commands Manual 						     RS(1)

NAME

     rs -- reshape a data array

SYNOPSIS

     rs [-[csCS][x] [kKgGw][N] tTeEnyjhHmz] [rows [cols]]

DESCRIPTION

     The rs utility reads the standard input, interpreting each line as a row of blank-separated entries in an array, transforms the array
     according to the options, and writes it on the standard output.  With no arguments it transforms stream input into a columnar format
     convenient for terminal viewing.

     The shape of the input array is deduced from the number of lines and the number of columns on the first line.  If that shape is inconvenient,
     a more useful one might be obtained by skipping some of the input with the -k option.  Other options control interpretation of the input
     columns.

     The shape of the output array is influenced by the rows and cols specifications, which should be positive integers.  If only one of them is a
     positive integer, rs computes a value for the other which will accommodate all of the data.  When necessary, missing data are supplied in a
     manner specified by the options and surplus data are deleted.  There are options to control presentation of the output columns, including
     transposition of the rows and columns.

     The following options are available:

     -cx     Input columns are delimited by the single character x.  A missing x is taken to be `^I'.

     -sx     Like -c, but maximal strings of x are delimiters.

     -Cx     Output columns are delimited by the single character x.  A missing x is taken to be `^I'.

     -Sx     Like -C, but padded strings of x are delimiters.

     -t      Fill in the rows of the output array using the columns of the input array, that is, transpose the input while honoring any rows and
	     cols specifications.

     -T      Print the pure transpose of the input, ignoring any rows or cols specification.

     -kN     Ignore the first N lines of input.

     -KN     Like -k, but print the ignored lines.

     -gN     The gutter width (inter-column space), normally 2, is taken to be N.

     -GN     The gutter width has N percent of the maximum column width added to it.

     -e      Consider each line of input as an array entry.

     -n      On lines having fewer entries than the first line, use null entries to pad out the line.  Normally, missing entries are taken from
	     the next line of input.

     -y      If there are too few entries to make up the output dimensions, pad the output by recycling the input from the beginning.  Normally,
	     the output is padded with blanks.

     -h      Print the shape of the input array and do nothing else.  The shape is just the number of lines and the number of entries on the first
	     line.

     -H      Like -h, but also print the length of each line.

     -j      Right adjust entries within columns.

     -wN     The width of the display, normally 80, is taken to be the positive integer N.

     -m      Do not trim excess delimiters from the ends of the output array.

     -z      Adapt column widths to fit the largest entries appearing in them.

     With no arguments, rs transposes its input, and assumes one array entry per input line unless the first non-ignored line is longer than the
     display width.  Option letters which take numerical arguments interpret a missing number as zero unless otherwise indicated.

EXAMPLES

     The rs utility can be used as a filter to convert the stream output of certain programs (e.g., spell, du, file, look, nm, who, and wc(1))
     into a convenient ``window'' format, as in

	   % who | rs

     This function has been incorporated into the ls(1) program, though for most programs with similar output rs suffices.

     To convert stream input into vector output and back again, use

	   % rs 1 0 | rs 0 1

     A 10 by 10 array of random numbers from 1 to 100 and its transpose can be generated with

	   % jot -r 100 | rs 10 10 | tee array | rs -T > tarray
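On systems where rs is not installed, the pure transpose can be approximated with awk. The following is only a sketch of the idea, not a reimplementation of rs: it assumes blank-separated entries, writes single spaces between output entries instead of aligned columns, and the function name transpose is hypothetical.

```shell
# Hypothetical stand-in for a pure transpose of a blank-separated array.
# Like rs, it holds the whole input in memory before writing anything.
transpose() {
  awk '
    {
      if (NF > cols) cols = NF                   # widest input row
      for (i = 1; i <= NF; i++) cell[NR, i] = $i
    }
    END {
      for (i = 1; i <= cols; i++) {              # each input column ...
        line = ""
        for (j = 1; j <= NR; j++)                # ... becomes an output row
          line = line (j > 1 ? " " : "") cell[j, i]
        print line
      }
    }
  ' "$@"
}
```

Entries missing from short input rows come out as empty strings.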

     In the editor vi(1), a file consisting of a multi-line vector with 9 elements per line can undergo insertions and deletions, and then be
     neatly reshaped into 9 columns with

	   :1,$!rs 0 9
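The same kind of reshaping into a fixed number of columns can be sketched in awk for machines without rs. This assumes blank-separated entries and separates output entries with a single space rather than aligning them; the function name reshape_cols and its single argument (the column count) are assumptions for illustration.

```shell
# Hypothetical helper: re-flow all blank-separated entries on standard
# input into rows of N entries each, where N is the first argument.
reshape_cols() {
  awk -v n="$1" '
    {
      for (i = 1; i <= NF; i++) {
        # Separator goes before each entry: nothing before the first one,
        # a newline at the start of every later row, a space otherwise.
        printf "%s%s", (c % n ? " " : (c ? "\n" : "")), $i
        c++
      }
    }
    END { if (c) print "" }                      # terminate the last row
  '
}
```

For example, reshape_cols 9 re-flows its input into 9 entries per line; the last row is simply shorter when the entry count is not a multiple of N.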

     Finally, to sort a database by the first line of each 4-line field, try

	   % rs -eC 0 4 | sort | rs -c 0 1

SEE ALSO

     jot(1), pr(1), sort(1), vi(1)

BUGS

     Handles only two dimensional arrays.

     The algorithm currently reads the whole file into memory, so files that do not fit in memory will not be reshaped.

     Fields cannot be defined yet on character positions.

     Re-ordering of columns is not yet possible.

     There are too many options.

BSD                                                      December 30, 1993                                                      BSD
