Remove rows with first 4 fields duplicated in awk Post: 302568782

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

awk script to remove duplicate rows in line

i have the long file more than one ns and www and mx in the line like . i need the first ns record and first www and first mx from line . the records are seperated with tthe ; i am try ing in awk scripting not getiing the solution. ...

2. Shell Programming and Scripting

Help with remove duplicated content

Input file: hcmv-US25-2-3p hsa-3160-5 hcmv-US33 hsa-47 hcmv-UL70-3p hsa-4508 hcmv-UL70-3p hsa-4486 hcms-US25 hsa-360-5 hcms-US25 hsa-4 hcms-US25 hsa-458 hcms-US25 hsa-44812 . . Desired Output file: hcmv-US25-2-3p hsa-3160-5 hcmv-US33 hsa-47 hcmv-UL70-3p hsa-4508 hsa-4486...

3. Shell Programming and Scripting

awk to grep rows by multiple fields

Hello, I met a challenge to extract part of the table. I'd like to grep the first three matches based on field1 and field2. Input: D A 92.85 1315 83 11 D A 95.90 757 28 3 D A 94.38 480 20 7 D A 91.21 307 21 6 D A 94.26 244 ...

4. Shell Programming and Scripting

How to remove duplicated lines?

Hi, if i have a file like this: Query=1 a a b c c c d Query=2 b b b c c e . . .

5. Shell Programming and Scripting

Delete duplicated fields in a line

Hi, I have files with this kind of format (separator is space): A1 B1 C1 D1 E1 F1 D1 C1 G1 H1 A2 B2 C2 D2 E2 F2 D2 C2 G2 H2 A3 B3 C3 D3 E3 F3 G3 D3 C3 H3 A4 B4 C4 D4 E4 F4 G4 D4 C4 H4 I want the output to be: A1 B1 E1 F1 G1 H1 A2 B2 E2 F2 G2 H2 A3 B3 E3 F3 G3 H3 A4 B4 E4 F4 G4...

6. Shell Programming and Scripting

Removing duplicated first field rows

Hello, I am trying to eliminate rows where the first field is duplicated, leaving the row where the last field is "NET". Data file: 345234|22.34|LST 546543|55.33|LST 793929|98.23|LST 793929|64.69|NET 149593|49.22|LST Desired output: 345234|22.34|LST 546543|55.33|LST...

7. Shell Programming and Scripting

Merge files and remove duplicated rows

In a folder I'll several times daily receive new files that I want to combine into one big file, without any duplicate rows. The file name in the folder will look like e.q: MissingData_2014-08-25_09-30-18.txt MissingData_2014-08-25_09-30-14.txt MissingData_2014-08-26_09-30-12.txt The content...

8. Shell Programming and Scripting

Remove rows containing commas with awk

Hello everyone, I have a dataset that looks something like: 1 3 2 2 3 4,5 4 3:9 5 5,9 6 5:6 I need to remove the rows that contain a comma in the second column and I'm not sure how to go about this. Here is an attempt. awk 'BEGIN {FS=" "} { if ($2!==,) print }'Any help is appreciated.

9. Shell Programming and Scripting

awk to remove range of fields

I am trying to cut a range of fields in awk. The below seems to work for removing field 50, but what is the correct syntax for removing a range ($50-$62). Thank you :). awk awk 'BEGIN{FS=OFS="\t"}{$50=""; gsub(/\t\t/,"\t")}1' test.vcf.hg19_multianno.txt > output.csv Maybe: awk...

10. Shell Programming and Scripting

awk to remove lines where field count is greather than 1 in two fields

I am trying to remove all the lines and spaces where the count in $4 or $5 is greater than 1 (more than 1 letter). The file and the output are tab-delimited. Thank you :). file X 5811530 . G C NLGN4X 17 10544696 . GA G MYH3 9 96439004 . C ...

LEARN ABOUT BSD

join

JOIN(1) 						      General Commands Manual							   JOIN(1)

NAME

       join - relational database operator

SYNOPSIS

       join [ options ] file1 file2

DESCRIPTION

       Join  forms,  on the standard output, a join of the two relations specified by the lines of file1 and file2.  If file1 is `-', the standard
       input is used.

       File1 and file2 must be sorted in increasing ASCII collating sequence on the fields on which they are to be joined, normally the  first	in
       each line.

       There  is  one line in the output for each pair of lines in file1 and file2 that have identical join fields.  The output line normally con-
       sists of the common field, then the rest of the line from file1, then the rest of the line from file2.

       Fields are normally separated by blank, tab or newline.	In this case, multiple separators count as one, and leading  separators  are  dis-
       carded.

       These options are recognized:

       -an    In addition to the normal output, produce a line for each unpairable line in file n, where n is 1 or 2.

       -e s   Replace empty output fields by string s.

       -jn m  Join on the mth field of file n.	If n is missing, use the mth field in each file.

       -o list
	      Each output line comprises the fields specified in list, each element of which has the form n.m, where n is a file number and m is a
	      field number.

       -tc    Use character c as a separator (tab character).  Every appearance of c in a line is significant.

SEE ALSO

       sort(1), comm(1), awk(1)

BUGS

       With default field separation, the collating sequence is that of sort -b; with -t, the sequence is that of a plain sort.

       The conventions of join, sort, comm, uniq, look and awk(1) are wildly incongruous.

7th Edition							  April 29, 1985							   JOIN(1)