Post 302568727 by tomahawk on Friday 28th of October 2011 05:22:20 AM
Remove rows with first 4 fields duplicated in awk

Hi,

I am trying to use awk to remove all rows whose first 4 fields duplicate those of an earlier row. For example, in the following data, lines 6-9 would be removed, leaving one copy of the duplicated row (line 5):

Code:
Borgarhraun    FH9822    ol24    FH9822_ol24_m20    ol    Deformed    c
Borgarhraun    FH9822    ol24    FH9822_ol24_r21            ol    Deformed    r
Borgarhraun    FH9822    ol25    FH9822_ol25_m22    ol    Res. B    c
Borgarhraun    FH9822    ol25    FH9822_ol25_r23            ol    Res. B    r
Borgarhraun    FH9822    ol24    FH9822_ol24_profCD    ol    Deformed    c
Borgarhraun    FH9822    ol24    FH9822_ol24_profCD    ol    Deformed    c
Borgarhraun    FH9822    ol24    FH9822_ol24_profCD    ol    Deformed    c
Borgarhraun    FH9822    ol24    FH9822_ol24_profCD    ol    Deformed    c
Borgarhraun    FH9822    ol24    FH9822_ol24_profCD    ol    Deformed    c
Borgarhraun    FH9822    ol35    FH9822_ol35_m24    ol    Res. B    c


So the output would hopefully look like this:

Code:
Borgarhraun    FH9822    ol24    FH9822_ol24_m20    ol    Deformed    c
Borgarhraun    FH9822    ol24    FH9822_ol24_r21            ol    Deformed    r
Borgarhraun    FH9822    ol25    FH9822_ol25_m22    ol    Res. B    c
Borgarhraun    FH9822    ol25    FH9822_ol25_r23            ol    Res. B    r
Borgarhraun    FH9822    ol24    FH9822_ol24_profCD    ol    Deformed    c
Borgarhraun    FH9822    ol35    FH9822_ol35_m24    ol    Res. B    c

Can anyone help? Thanks

Last edited by radoulov; 10-28-2011 at 06:45 AM.. Reason: Code tags!
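
One common way to approach this is to have awk remember each combination of the first four fields and print a line only the first time its combination appears. The sketch below is not a definitive answer. It assumes the fields are separated by runs of whitespace (awk's default splitting) and it uses a temporary file with a shortened copy of the sample data. The names `input`, `deduped` and `seen` are made up for illustration.

```shell
# Hypothetical sample file: a shortened copy of the data from the post.
input=$(mktemp)
cat > "$input" <<'EOF'
Borgarhraun    FH9822    ol24    FH9822_ol24_m20    ol    Deformed    c
Borgarhraun    FH9822    ol24    FH9822_ol24_profCD    ol    Deformed    c
Borgarhraun    FH9822    ol24    FH9822_ol24_profCD    ol    Deformed    c
Borgarhraun    FH9822    ol35    FH9822_ol35_m24    ol    Res. B    c
EOF

# seen[] is keyed on the first four fields. The post-increment yields 0 the
# first time a key appears, so !0 is true and awk prints the line. Any later
# line with the same key yields a non-zero count and is skipped.
deduped=$(awk '!seen[$1, $2, $3, $4]++' "$input")
printf '%s\n' "$deduped"

rm -f "$input"
```

On this shortened sample the command keeps three of the four lines, dropping the second `profCD` row. Input order is preserved and the first occurrence is the one that survives. If the real file is tab-delimited and may contain empty fields, adding `-F'\t'` would probably be needed so that the field numbers line up.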
 
