Deleting only 2nd and third duplicates in field 2 Post: 302880411

Post 302880411 by newbie2010 on Thursday 19th of December 2013 12:03:35 PM

12-19-2013

Registered User

File order rearrangement / duplicate deletion

I have a file that has the following entries:

Code:

+==> FILER LISTING <==
deleting   /vol/icm_wks_0363
deleting   /vol/icm_wks_0365
deleting   /vol/icm_wks_0393
deleting   /vol/icm_wks_0394
deleting   /vol/icm_wks_0399
deleting   /vol/icm_wks_0416
deleting   /vol/icm_wks_0494
deleting   /vol/icm_wks_0501
deleting   /vol/truck_root
rearranging  /vol/icm_wks_0363
rearranging  /vol/icm_wks_0365
rearranging  /vol/icm_wks_0393
rearranging  /vol/icm_wks_0394
rearranging  /vol/icm_wks_0399
rearranging  /vol/icm_wks_0416
rearranging  /vol/icm_wks_0494
rearranging  /vol/icm_wks_0501
rearranging  /vol/truck_root

Here is what the list should look like:

Code:

rearranging  /vol/truck_root
rearranging  /vol/icm_wks_0501
rearranging  /vol/icm_wks_0494
rearranging  /vol/icm_wks_0399
rearranging  /vol/icm_wks_0394
rearranging  /vol/icm_wks_0393
rearranging  /vol/icm_wks_0365
rearranging  /vol/icm_wks_0363
deleting       /vol/icm_wks_0416

What I need is this: when a volume name appears twice in the second column, only the "rearranging" line should be printed. But when a volume name appears three times in the second column, as with /vol/icm_wks_0416, the "deleting /vol/icm_wks_0416" line should be printed instead of the "rearranging" one. The problem is that I have more than one list, so the volume names won't always be the same.
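The rule described above can be sketched with a counting pass in awk. This is only a sketch under stated assumptions, not a confirmed answer from the thread: the function name `pick_lines` is hypothetical, header lines are skipped because they do not have exactly two fields, a volume that appears only once is not printed, and the sample input is made up so that icm_wks_0416 really does appear three times.

```shell
#!/bin/sh
# pick_lines: read "<action> <volume>" lines on stdin, print one line per volume.
# Assumptions (not from the post): lines without exactly two fields are ignored,
# and a volume that appears only once is not printed at all.
pick_lines() {
    awk '
        NF == 2 && $1 == "deleting"    { count[$2]++; del[$2] = $0 }
        NF == 2 && $1 == "rearranging" { count[$2]++; rea[$2] = $0 }
        END {
            for (v in count) {
                # three or more copies: keep the "deleting" line
                if (count[v] >= 3 && (v in del))      print del[v]
                # exactly two copies: keep the "rearranging" line
                else if (count[v] == 2 && (v in rea)) print rea[v]
            }
        }
    ' | LC_ALL=C sort -r    # reverse byte order: "rearranging" before "deleting"
}

# Hypothetical sample in which icm_wks_0416 is listed three times.
pick_lines <<'EOF'
+==> FILER LISTING <==
deleting   /vol/icm_wks_0363
deleting   /vol/icm_wks_0416
deleting   /vol/icm_wks_0416
deleting   /vol/truck_root
rearranging  /vol/icm_wks_0363
rearranging  /vol/icm_wks_0416
rearranging  /vol/truck_root
EOF
# prints:
# rearranging  /vol/truck_root
# rearranging  /vol/icm_wks_0363
# deleting   /vol/icm_wks_0416
```

The final `sort -r` is there only because awk's `for (v in count)` visits keys in no guaranteed order; with real lists it would be run as `pick_lines < test-list`.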

I have tried variants of sort:

Code:

cat test-list |sort -ft/ -uk2

Code:

cat test-list |sort -r -ft/ -uk2

The second command prints this:

Code:

cat test-list |sort -r -ft/ -uk2
rearranging  /vol/truck_root
rearranging  /vol/icm_wks_0501
rearranging  /vol/icm_wks_0494
rearranging  /vol/icm_wks_0416
rearranging  /vol/icm_wks_0399
rearranging  /vol/icm_wks_0394
rearranging  /vol/icm_wks_0393
rearranging  /vol/icm_wks_0365
rearranging  /vol/icm_wks_0363

That is almost right, except that /vol/icm_wks_0416 is marked as rearranging instead of deleting.
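`sort -u` keeps one line per key and does not record how many lines shared that key, so a volume listed twice and a volume listed three times come out looking the same. A quick way to see which case each volume falls into is to count field 2 first. The sketch below assumes that lines without exactly two fields can be ignored; the function name `count_volumes` and the sample input are hypothetical.

```shell
#!/bin/sh
# count_volumes: print "<count> <volume>" for every volume read on stdin.
count_volumes() {
    awk 'NF == 2 { n[$2]++ } END { for (v in n) print n[v], v }' |
        LC_ALL=C sort -k2,2    # order the report by volume name
}

# Hypothetical sample in which icm_wks_0416 is listed three times.
count_volumes <<'EOF'
deleting   /vol/icm_wks_0416
deleting   /vol/icm_wks_0416
deleting   /vol/truck_root
rearranging  /vol/icm_wks_0416
rearranging  /vol/truck_root
EOF
# prints:
# 3 /vol/icm_wks_0416
# 2 /vol/truck_root
```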

I have tried

Code:

cat test-list |gawk '!k[$2]++'

but this then only prints the 2nd column.
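For reference, an awk pattern with no action prints the whole record, so `!k[$2]++` would normally be expected to print the full first line seen for each field-2 value. On a listing ordered like the one above, that means the "deleting" lines, since they come first. A minimal sketch to check this, assuming plain `awk` behaves like `gawk` here; the function name `first_seen` and the sample input are hypothetical.

```shell
#!/bin/sh
# first_seen: keep only the first line for each distinct value of field 2.
first_seen() {
    awk '!k[$2]++'
}

first_seen <<'EOF'
deleting   /vol/icm_wks_0363
deleting   /vol/truck_root
rearranging  /vol/icm_wks_0363
rearranging  /vol/truck_root
EOF
# prints the two "deleting" lines unchanged
```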

I also tried:

Code:

gawk 'BEGIN { FS = " " } {count[$2]++; if (count[$2] == 1) first[$2] = $0;if (count[$2] ==2)print first[$2];if(count[$2] > 1)print}'

which does not work either. Can anyone shed some light on this?

newbie2010


10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Removing duplicates in a sorted file by field.

I have data like this: It's sorted by the 2nd field (TID). envoy,90000000000000634600010001,04/11/2008,23:19:27,RB00266,0015,DETAIL,ERROR, envoy,90000000000000634600010001,04/12/2008,04:23:45,RB00266,0015,DETAIL,ERROR,...

2. Shell Programming and Scripting

Awk to find duplicates in 2nd field

I want to find duplicates in file on 2nd field i wrote this code: nawk '{a++} END{for i in a {if (a>1) print}}' temp Could not find whats wrong with this. Appreciate help

3. Shell Programming and Scripting

Sort alpha on 1st field, numerical on 2nd field (sci notation)

I want to sort alphabetically on the first field and sort in descending numerical order on the 2nd field. With a normal "sort -r -n" it does this: abc ||| 5e-05 ||| bla abc ||| 3 ||| ble def ||| 1 ||| abc def ||| 0.2 ||| def As you can see it ignores the fact that 5e-05 is actually 0.00005...

4. Shell Programming and Scripting

Extracting duplicates from a desired field

Hello, I have a file of group names and GID's (/etc/group) and I want to find the duplicate group names and put them in a file. So there are 2 fields, i.e.: audit 10 avahi 70 avahi-autoipd 103 bellrpi 605 bin 1 bin 2 bord 512 busobj 161 bwadm 230 cali81 202 card 323 cardiff 901 cbm...

5. Shell Programming and Scripting

Deleting Duplicates leaving the first entry

Hi, I need to delete duplicate records in a file that is around 30MB. Below is what I need. Below are the entries of input file and the output file that I need. Each section of input file is separated by an empty line. Some of these sections have duplicate uid values. I want to retain only one...

6. Shell Programming and Scripting

Remove the partial duplicates by checking the length of a field

Hi Folks - I'm quite new to awk and didn't come across such issues before. The problem statement is that, I've a file with duplicate records in 3rd and 4th fields. The sample is as below: aaaaaa|a12|45|56 abbbbaaa|a12|45|56 bbaabb|b1|51|45 bbbbbabbb|b2|51|45 aaabbbaaaa|a11|45|56 ...

7. UNIX for Dummies Questions & Answers

remove duplicates based on a field and criteria

Hi, I have a file with fields like below: A;XYZ;102345;222 B;XYZ;123243;333 C;ABC;234234;444 D;MNO;103345;222 E;DEF;124243;333 desired output: C;ABC;234234;444 D;MNO;103345;222 E;DEF;124243;333 ie, if the 4rth field is a duplicate.. i need only those records where...

8. Shell Programming and Scripting

Remove duplicates based on a field's value

Hi All, I have a text file with three columns. I would like a simple script that removes lines in which column 1 has duplicate entries, but use the largest value in column 3 to decide which one to keep. For example: Input file: 12345a rerere.rerere len=23 11111c fsdfdf.dfsdfdsf len=33 ...

9. Shell Programming and Scripting

Trying to remove duplicates based on field and row

I am trying to see if I can use awk to remove duplicates from a file. This is the file: -==> Listvol <== deleting /vol/eng_rmd_0941 deleting /vol/eng_rmd_0943 deleting /vol/eng_rmd_0943 deleting /vol/eng_rmd_1006 deleting /vol/eng_rmd_1012 rearrange /vol/eng_rmd_0943 ...

10. UNIX for Dummies Questions & Answers

Combine Similar Output from the 2nd field w.r.t 1st Field

Hi, For example: I have: HostA,XYZ HostB,XYZ HostC,ABC I would like the output to be: HostA,HostB: XYZ HostC:ABC How can I achieve this? So far what I though of is:
