    #1  
Old 06-10-2012
A-V A-V is offline
Registered User
 
Join Date: May 2012
Posts: 103
Thanks: 54
Thanked 2 Times in 2 Posts
Delete rows with unique value for specific column

Hi all
I have a file which looks like this

Code:
1234|1|Jon|some text|some text
1234|2|Jon|some text|some text
3453|5|Jon|some text|some text
6533|2|Kate|some text|some text
4567|3|Chris|some text|some text
4567|4|Maggie|some text|some text
8764|6|Maggie|some text|some text

My third column is my KEY, and I want to print a line only if its KEY appears more than once in the file. So basically any line whose value in column three is unique can be deleted.

So the output would look like this:

Code:
1234|1|Jon|some text|some text
1234|2|Jon|some text|some text
3453|5|Jon|some text|some text
4567|4|Maggie|some text|some text
8764|6|Maggie|some text|some text

Can you please help me?

Moderator's Comments:
Please use [code]...[/code] tags instead of [quote]...[/quote] tags for code and samples

Last edited by Scrutinizer; 06-10-2012 at 11:59 PM.. Reason: code tags instead of quote tags
    #2  
Old 06-11-2012
Ygor's Avatar
Ygor Ygor is offline Forum Staff  
Moderator
 
Join Date: Oct 2003
Location: 54.23, -4.53
Posts: 1,792
Thanks: 1
Thanked 101 Times in 91 Posts
Try...
Code:
$ cat file1
1234|1|Jon|some text|some text
1234|2|Jon|some text|some text
3453|5|Jon|some text|some text
6533|2|Kate|some text|some text
4567|3|Chris|some text|some text
4567|4|Maggie|some text|some text
8764|6|Maggie|some text|some text

$ awk 'NR==FNR{a[$3]++;next}a[$3]>1' FS='|' file1 file1
1234|1|Jon|some text|some text
1234|2|Jon|some text|some text
3453|5|Jon|some text|some text
4567|4|Maggie|some text|some text
8764|6|Maggie|some text|some text

$
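
For readers wondering why file1 is named twice in the command above: awk reads the file two times, and NR==FNR is true only during the first read, because FNR restarts at 1 for each file while NR keeps counting. Below is a minimal expanded sketch of the same idea. The function name keep_repeated_two_pass and the file name sample_keys.txt are made up for illustration and are not part of the original post.

```shell
# Hypothetical sample data, copied from the first post of the thread.
cat > sample_keys.txt <<'EOF'
1234|1|Jon|some text|some text
1234|2|Jon|some text|some text
3453|5|Jon|some text|some text
6533|2|Kate|some text|some text
4567|3|Chris|some text|some text
4567|4|Maggie|some text|some text
8764|6|Maggie|some text|some text
EOF

# Illustrative helper (name is an assumption): the same file is passed to awk twice.
keep_repeated_two_pass() {
    awk -F'|' '
        NR == FNR { count[$3]++; next }   # first read: only count each column-3 key
        count[$3] > 1                     # second read: print lines whose key repeats
    ' "$1" "$1"
}

keep_repeated_two_pass sample_keys.txt
```

Because the input is read twice, this approach needs a regular file; it will not work on data arriving through a pipe. In exchange, the output keeps the original line order.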

The Following User Says Thank You to Ygor For This Useful Post:
A-V (06-12-2012)
    #3  
Old 06-12-2012
A-V A-V is offline
Registered User
 
Join Date: May 2012
Posts: 103
Thanks: 54
Thanked 2 Times in 2 Posts
Unfortunately, it doesn't do anything on my file.
Even when I redirect the output to a file, the file comes back empty.

Oops, my mistake.
I only gave file1 once.
Why do we need to give it twice? Is it for comparison?

Thanks for your help.

---------- Post updated at 08:31 AM ---------- Previous update was at 08:16 AM ----------

I tested it on my documents.

Somehow it does not delete all the single lines, so I still have unique data.
On the other hand, it also deletes one row from the non-unique ones: if I have two James entries in the file, the output has only one James.

Any suggestions?

Last edited by A-V; 06-12-2012 at 09:23 AM..
    #4  
Old 06-12-2012
rangarasan's Avatar
Registered User
 
Join Date: Jul 2011
Location: Chennai, India
Posts: 484
Thanks: 9
Thanked 119 Times in 115 Posts
awk

Hi,

Try this one,


Code:
awk 'BEGIN{FS="|";}{a[$3]++;if(a[$3]==2)print v[$3] ORS $0;if(a[$3]>2)print;v[$3]=$0;}' file

Cheers,
Ranga
The Following User Says Thank You to rangarasan For This Useful Post:
A-V (06-12-2012)
    #5  
Old 06-12-2012
A-V A-V is offline
Registered User
 
Join Date: May 2012
Posts: 103
Thanks: 54
Thanked 2 Times in 2 Posts
Dear Ranga

It worked like a charm.

Thank you so much
Cheers
A-V
    #6  
Old 06-13-2012
Registered User
 
Join Date: Nov 2011
Posts: 76
Thanks: 14
Thanked 1 Time in 1 Post
Delete rows with unique value for specific column

Hi Ranga,

Good one, but can you please explain completely how this logic works?


Code:
awk 'BEGIN{FS="|";}{a[$3]++;if(a[$3]==2)print v[$3] ORS $0;if(a[$3]>2)print;v[$3]=$0;}' file

Thanks
Krsna
The Following User Says Thank You to krsnadasa For This Useful Post:
A-V (06-14-2012)
    #7  
Old 06-13-2012
rangarasan's Avatar
Registered User
 
Join Date: Jul 2011
Location: Chennai, India
Posts: 484
Thanks: 9
Thanked 119 Times in 115 Posts
awk

Quote:
Originally Posted by krsnadasa View Post
Hi Ranga,

Good one, but can you please explain completely how this logic works?


Code:
awk 'BEGIN{FS="|";}{a[$3]++;if(a[$3]==2)print v[$3] ORS $0;if(a[$3]>2)print;v[$3]=$0;}' file

Thanks
Krsna

Code:
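a[$3]++; - Count how many times each name (column 3, the key) has been seen so far.
v[$3]=$0; - Store the current line under its key, so it is available as the previous line the next time that key appears.
if(a[$3]==2) - The key has now been seen exactly twice, so print the stored
previous line (first occurrence) and the current line (second occurrence).
if(a[$3]>2) - The key has been seen more than twice, so just print the current line.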
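
The explanation above can also be read off a spelled-out version of the one-liner. This is only a sketch of the same logic with longer variable names; the function name keep_repeated_one_pass and the file name sample_one_pass.txt are invented for illustration. Unlike the one-liner, it stores a line only the first time its key is seen, which is all the logic needs.

```shell
# Hypothetical sample data, copied from the first post of the thread.
cat > sample_one_pass.txt <<'EOF'
1234|1|Jon|some text|some text
1234|2|Jon|some text|some text
3453|5|Jon|some text|some text
6533|2|Kate|some text|some text
4567|3|Chris|some text|some text
4567|4|Maggie|some text|some text
8764|6|Maggie|some text|some text
EOF

# Illustrative helper (name is an assumption): reads its input only once.
keep_repeated_one_pass() {
    awk -F'|' '
    {
        count[$3]++
        if (count[$3] == 1) {
            first[$3] = $0        # hold back the first line for this key
        } else if (count[$3] == 2) {
            print first[$3]       # key repeats: release the held-back line
            print                 # and print the current (second) line
        } else {
            print                 # third and later occurrences
        }
    }' "$@"
}

keep_repeated_one_pass sample_one_pass.txt
```

Since the first occurrence of a key is printed only once its second occurrence shows up, the output order can differ from the input order when other keys sit in between. The upside is that the input is read a single time, so the function also works on piped data.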

Hope I explained it clearly.

Cheers,
Ranga
The Following User Says Thank You to rangarasan For This Useful Post:
A-V (06-14-2012)
All times are GMT -4. The time now is 04:55 PM.