    #1  
12-11-2012
sudon't
Remove Doubles Without Sort?

Hi!
I have concatenated two wordlist files (one word per line). The new file contains some doubles, but I can't use sort and uniq: I need to keep the order the file is already in, which is not alphabetical, and uniq only compares adjacent lines, whereas my doubles are not adjacent. Is there another simple way to remove doubles without altering the order? Unfortunately, there is no common pattern I can use to pick them out.
    #2  
12-11-2012
Yoda

Code:
awk '!arr[$0]++' wordlist_file
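
To see the effect, here is a small sketch on made-up input. The sample words and the wrapper name dedupe are only for illustration and are not from the original post:

```shell
# Hypothetical helper wrapping the one-liner above.
# With no file arguments, awk reads standard input.
dedupe() {
    awk '!arr[$0]++' "$@"
}

# Made-up wordlist: non-alphabetical order, doubles not on adjacent lines.
printf '%s\n' pear apple pear fig apple | dedupe
# prints pear, apple, fig (one per line, original order kept)
```

Only the first copy of each word survives, and nothing is reordered.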

    #3  
12-11-2012
sudon't
Quote:
Originally Posted by bipinajith
Code:
awk '!arr[$0]++' wordlist_file

Hey bipinajith, thanks for your reply! Would you mind explaining how that pattern works? I thought I knew a little about regexes, but I've never seen anything like that.
    #4  
12-11-2012
rdcwayx
see the explanation:
http://www.unix.com/302678079-post2.html
    #5  
12-11-2012
Corona688
Quote:
Originally Posted by sudon't
I thought I knew a little about regexes, but I've never seen anything like that.
I'd be more worried if you had, as it's not a regex. It's more like C than anything.

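It's an array with a string as the index; here the index is the whole line, $0. An element that has never been set counts as zero, so !arr[$0] is true the first time a line appears. The ++ then bumps the count, which makes the test false for every later copy. Since there's no action block, awk prints the line whenever the test is true: the first occurrence is printed, repeats are skipped.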
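
If the terse form is hard to read, a longhand version may help. This is only a sketch of the same idea; the names dedupe_long and seen are made up for illustration:

```shell
# Longhand sketch of: awk '!arr[$0]++'
# (dedupe_long and seen are illustrative names, not from the thread)
dedupe_long() {
    awk '{
        if (seen[$0] == 0) {   # unset elements compare equal to 0, so this is the first sighting
            print $0
        }
        seen[$0]++             # count the line so later copies fail the test
    }' "$@"
}

# Made-up sample input.
printf '%s\n' pear apple pear fig apple | dedupe_long
# prints pear, apple, fig (one per line)
```

The one-liner folds the test and the increment into a single expression and relies on awk's default action, which is to print the line.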
The Following User Says Thank You to Corona688 For This Useful Post:
sudon't (12-12-2012)
    #6  
12-11-2012
sudon't
Quote:
Originally Posted by rdcwayx
Whew! I kinda think I get it. At least, until I try to type out my own explanation. You know, I think I'm going to read something about awk and come back tomorrow.
    #7  
12-11-2012
jim mcnamara
Look up associative arrays: see the "Associative array" article on Wikipedia.
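
As a rough illustration of an associative array in awk, the sketch below counts how often each word occurs, using the word itself as the index. The function name count_words and the sample words are made up; the output is piped through sort because awk does not guarantee any particular order for a for-in loop:

```shell
# count_words is a hypothetical name; count is an array keyed by the line text.
count_words() {
    awk '{ count[$0]++ }
         END { for (w in count) print w, count[w] }' "$@" | sort
}

# Made-up sample input.
printf '%s\n' pear apple pear fig apple pear | count_words
# prints:
# apple 2
# fig 1
# pear 3
```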
The Following User Says Thank You to jim mcnamara For This Useful Post:
sudon't (12-12-2012)