The UNIX and Linux Forums  
  #1 (permalink)  
08-17-2005
Gerry405
searching text files on specific columns for duplicates

Is it possible to search through a large file full of rows and columns of text and retrieve only the rows that contain duplicate fields?

I am searching for duplicates on col4 & col6: a row counts as a duplicate when both of those values also appear together in another row.

Sample below

Col1 col2 col3 col4 col5 col6
G405H SURG FERGUSON SG00308258 01/16/52 GGHB
G405H ORTHO FERGUSON SG00308258 05/21/23 A&C
G405H ENT HOUGHTON SG03102407 04/22/70 GGHB
G405H ENT HOUGHTON SG00308258 10/08/60 GGHB
G405H GYN TAGGART SG03132070 05/15/53 GGHB

I would expect the output to be

G405H SURG FERGUSON SG00308258 01/16/52 GGHB
G405H ENT HOUGHTON SG00308258 10/08/60 GGHB
  #2 (permalink)  
08-17-2005
jim mcnamara (Forum Staff)
input file = filename
Code:
G405H SURG FERGUSON SG00308258 01/16/52 GGHB
G405H ORTHO FERGUSON SG00308258 05/21/23 A&C
G405H ENT HOUGHTON SG03102407 04/22/70 GGHB
G405H ENT HOUGHTON SG00308258 10/08/60 GGHB
G405H GYN TAGGART SG03132070 05/15/53 GGHB
output
Code:
 
G405H ENT HOUGHTON SG00308258 10/08/60 GGHB
G405H SURG FERGUSON SG00308258 01/16/52 GGHB
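Code:
sort -k 4.1,4.10 -k 6.1,6.4 filename |
awk '{
    key = $4 SUBSEP $6          # composite key from col4 and col6
    if (key in first) {         # pair seen before, so this row is a duplicate
        print first[key]        # first row seen with this pair
        print $0                # current row
    } else {
        first[key] = $0         # remember the first row for this pair
    }
}' | sort -u                    # awk reads the sorted stdin; sort -u drops repeated lines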

Last edited by jim mcnamara; 08-17-2005 at 05:45 PM..
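The same result can be had without sort at all, which also keeps the rows in their original order. The following is only a sketch, not part of the original answer: it assumes a POSIX-style awk (one that supports FNR and SUBSEP), and the function name find_dups and the temporary file are made up for illustration. It reads the file twice: the first pass counts each col4/col6 pair, the second prints the rows whose pair occurred more than once.

```shell
# Sample input from the thread, written to a temporary file.
input=$(mktemp)
trap 'rm -f "$input"' EXIT
cat > "$input" <<'EOF'
G405H SURG FERGUSON SG00308258 01/16/52 GGHB
G405H ORTHO FERGUSON SG00308258 05/21/23 A&C
G405H ENT HOUGHTON SG03102407 04/22/70 GGHB
G405H ENT HOUGHTON SG00308258 10/08/60 GGHB
G405H GYN TAGGART SG03132070 05/15/53 GGHB
EOF

# find_dups FILE (hypothetical helper)
# Pass 1 (NR == FNR): count how often each col4/col6 pair occurs.
# Pass 2: print every row whose pair occurred more than once.
find_dups() {
    awk 'NR == FNR { count[$4 SUBSEP $6]++; next }
         count[$4 SUBSEP $6] > 1' "$1" "$1"
}

find_dups "$input"
```

Because only the pair counts are held in memory, not the rows themselves, this should stay manageable on a large file as long as the number of distinct pairs is reasonable.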
  #3 (permalink)  
08-18-2005
Gerry405
sorting

Jim,

Once again, a big thanks to you. Unfortunately for me, though, I use Data General UNIX, and its sort command doesn't seem to have the -k option, but I'll play around with it once I have figured out which parts of your code do what. The other thing is that the real input file has many date columns, commas, etc., which makes it a little more complicated.

cheers

Last edited by Gerry405; 08-18-2005 at 11:12 AM..
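If the commas mentioned above are field separators, one possible adaptation is to tell awk to split on commas and to pass the two column numbers in as variables. This is a sketch under several assumptions: the real file layout is not shown in the thread, so the comma-separated sample below is invented from the earlier data; no field contains a quoted or embedded comma; the awk in use is POSIX-style (supports -v, FNR and SUBSEP); and find_dups_csv is a hypothetical name. It does not use sort, so the missing -k option does not matter here.

```shell
# Hypothetical comma-separated version of the sample data.
csv=$(mktemp)
trap 'rm -f "$csv"' EXIT
cat > "$csv" <<'EOF'
G405H,SURG,FERGUSON,SG00308258,01/16/52,GGHB
G405H,ORTHO,FERGUSON,SG00308258,05/21/23,A&C
G405H,ENT,HOUGHTON,SG03102407,04/22/70,GGHB
G405H,ENT,HOUGHTON,SG00308258,10/08/60,GGHB
G405H,GYN,TAGGART,SG03132070,05/15/53,GGHB
EOF

# find_dups_csv FILE COL_A COL_B (hypothetical helper)
# Reads FILE twice: the first pass counts each COL_A/COL_B pair,
# the second prints the rows whose pair occurred more than once.
find_dups_csv() {
    awk -F',' -v a="$2" -v b="$3" '
        NR == FNR { count[$a SUBSEP $b]++; next }
        count[$a SUBSEP $b] > 1' "$1" "$1"
}

find_dups_csv "$csv" 4 6
```

Changing the last two arguments selects different columns, for example when extra date columns shift the fields of interest to other positions.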