searching text files on specific columns for duplicates


 
# 1  
Old 08-17-2005
searching text files on specific columns for duplicates

Is it possible to search through a large file of rows and columns of text and retrieve only the rows that contain duplicate fields?

I am searching for duplicates on col4 and col6, i.e. rows where both col4 and col6 match another row.

Sample below

Col1 col2 col3 col4 col5 col6
G405H SURG FERGUSON SG00308258 01/16/52 GGHB
G405H ORTHO FERGUSON SG00308258 05/21/23 A&C
G405H ENT HOUGHTON SG03102407 04/22/70 GGHB
G405H ENT HOUGHTON SG00308258 10/08/60 GGHB
G405H GYN TAGGART SG03132070 05/15/53 GGHB

I would expect the output to be

G405H SURG FERGUSON SG00308258 01/16/52 GGHB
G405H ENT HOUGHTON SG00308258 10/08/60 GGHB
# 2  
Old 08-17-2005
input file = filename
Code:
G405H SURG FERGUSON SG00308258 01/16/52 GGHB
G405H ORTHO FERGUSON SG00308258 05/21/23 A&C
G405H ENT HOUGHTON SG03102407 04/22/70 GGHB
G405H ENT HOUGHTON SG00308258 10/08/60 GGHB
G405H GYN TAGGART SG03132070 05/15/53 GGHB

output
Code:
 
G405H ENT HOUGHTON SG00308258 10/08/60 GGHB
G405H SURG FERGUSON SG00308258 01/16/52 GGHB

Code:
 
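# Sort on col4 and col6 (optional here: awk matches rows by key, not by adjacency)
sort -k 4.1,4.10 -k 6.1,6.4 filename |
awk '{
    key = $4 SUBSEP $6                  # SUBSEP keeps "AB"+"C" distinct from "A"+"BC"
    if (key in arr)                     # key seen before: print the first row and this one
        { print arr[key]; print $0 }
    else
        { arr[key] = $0 }               # first sighting: remember the row
}' | sort -u                            # awk reads the pipe; sort -u drops repeated rows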
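
A variant that avoids sort altogether may be useful on systems whose sort lacks -k. The sketch below assumes a POSIX-style awk (one that supports FNR and shell functions are available in the login shell); the function name dup_rows is made up for illustration. It reads the file twice: the first pass counts each col4/col6 pair, the second prints every row whose pair occurred more than once, in the original input order.

```shell
# dup_rows FILE  (hypothetical helper name)
# FILE is read twice, so it must be a regular file, not a pipe.
dup_rows() {
    awk '
        NR == FNR { count[$4, $6]++; next }   # pass 1: count each (col4, col6) pair
        count[$4, $6] > 1                     # pass 2: print rows whose pair repeats
    ' "$1" "$1"
}
```

Called as `dup_rows filename` on the sample data, this prints the SURG/FERGUSON row followed by the last ENT/HOUGHTON row, which is the order given in the original question.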


Last edited by jim mcnamara; 08-17-2005 at 06:45 PM..
# 3  
Old 08-18-2005
sorting

Jim,

Once again, a big thanks to you. Unfortunately for me, though, I use Data General Unix and the commands don't seem to have the -k option, but I'll play around with it once I have figured out what each part of your code is doing. The other thing is that the real input file has many date columns, commas, etc., which makes it a little more complicated.
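
For a comma-separated file, the same idea can be sketched by setting awk's field separator with -F and passing the key column numbers in as variables. The layout of the real file is not shown in this thread, so the function name dup_rows_csv, the separator, and the column numbers are all assumptions. It also splits on every comma, so it will not cope with quoted fields that contain commas.

```shell
# dup_rows_csv FILE KEYCOL1 KEYCOL2  (hypothetical helper name)
# Assumes plain comma-separated fields with no quoted commas.
dup_rows_csv() {
    awk -F',' -v a="$2" -v b="$3" '
        NR == FNR { count[$a, $b]++; next }   # pass 1: count each key pair
        count[$a, $b] > 1                     # pass 2: print rows whose pair repeats
    ' "$1" "$1"
}
```

For example, `dup_rows_csv data.csv 4 6` would report rows that share both their 4th and 6th comma-separated fields with another row; no sort is involved, so the missing -k option does not matter.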

cheers

Last edited by Gerry405; 08-18-2005 at 12:12 PM..
 