Match duplicate ids in two files


 
# 1  
Old 07-03-2017

I have two text files. File 1 has 150 ids, and every id appears twice, so it has 300 lines in total. File 2 has 1500 ids, and every id there also appears twice. I want to match the first occurrence of every id in file 1 with the first occurrence of that id in file 2, and the second occurrence of the id in file 1 with the second occurrence in file 2. Based on the value in column 2, the output should say match or mismatch. I am looking for an awk or sed solution.

Code:
File1
1 12
1 13
2 15
2 16
4 15 
4 18

File2
1 13
1 13
2 15
2 17
3 12
3 12
4 15 
4 18
5 14
5 14

Desired output (Id, col 2 from file 1, col 2 from file 2, match or mismatch)
1 12 13 mismatch
1 13 13 match
2 15 15 match
2 16 17 mismatch
4 15 15 match
4 18 18 match

# 2  
Old 07-03-2017
Hello limd,

Could you please try the following and let me know if it helps?
Code:
awk 'FNR==NR{A[$1,$2]=$2;B[$1]=$2;next} (($1,$2) in A){print $0,A[$1,$2],"match";next} ($1 in B){print $0,B[$1],"mismatch"}' OFS="\t" Input_file2 Input_file1

Thanks,
R. Singh

Last edited by RavinderSingh13; 07-03-2017 at 05:37 AM..
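
A note on the one-liner above: it looks up the pair (id, value) from the second file and does not track which occurrence of the id a line is. It reproduces the sample output, but it would also report a match if the same two values appeared in swapped order in the two files. Below is a minimal sketch of an occurrence-indexed variant. It is not the poster's code; the helper name match_by_occurrence, the array val2 and the counters seen1 and seen2 are hypothetical names chosen for illustration, and it assumes whitespace-separated input with the id in column 1.

```shell
# Sample data from the first post, written to a scratch directory.
tmp=$(mktemp -d)
printf '%s\n' '1 12' '1 13' '2 15' '2 16' '4 15' '4 18' > "$tmp/file1"
printf '%s\n' '1 13' '1 13' '2 15' '2 17' '3 12' '3 12' \
              '4 15' '4 18' '5 14' '5 14' > "$tmp/file2"

# match_by_occurrence FILE1 FILE2   (hypothetical helper name)
match_by_occurrence() {
    awk '
    # awk reads FILE2 first: store column 2 under (id, occurrence number).
    FNR == NR { val2[$1, ++seen2[$1]] = $2; next }
    # Then stream FILE1 in its original order and compare the k-th
    # occurrence of each id with the k-th occurrence stored from FILE2.
    # Lines with no counterpart in FILE2 are skipped silently.
    {
        k = ++seen1[$1]
        if (($1, k) in val2)
            print $1, $2, val2[$1, k], ($2 == val2[$1, k] ? "match" : "mismatch")
    }
    ' "$2" "$1"
}

match_by_occurrence "$tmp/file1" "$tmp/file2"
```

Because file1 is streamed rather than collected, the output keeps the line order of file1, and only file2 is held in memory.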
# 3  
Old 07-03-2017
Try
Code:
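awk '
NR==FNR         {T1[$1] = T1[$1] $2 FS          # file1: append col 2 to the list for this id
                 next
                }
                {T2[$1] = T2[$1] $2 FS          # file2: same, in a second array
                }
END             {for (t in T1)  {n = split (T1[t], X1)
                                 m = split (T2[t], X2)
                                 # compare the i-th value from file1 with the i-th from file2
                                 for (i=1; i<=n; i++) print t, X1[i], X2[i], (X1[i]!=X2[i]?"mis":"") "match"
                                }
                }
' file1 file2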
1 12 13 mismatch
1 13 13 match
2 15 15 match
2 16 17 mismatch
4 15 15 match
4 18 18 match
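
One caveat with the END loop above: awk does not define the order in which `for (t in T1)` visits the ids, so the output order can differ between awk implementations and on larger inputs. The following is a minimal sketch of the same collect-then-split approach that records ids in first-seen order. The helper name match_in_file_order, the order array, the nids counter and the "-" placeholder for a missing occurrence in file2 are assumptions added for illustration, not part of the post.

```shell
# Sample data from the first post, written to a scratch directory.
tmp=$(mktemp -d)
printf '%s\n' '1 12' '1 13' '2 15' '2 16' '4 15' '4 18' > "$tmp/file1"
printf '%s\n' '1 13' '1 13' '2 15' '2 17' '3 12' '3 12' \
              '4 15' '4 18' '5 14' '5 14' > "$tmp/file2"

# match_in_file_order FILE1 FILE2   (hypothetical helper name)
match_in_file_order() {
    awk '
    NR == FNR { if (!($1 in T1)) order[++nids] = $1   # remember ids in first-seen order
                T1[$1] = T1[$1] $2 FS                 # file1 values per id
                next }
              { T2[$1] = T2[$1] $2 FS }               # file2 values per id
    END {
        for (j = 1; j <= nids; j++) {                 # walk ids in file1 order
            t = order[j]
            n = split(T1[t], X1)
            m = split(T2[t], X2)
            for (i = 1; i <= n; i++)
                print t, X1[i], (i <= m ? X2[i] : "-"), \
                      (i <= m && X1[i] == X2[i] ? "match" : "mismatch")
        }
    }
    ' "$1" "$2"
}

match_in_file_order "$tmp/file1" "$tmp/file2"
```

With the sample files this prints the six lines of the desired output in the order the ids appear in file1. An occurrence that exists in file1 but not in file2 is shown with "-" in the third column and reported as a mismatch.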
