Post 302596903 by jacobs.smith on Thursday 9th of February 2012 04:20:11 PM
02-09-2012
Common records after matching on different columns

Hi,

I have the following files.

cat 1.txt

Quote:
chr1 100 200
chr1 200 300
chr1 1000 1200
chr2 300 400
chr2 400 500
chr2 600 900
chr2 1200 1800
chrz 100 200
chrz 300 400
chrz 400 500

cat 2.txt

Quote:
chr1 100 200
chr1 130 220
chr1 498 600
chr1 700 820
chr1 1499 1600
chr1 1800 1920
chr2 301 330
chr2 600 700
chrz 1000 1350
chrz 420 465

Expected output.txt

Quote:
chr1 100 200 12.txt (because this record comes from both 1.txt and 2.txt)
chr1 200 300 1.txt
chr1 1000 1200 1.txt
chr1 130 220 2.txt
chr1 498 600 2.txt
chr1 700 820 2.txt
chr1 1499 1600 2.txt
chr2 300 400 1.txt
chr2 400 500 1.txt
chr2 600 900 1.txt
chr2 301 330 2.txt
chr2 600 700 2.txt
chrz 100 200 1.txt
chrz 300 400 1.txt
chrz 400 500 1.txt
chrz 420 465 2.txt



The logic is as follows:

Column 1 must match: chr1 in column 1 of file1 should be matched only to chr1 in column 1 of file2.

If the value in column 2 of file2 is equal to, or within plus/minus 300 of, the value in column 2 of file1, the records match and should be printed (i.e., if column 2 of file1 is 500, then column 2 of file2 can be anywhere from 200 to 800).

If the value in column 3 of file2 is equal to, or within plus/minus 300 of, the value in column 3 of file1, the records match and should be printed (i.e., if column 3 of file1 is 800, then column 3 of file2 can be anywhere from 500 to 1100).
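The plus/minus 300 window in the two rules above can be sketched in Python as follows. This is only an illustration: the name `within_window` is made up for the example, and treating the boundaries as inclusive (so 200 and 800 both count for a value of 500) is an assumption, since the post does not say either way.

```python
def within_window(value, other, tol=300):
    """Return True when `other` lies within plus/minus `tol` of `value`.

    Boundaries are treated as inclusive (an assumption, see above).
    """
    return abs(value - other) <= tol


if __name__ == "__main__":
    # Column 2 example from the post: a file1 value of 500 accepts 200..800.
    candidates = (199, 200, 500, 800, 801)
    print([v for v in candidates if within_window(500, v)])
```

The same check applies unchanged to column 3; only the pair of values compared differs.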

Also, any record that falls within the range between column 2 and column 3 of another record should be printed.
Ex: If file1 has the record chr2 300 400, and file2 has the record chr1 301 383, both of them should be printed.

Every record in one file is compared against every record in the other file.

I am looking for something that also works across more than two files.
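A minimal sketch of one way to read the rules above, written in Python and not tied to two files. Everything here is an assumption made for illustration: the function names (`parse`, `matches`, `common_records`), the inclusive 300 window, reading "in the range of column2 and column3" as one interval lying inside the other, and building the label by concatenating file names (which is how "12.txt" appears in the desired output).

```python
from collections import defaultdict

TOL = 300  # window size stated in the post


def parse(text):
    """Parse whitespace-separated 'chrom start end' lines into tuples."""
    records = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 3:
            records.append((fields[0], int(fields[1]), int(fields[2])))
    return records


def matches(a, b, tol=TOL):
    """Literal reading of the rules: same chromosome, and either the starts
    are within tol, the ends are within tol, or one interval lies inside
    the other (the containment reading is an assumption)."""
    if a[0] != b[0]:
        return False
    near_start = abs(a[1] - b[1]) <= tol
    near_end = abs(a[2] - b[2]) <= tol
    nested = (a[1] <= b[1] and b[2] <= a[2]) or (b[1] <= a[1] and a[2] <= b[2])
    return near_start or near_end or nested


def common_records(files, tol=TOL):
    """files maps a short name (e.g. '1' for 1.txt) to its list of records.

    Returns (record, label) pairs for every record that matches a record
    in at least one other file. A record present verbatim in several files
    is reported once, labelled with all of their names (e.g. '12.txt').
    """
    sources = defaultdict(list)  # record -> names of files containing it
    for name, records in files.items():
        for rec in records:
            if name not in sources[rec]:
                sources[rec].append(name)

    kept, seen = [], set()
    for name, records in files.items():
        for rec in records:
            if rec in seen:
                continue
            # Compare against every record of every *other* file.
            others = (o for other, recs in files.items()
                      if other != name for o in recs)
            if any(matches(rec, o, tol) for o in others):
                seen.add(rec)
                kept.append((rec, "".join(sources[rec]) + ".txt"))
    return kept


if __name__ == "__main__":
    files = {
        "1": parse("chr1 100 200\nchr2 1200 1800"),
        "2": parse("chr1 100 200\nchr1 130 220\nchr2 600 700"),
    }
    for (chrom, start, end), label in common_records(files):
        print(chrom, start, end, label)
```

Adding a third file is just another entry in the `files` dictionary. Results come out in file order rather than grouped by chromosome, and every record is compared with every record in the other files, so the run time grows quadratically with the input size. On the sample data in this post, this literal reading keeps every line of the desired output except chr1 1499 1600, which suggests the intended rule may also compare the end of one record with the start of another; that is a guess, not something the post states.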

Thanks a ton in advance. I know it is a pain. But, please help me.

---------- Post updated 02-09-12 at 09:59 AM ---------- Previous update was 02-08-12 at 02:18 PM ----------

Please guys. Someone help me out.

---------- Post updated at 04:19 PM ---------- Previous update was at 09:59 AM ----------

Any thoughts by anyone?

