Matching column and search closest elements


 
# 1  
Old 07-09-2014

Hi all,
I have a problem that I have not been able to solve.
Briefly, I have a file (file1) like this:

Code:
ID_1 chr1 100 -
ID_2 chr2 300 +

and another file (file2) like this:

Code:
name_1 chr1 150 no -
name_2 chr1 250 yes -
name_3 chr2 350 yes +
name_4 chr2 280 yes +

For each entry in file1, I would like to find the closest feature in file2, where closeness is measured on column 3.
For instance, for the first entry in file1, I would like to find the element in file2 that is closest to "chr1 100" (the second column must match).
Moreover, I would like to consider only the elements in file2 whose 4th column is "yes" (or at least be able to choose this value as a parameter) and whose 5th column matches the 4th column of the entry in file1 (ideally this check would also be optional).

For the example above, the output file (when the 4th column of file2 must match "yes") should look like this:
Code:
ID_1 chr1 100 - name_2 chr1 250 yes - 150 2
ID_2 chr2 300 + name_4 chr2 280 yes + -20 1

So I would like to output every entry in file1 together with its closest feature in file2 and, in the last two columns, report the distance between the two column-3 values and the position at which the match was found; for entry 1, for example, the closest "yes" feature is the second one met.
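
The selection described above could be sketched roughly as follows. This is only a sketch in Python, not a tested solution for real data. The function names (parse, closest_features) and the parameters required_flag and match_strand are made up for illustration. It assumes whitespace-separated columns, that the last output column is the rank of the chosen feature among all file2 features on the same chromosome ordered by absolute distance, and that file1 entries with no eligible feature are simply skipped.

Code:
```python
FILE1 = """\
ID_1 chr1 100 -
ID_2 chr2 300 +
"""

FILE2 = """\
name_1 chr1 150 no -
name_2 chr1 250 yes -
name_3 chr2 350 yes +
name_4 chr2 280 yes +
"""


def parse(text):
    """Split whitespace-separated text into a list of rows (lists of strings)."""
    return [line.split() for line in text.splitlines() if line.strip()]


def closest_features(file1_rows, file2_rows, required_flag="yes", match_strand=True):
    """For each file1 row, pick the nearest eligible file2 row on the same chromosome.

    required_flag: value the 4th column of file2 must have (None disables the check).
    match_strand:  if True, the 5th column of file2 must equal the 4th column of file1.
    Both parameters are hypothetical names chosen for this sketch.
    """
    results = []
    for row1 in file1_rows:
        chrom, pos, strand = row1[1], int(row1[2]), row1[3]
        # All file2 rows on the same chromosome, nearest first.
        # sorted() is stable, so ties keep their order from file2.
        candidates = sorted(
            (row2 for row2 in file2_rows if row2[1] == chrom),
            key=lambda row2: abs(int(row2[2]) - pos),
        )
        for rank, row2 in enumerate(candidates, start=1):
            if required_flag is not None and row2[3] != required_flag:
                continue
            if match_strand and row2[4] != strand:
                continue
            distance = int(row2[2]) - pos  # signed: file2 position minus file1 position
            results.append(list(row1) + list(row2) + [str(distance), str(rank)])
            break  # assumption: only the single closest eligible feature is reported
    return results


if __name__ == "__main__":
    for row in closest_features(parse(FILE1), parse(FILE2)):
        print(" ".join(row))
```

Passing required_flag=None or match_strand=False switches off the corresponding check, which is one way to keep both conditions optional.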

I really hope that my explanation was clear.

If you need further information, let me know.
# 2  
Old 07-09-2014
What have you tried so far?
# 3  
Old 07-09-2014
I don't even know where to start with something like this!
:-(