Hi all
I have a great challenge that I am not able to resolve.
Briefly, I have a file like this:
and another file like this:
Well, for each entry in file1 I would like to find the closest feature (by column 3) in file2.
For instance, for entry1 in file1, I would like to find the element in file2 that is closest to "chr1 100" (the second column must match).
Moreover, I would like to consider only the elements in file2 whose 4th column is "yes" (or at least be able to choose this parameter) and whose 5th column matches the entry in file1 (again, ideally I could choose whether this applies).
The output file for the example above (if the 4th column must match) should look like this:
So I would like to output every entry in file1 with its closest feature in file2 and report, in the last two columns, the distance between the column 3 values and, for example for entry1, that the closest "yes" feature is the second one encountered.
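Since the sample files are not shown, the following is only a minimal Python sketch of the lookup described above. The field layout is an assumption: both record types carry a name, a chromosome (column 2), and a position (column 3); `flag` stands for column 4 of file2 and `group` is a hypothetical name for whatever value column 5 of file2 is compared against. Both filters can be switched off.

```python
from typing import NamedTuple, Optional, Sequence, Tuple


class Entry(NamedTuple):      # one row of file1 (assumed layout)
    name: str
    chrom: str                # column 2
    pos: int                  # column 3
    group: str                # hypothetical: value compared with file2 column 5


class Feature(NamedTuple):    # one row of file2 (assumed layout)
    name: str
    chrom: str                # column 2
    pos: int                  # column 3
    flag: str                 # column 4, e.g. "yes"
    group: str                # column 5


def closest_feature(
    entry: Entry,
    features: Sequence[Feature],
    flag: Optional[str] = "yes",
    match_group: bool = True,
) -> Optional[Tuple[Feature, int]]:
    """Return (feature, distance) for the nearest usable feature, or None."""
    best = None
    for feature in features:
        if feature.chrom != entry.chrom:
            continue                      # column 2 must always match
        if flag is not None and feature.flag != flag:
            continue                      # optional column 4 filter
        if match_group and feature.group != entry.group:
            continue                      # optional column 5 filter
        distance = abs(feature.pos - entry.pos)
        if best is None or distance < best[1]:
            best = (feature, distance)    # ties keep the first one seen
    return best
```

This scans every feature for every entry, which is fine for small files; for large inputs, grouping features by chromosome and sorting by position first would be the usual next step.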
I would like to find the files in a directory that are less than 2 days old and put them into an array variable. Then I want to search each file in the array for a matching string, say "Return-code= 0". If it matches, display the array element with the message "OK".
Your help will be greatly... (1 Reply)
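One possible way to approach this, sketched in Python rather than shell. The function names `recent_files` and `report_ok` are made up for illustration, and "less than 2 days old" is assumed to mean the modification time.

```python
import time
from pathlib import Path


def recent_files(directory, max_age_days=2.0, now=None):
    """Return regular files in `directory` modified within the last max_age_days."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds per day
    return sorted(
        p for p in Path(directory).iterdir()
        if p.is_file() and p.stat().st_mtime >= cutoff
    )


def report_ok(files, needle="Return-code= 0"):
    """Return one 'name: OK' message per file that contains `needle`."""
    messages = []
    for path in files:
        # errors="replace" keeps the scan going on files with odd encodings
        if needle in path.read_text(errors="replace"):
            messages.append(f"{path.name}: OK")
    return messages
```

A typical call would be `print("\n".join(report_ok(recent_files("/some/dir"))))`, where `/some/dir` is a placeholder path.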
I need to use array elements while pattern matching.
@myarr = (ELEM1, ELEM2, ELEM3);
The following is the statement I am using in my code. Basically, I want to replace ELEM1/2/3 with something else, shown here as REPL.
if (condition) {
s/(ELEM1|ELEM2|ELEM3)/REPL/;
}
I... (3 Replies)
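The general idea of building the alternation from the array instead of hard-coding it can be sketched as follows. This is in Python rather than Perl, and `replace_elements` is a hypothetical helper name, not something from the thread.

```python
import re


def replace_elements(text, elements, replacement):
    """Replace every occurrence of any string in `elements` with `replacement`."""
    if not elements:
        return text  # an empty alternation would match at every position
    # Escape each element so regex metacharacters are taken literally, and try
    # longer elements first so "ELEM10" is not shadowed by "ELEM1".
    ordered = sorted(elements, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(e) for e in ordered))
    # A function replacement avoids backslash interpretation in `replacement`.
    return pattern.sub(lambda match: replacement, text)
```

For example, `replace_elements("ELEM1 and ELEM3", ["ELEM1", "ELEM2", "ELEM3"], "REPL")` gives `"REPL and REPL"`.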
I have a file that looks like this
452 025_E3
8 025_E3
82 025_F5
135 025_F5
5 025_F5
23 025_G2
38 025_G2
71 025_G2
9 026_A12
81 026_A12
10 026_A12
some of the elements in column2 are repeated.
I want an output file that will extract the... (1 Reply)
I have a file like:
I would like to find lines with duplicate values in column 1 and retain only one of them, based on two conditions: 1) keep the line with the highest value in column 3; 2) if the column 3 values are equal, keep the line with the highest value in column 4.
Desired output:
I was able to... (3 Replies)
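A minimal Python sketch of the two-level rule described above, assuming each row is already split into fields and that columns 3 and 4 are numeric. `keep_best` is an illustrative name.

```python
def keep_best(rows):
    """Keep one row per column-1 value: highest column 3, then highest column 4."""
    best = {}
    for row in rows:
        key = row[0]
        # Tuples compare element by element, so column 4 only breaks ties.
        rank = (float(row[2]), float(row[3]))
        if key not in best or rank > best[key][0]:
            best[key] = (rank, row)
    # Dicts preserve insertion order, so keys appear in first-seen order.
    return [row for _, row in best.values()]
```

If two rows tie on both columns, the first one encountered is kept.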
Hi, I have a 3-column tab-separated file (approx. 1 GB) in which I would like to count and output the frequency of each element in the 1st column.
For instance:
If my input was the following:
dot is-big 2
dot is-round 3
dot is-gray 4
cat is-big 3
hot in-summer 5
My... (4 Replies)
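Counting first-column values can be sketched in Python as below. The desired output format is cut off in the post, so this only produces the counts. Reading line by line keeps memory use proportional to the number of distinct keys, not the file size. `first_column_counts` is an illustrative name.

```python
from collections import Counter


def first_column_counts(lines, sep=None):
    """Count how often each first-column value occurs.

    sep=None splits on any whitespace; pass "\t" for strictly tab-separated input.
    """
    counts = Counter()
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        counts[line.split(sep)[0]] += 1
    return counts
```

With a real file this could be called as `first_column_counts(open("input.tsv"), sep="\t")`, where `input.tsv` is a placeholder name. For the sample input above it counts `dot` three times and `cat` and `hot` once each.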
Good evening
I have the requirements below. I am not an expert in Linux/Unix and am looking for your ideas on how I can do this.
I have two files called file1 and file2.
I need to get the second column, which is text1_random_alphabets, and find it in file2; if it exists, then print the 3rd... (4 Replies)
Hi,
I have a tab delimited file1:
NC_013499.1 3180 3269 GQ342961.1
NC_030295.1 5925 6014 FN398100.2
NC_007915.1 6307 6396 KU529284.1
NC_013499.1 5033 5122 GQ342961.1
And a second file2:
NC_030295.1 RefSeq gene 136 5115 ... (6 Replies)
Discussion started by: Ibk
LEARN ABOUT OSX
comm
COMM(1) BSD General Commands Manual COMM(1)
NAME
comm -- select or reject lines common to two files
SYNOPSIS
comm [-123i] file1 file2
DESCRIPTION
The comm utility reads file1 and file2, which should be sorted lexically, and produces three text columns as output: lines only in file1;
lines only in file2; and lines in both files.
The filename ``-'' means the standard input.
The following options are available:
-1 Suppress printing of column 1.
-2 Suppress printing of column 2.
-3 Suppress printing of column 3.
-i Case insensitive comparison of lines.
Each column will have a number of tab characters prepended to it equal to the number of lower numbered columns that are being printed. For
example, if column number two is being suppressed, lines printed in column number one will not have any tabs preceding them, and lines
printed in column number three will have one.
The comm utility assumes that the files are lexically sorted; all characters participate in line comparisons.
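To make the column and tab behaviour described above concrete, here is a small Python sketch that imitates it on two already sorted lists of lines. It is an illustration, not the utility's actual implementation: the `-i` option is not modelled, and comparison is by code point, which corresponds to the C locale only.

```python
def comm(lines1, lines2, suppress=()):
    """Merge two sorted line lists into comm-style tab-indented output lines.

    `suppress` holds the column numbers (1, 2, 3) to hide, like -1, -2, -3.
    """
    out = []

    def emit(column, line):
        if column in suppress:
            return
        # One leading tab per lower-numbered column that is still printed.
        indent = sum(1 for c in range(1, column) if c not in suppress)
        out.append("\t" * indent + line)

    i = j = 0
    while i < len(lines1) and j < len(lines2):
        if lines1[i] < lines2[j]:
            emit(1, lines1[i])          # only in file1
            i += 1
        elif lines1[i] > lines2[j]:
            emit(2, lines2[j])          # only in file2
            j += 1
        else:
            emit(3, lines1[i])          # in both files
            i += 1
            j += 1
    for line in lines1[i:]:
        emit(1, line)
    for line in lines2[j:]:
        emit(2, line)
    return out
```

Calling `comm(["a", "b", "d"], ["b", "c"], suppress={2})` shows the case from the text: column 1 lines have no leading tab and column 3 lines have exactly one.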
ENVIRONMENT
The LANG, LC_ALL, LC_COLLATE, and LC_CTYPE environment variables affect the execution of comm as described in environ(7).
EXIT STATUS
The comm utility exits 0 on success, and >0 if an error occurs.
SEE ALSO
cmp(1), diff(1), sort(1), uniq(1)
STANDARDS
The comm utility conforms to IEEE Std 1003.2-1992 (``POSIX.2'').
The -i option is an extension to the POSIX standard.
HISTORY
A comm command appeared in Version 4 AT&T UNIX.
BUGS
Input lines are limited to LINE_MAX (2048) characters in length.
BSD January 26, 2005 BSD