Post 302888830 by data_miner on Monday 17th of February 2014 03:06:27 PM
Identify the overlapping and non overlapping regions

Code:
file1
chr	pos1	pos2	pos3	pos4
1)chr1	1000	2000	3000	4000 
2)chr1	1380	1480	6800	7800	
3)chr1	6700	7700	1200	2200	
4)chr2	8500	9500	5670	6670

Code:
file2
chr	pos1	pos2	pos3	pos4
1)chr2	8500	9500	5000	6000	
2)chr1	6700	7700	1200	2200
3)chr1	1380	1480	6700	7700
4)chr1	1000	2000	4900	5900

I have two input files, file1 and file2, each containing five columns. The first column holds the chromosome (chr1 to chr19 and chrX; only chr1 and chr2 are shown in the example).
What I want to do is:
Condition 1: if a row in file1 and a row in file2 have the same chr and their pos1-pos2 ranges overlap, then compare their pos3-pos4 ranges.
If the pos3-pos4 ranges also overlap, write the file1 row to output_1file.
If the pos3-pos4 ranges do not overlap, write the file1 row to output_2file.
So, comparing file1 with file2, the expected output is:
Code:
output_1file
2)chr1	1380	1480	6800	7800
3)chr1	6700	7700	1200	2200
4)chr2	8500	9500	5670	6670

Code:
output_2file
1)chr1	1000	2000	3000	4000

My definition of overlap:
The positions need not be exactly the same. The two ranges should share a common region of at least 1 bp (base pair).
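
One possible way to express the rules above is sketched below in Python. It is only an illustration under several assumptions that the post does not state: coordinates are treated as inclusive (so 1000-2000 and 2000-3000 share 1 bp), the leading "N)" labels are treated as annotations and stripped before comparing chromosomes, a file1 row goes to output_1file if any file2 row overlaps it on both ranges, and file1 rows with no pos1-pos2 overlap at all are written to neither file. The names `overlaps`, `parse_rows`, and `classify` are hypothetical helpers.

```python
from collections import defaultdict


def overlaps(start_a, end_a, start_b, end_b):
    """Return True if two inclusive ranges share at least one position (1 bp)."""
    return start_a <= end_b and start_b <= end_a


def parse_rows(text):
    """Parse whitespace-separated rows into (label, chrom, pos1, pos2, pos3, pos4).

    The header line is skipped. A leading "N)" row label, as used in the
    example listing, is stripped to get the chromosome name (assumption).
    """
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 5 or not all(f.isdigit() for f in fields[1:]):
            continue  # header, blank, or malformed line
        label = fields[0]
        chrom = label.split(")", 1)[-1]
        rows.append((label, chrom) + tuple(int(f) for f in fields[1:]))
    return rows


def classify(rows1, rows2):
    """Split file1 rows into (both ranges overlap, only pos1-pos2 overlaps)."""
    by_chrom = defaultdict(list)
    for row in rows2:
        by_chrom[row[1]].append(row)

    both, first_only = [], []
    for row in rows1:
        _, chrom, p1, p2, p3, p4 = row
        # file2 rows on the same chromosome whose pos1-pos2 range overlaps
        partners = [r for r in by_chrom.get(chrom, [])
                    if overlaps(p1, p2, r[2], r[3])]
        if not partners:
            continue  # assumption: such rows go to neither output
        if any(overlaps(p3, p4, r[4], r[5]) for r in partners):
            both.append(row)
        else:
            first_only.append(row)
    return both, first_only


FILE1 = """chr pos1 pos2 pos3 pos4
1)chr1 1000 2000 3000 4000
2)chr1 1380 1480 6800 7800
3)chr1 6700 7700 1200 2200
4)chr2 8500 9500 5670 6670
"""

FILE2 = """chr pos1 pos2 pos3 pos4
1)chr2 8500 9500 5000 6000
2)chr1 6700 7700 1200 2200
3)chr1 1380 1480 6700 7700
4)chr1 1000 2000 4900 5900
"""

output_1, output_2 = classify(parse_rows(FILE1), parse_rows(FILE2))
for name, rows in (("output_1file", output_1), ("output_2file", output_2)):
    print(name)
    for label, _chrom, *positions in rows:
        print("\t".join([label] + [str(p) for p in positions]))
```

On the sample data this puts rows 2), 3) and 4) of file1 in output_1file and row 1) in output_2file, matching the expected output in the post. Every file1 row is compared with every file2 row on the same chromosome, which is fine for small inputs; large files would likely need sorted input or an interval index.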
 

sort(1) 						      General Commands Manual							   sort(1)

Name
       sort - sort file data

Syntax
       sort [options] [-k keydef] [+pos1[-pos2]] [file...]

Description
       The sort command sorts lines of all the named files together and writes the result on the standard output.  The name `-' means the standard
       input.  If no input files are named, the standard input is sorted.

Options
       The default sort key is an entire line.	Default ordering is lexicographic by  bytes  in  machine  collating  sequence.	 The  ordering	is
       affected globally by the following options, one or more of which may appear.

       -b	   Ignores leading blanks (spaces and tabs) in field comparisons.

       -d	   Sorts data according to dictionary ordering:  letters, digits, and blanks only.

       -f	   Folds uppercase to lowercase while sorting.

       -i	   Ignores characters outside the ASCII range 040-0176 in nonnumeric comparisons.

       -k keydef   The keydef argument is a key field definition. The format is field_start, [field_end] [type], where field_start and field_end
		   are the definition of the restricted search key, and type is a modifier from the option list [bdfinr]. These modifiers have the
		   functionality, for this key only, that their command line counterparts have for the entire record.

       -n	   Sorts fields with numbers numerically.  An initial numeric string, consisting of optional blanks, optional minus sign, and zero
		   or more digits with optional decimal point, is sorted by arithmetic value.  (Note that -0 is taken to be equal to 0.)  Option n
		   implies option b.

       -r	   Reverses the sense of comparisons.

       -tx	   Uses specified character as field separator.

       The  notation  +pos1 -pos2 restricts a sort key to a field beginning at pos1 and ending just before pos2.  Pos1 and pos2 each have the form
       m.n, optionally followed by one or more of the options bdfinr, where m tells a number of fields to skip from the beginning of the line  and
       n tells a number of characters to skip further.	If any options are present they override all the global ordering options for this key.	If
       the b option is in effect n is counted from the first nonblank in the field; b is attached independently to pos2.  A missing .n means .0; a
       missing	-pos2  means the end of the line.  Under the -tx option, fields are strings separated by x; otherwise fields are nonempty nonblank
       strings separated by blanks.
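
       The skip-count convention can be easier to see in code. The Python sketch below is not how sort is implemented; it only mimics
       a key of the form +2n under a -t: separator: skip two fields, then compare by the leading numeric string of what remains. The
       helper names and the sample lines are made up for illustration.

```python
import re

# Optional blanks, optional minus sign, digits with an optional decimal point.
_LEADING_NUMBER = re.compile(r"\s*(-?\d*\.?\d*)")


def key_after_skipping(line, skip, sep=":"):
    """Return the part of the line that remains after skipping `skip` fields."""
    return sep.join(line.split(sep)[skip:])


def leading_number(text):
    """Arithmetic value of the initial numeric string; 0 if there is none."""
    candidate = _LEADING_NUMBER.match(text).group(1)
    try:
        return float(candidate)
    except ValueError:  # empty string, lone "-" or lone "."
        return 0.0


# Made-up passwd-style lines; the third colon-separated field is the user id.
lines = [
    "alice:x:1000:1000::/home/alice:/bin/sh",
    "root:x:0:0::/root:/bin/sh",
    "daemon:x:2:2::/:/bin/false",
]

# Comparable in spirit to a +2n key with -t: as the field separator.
ordered = sorted(lines, key=lambda l: leading_number(key_after_skipping(l, 2)))
for line in ordered:
    print(line)
```

       Because m counts fields to skip, +2 selects a key that starts at the third field, which is why the password-file example
       further down describes +2n as sorting on the 3rd colon-separated field.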

       When there are multiple sort keys, later keys are compared only after all earlier keys compare equal.  Lines that otherwise  compare  equal
       are ordered with all bytes significant.

       These are additional options:

       -c	   Checks sorting order and displays output only if out of order.

       -m	   Merges previously sorted data.

       -o name	   Uses specified file as output file.	This file may be the same as one of the inputs.

       -T dir	   Uses specified directory to build temporary files.

       -u	   Suppresses all duplicate entries.  Ignored bytes and bytes outside keys do not participate in this comparison.

Examples
       Print in alphabetical order all the unique spellings in a list of words.  Capitalized words differ from uncapitalized.
	       sort -u +0f +0 list

       Print the password file, sorted by user id number (the 3rd colon-separated field).
	       sort -t: +2n /etc/passwd

       Print the first instance of each month in an already sorted file of (month day) entries.  The options -um with just one input file make the
       choice of a unique representative from a set of equal lines predictable.
	       sort -um +0 -1 dates

Restrictions
       Very long lines are silently truncated.

Diagnostics
       Comments and exits with nonzero status for various trouble conditions and for disorder discovered under option c.

Files
       /usr/tmp/stm*, /tmp/*	first and second tries for temporary files

See Also
       comm(1), join(1), rev(1), uniq(1)

																	   sort(1)