Sponsored Content
Top Forums Shell Programming and Scripting Assigning the names from overlapping regions Post 302820869 by anurupa777 on Thursday 13th of June 2013 02:52:56 PM
Old 06-13-2013
Assigning the names from overlapping regions

I have 2 files; file 1 having smaller positions that overlap with the positions with positions in file2.
file1
Code:
aaa 20 22 apple
aaa 18 25 banana
aaa 12 30 grapes
aaa 22 25 melon

file2
Code:
aaa 18 26 cdded
aaa 10 35 abcde

I want to get something like this
output
Code:
aaa 18 26 cdded banana melon  
aaa 10 35 abcde apple  banana grapes melon

 

9 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

awk: union regions

Hi all, I have difficulty to solve the followign problem. mydata: StartPoint EndPoint 22 55 2222 2230 33 66 44 58 222 240 11 25 22 60 33 45 The union of above... (2 Replies)
Discussion started by: phoeberunner
2 Replies

2. UNIX for Dummies Questions & Answers

finding overlapping names in different txt files

Dear Gurus, I have 57 tab-delimited different text files, each one containing entries in 3 columns. The first column in each file contains names of objects. Some names are present in more than one file. I would like to find those names and store them in a separate text file, preferably with a... (6 Replies)
Discussion started by: Unilearn
6 Replies

3. UNIX for Dummies Questions & Answers

extract regions of file based on start and end position

Hi, I have a file1 of many long sequences, each preceded by a unique header line. file2 is 3-columns list: headers name, start position, end position. I'd like to extract the sequence region of file1 specified in file2. Based on a post elsewhere, I found the code: awk... (2 Replies)
Discussion started by: pathunkathunk
2 Replies

4. Forum Support Area for Unregistered Users & Account Problems

Trouble Registering? Countries or Regions Abusing Forums

The forums have been seeing a sharp increase in spam bots, forum robots, and malicious registrations from certain countries. If you have been directed to this thread due to a "No Permission Error" when trying to register please post in this thread and request permission to register, including... (1 Reply)
Discussion started by: Neo
1 Replies

5. Shell Programming and Scripting

Obtain the names of the flanking regions

Hi I have 2 files; usually the end position in the file1 is the start position in the file2 and the end position in file2 will be the start position in file1 (flanks) file1 Id start end aaa1 0 3000070 aaa1 3095270 3095341 aaa1 3100822 3100894 aaa1 ... (1 Reply)
Discussion started by: anurupa777
1 Replies

6. Shell Programming and Scripting

Identify the overlapping and non overlapping regions

file1 chr pos1 pos2 pos3 pos4 1)chr1 1000 2000 3000 4000 2)chr1 1380 1480 6800 7800 3)chr1 6700 7700 1200 2200 4)chr2 8500 9500 5670 6670 file2 chr pos1 pos2 pos3 pos4 1)chr2 8500 9500 5000 6000 2)chr1 6700 7700 1200 2200 3)chr1 1380 1480 6700 7700 4)chr1 1000 2000 4900 5900 I... (2 Replies)
Discussion started by: data_miner
2 Replies

7. Shell Programming and Scripting

Extraction of upstream and downstream regions from long sequence file

Hello, here I am posting my query again with modified data input files. see my query is : i have two input files file1 and file2. file1 is smalldata.fasta >gi|546671471|gb|AWWX01449637.1| Bubalus bubalis breed Mediterranean WGS:AWWX01:contig449636, whole genome shotgun sequence... (20 Replies)
Discussion started by: harpreetmanku04
20 Replies

8. UNIX for Dummies Questions & Answers

Retrieving names of files in a dir without overlapping

Hi, I have been trying to retrieve the names of files present in a directory one by one but the names of files are getting overlapped on one another. I tried the below command. ls -1 > filename please help me in getting the file names line by line without overlapping. I am using korn... (6 Replies)
Discussion started by: Pradhikshan
6 Replies

9. Shell Programming and Scripting

Extract Big and continuous regions

Hi all, I have a file like this I want to extract only those regions which are big and continous chr1 3280000 3440000 chr1 3440000 3920000 chr1 3600000 3920000 # region coming within the 3440000 3920000. so i don't want it to be printed in output chr1 3920000 4800000 chr1 ... (2 Replies)
Discussion started by: amrutha_sastry
2 Replies
cut(1)							      General Commands Manual							    cut(1)

Name
       cut - cut out selected fields of each line of a file

Syntax
       cut -clist [file1 file2...]
       cut -flist [-dchar] [-s] [file1 file2...]

Description
       Use  the  command to cut out columns from a table or fields from each line of a file.  The fields as specified by list can be fixed length,
       that is, character positions as on a punched card (-c option), or the length can vary from line to line and be marked with a  field  delim-
       iter character like tab (-f option).  The command can be used as a filter.  If no files are given, the standard input is used.

       Use to make horizontal ``cuts'' (by context) through a file, or to put files together in columns.  To reorder columns in a table, use and

Options
       list	   Specifies  ranges  that must be a comma-separated list of integer field numbers in increasing order.  With optional - indicates
		   ranges as in the -o option of nroff/troff for page ranges; for example, 1,4,7; 1-3,8; -5,10 (short for 1-5,10);  or	3-  (short
		   for third through last field).

       -clist	   Specifies character positions to be cut out.  For example, -c1-72 would pass the first 72 characters of each line.

       -flist	   Specifies  the  fields  to be cut out.  For example, -f1,7 copies the first and seventh field only.	Lines with no field delim-
		   iters are passed through intact (useful for table subheadings), unless -s is specified.

       -dchar	   Uses the specified character as the field delimiter.  Default is tab.  Space or other characters with special  meaning  to  the
		   shell must be quoted.  The -d option is used only in combination with the -f option, according to XPG3 and SVID2/SVID3.

       -s	   Suppresses  lines  with  no	delimiter  characters.	 Unless  specified, lines with no delimiters are passed through untouched.
		   Either the -c or -f option must be specified.

Examples
       Mapping of user IDs to names:
       cut -d: -f1,5 /etc/passwd
       To set name to the current login name for the csh shell:
       set name=`who am i | cut -f1 -d" "`
       To set name to the current login name for the sh, sh5, and ksh shells:
       name=`who am i | cut -f1 -d" "`

Diagnostics
       "line too long"	   A line can have no more than 511 characters or fields.

       "bad list for c/f option"
			   Missing -c or -f option or incorrectly specified list.  No error occurs if a line has fewer fields than the list  calls
			   for.

       "no fields"	   The list is empty.

See Also
       grep(1), paste(1)

																	    cut(1)
All times are GMT -4. The time now is 08:05 PM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy