UNIX for Beginners Questions & Answers
Post 303042670 by cryptodice on Friday 3rd of January 2020 04:22:49 AM
Cannot subset ranges from another range set

Code:
Ca21chr2_C_albicans_SC5314	2159343	2228327	Ca21chr2_C_albicans_SC5314	636587	638608
Ca21chr2_C_albicans_SC5314	5286	50509	Ca21chr2_C_albicans_SC5314	634021	636276
Ca21chr2_C_albicans_SC5314	1886545	1900975	Ca21chr2_C_albicans_SC5314	610758	613544
Ca21chr2_C_albicans_SC5314	1919115	1930649	Ca21chr2_C_albicans_SC5314	606248	608308
Ca21chr2_C_albicans_SC5314	590278	603163	Ca21chr2_C_albicans_SC5314	1554724	1556511
Ca21chr2_C_albicans_SC5314	267403	279993	Ca21chr2_C_albicans_SC5314	1547799	1548998
Ca21chr2_C_albicans_SC5314	1611869	1622753	Ca21chr2_C_albicans_SC5314	1519257	1520960
Ca21chr2_C_albicans_SC5314	1479229	1490747	Ca21chr2_C_albicans_SC5314	1514712	1516178
Ca21chr2_C_albicans_SC5314	157814	166956	Ca21chr2_C_albicans_SC5314	897896	900774
Ca21chr2_C_albicans_SC5314	2148223	2149627	Ca21chr2_C_albicans_SC5314	890821	892818
Ca21chr2_C_albicans_SC5314	1041578	1051493	Ca21chr2_C_albicans_SC5314	588237	589598
Ca21chr2_C_albicans_SC5314	736894	745664	Ca21chr2_C_albicans_SC5314	557079	558713
Ca21chr2_C_albicans_SC5314	618550	627903	Ca21chr2_C_albicans_SC5314	7510	8043
Ca21chr2_C_albicans_SC5314	1116919	1125425	Ca21chr2_C_albicans_SC5314	922654	924717
Ca21chr2_C_albicans_SC5314	1262940	1271939	Ca21chr2_C_albicans_SC5314	1778986	1779687
Ca21chr2_C_albicans_SC5314	288630	296284	Ca21chr2_C_albicans_SC5314	795730	798201
Ca21chr2_C_albicans_SC5314	1250513	1258731	Ca21chr2_C_albicans_SC5314	766651	768309
Ca21chr2_C_albicans_SC5314	1499806	1508334	Ca21chr2_C_albicans_SC5314	763501	765159
Ca21chr2_C_albicans_SC5314	98269	105803	Ca21chr2_C_albicans_SC5314	758203	758733
Ca21chr2_C_albicans_SC5314	1604362	1611315	Ca21chr2_C_albicans_SC5314	700893	702539

This is a snippet of my data. For each row, I want to find out whether the range defined by columns 5 and 6 is a subset of any range defined by columns 2 and 3. The ranges in columns 2 and 3 are longer than the ranges in columns 5 and 6. A script has to scan through all of the column 2 and 3 ranges for every range defined by columns 5 and 6. How do I do this? Is there an awk script for it? I am sorry if I did not follow the forum's rules; this is my first time using it.

Last edited by vbe; 01-03-2020 at 05:25 AM. Reason: code tags - not quotes please
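
One way to approach the question above is a two-pass scan: first collect every column 2 and 3 range, then test each column 5 and 6 range against all of them. The sketch below is in Python rather than awk, and it rests on assumptions that are not stated in the post: fields are whitespace-separated, range ends are inclusive, a range only counts as contained when the sequence names in columns 1 and 4 match, and start is never greater than end. The function name `find_contained` and the `SAMPLE` list are made up for illustration; the three sample rows are copied from the data above.

```python
from collections import defaultdict

# Three rows copied from the posted data (tab-separated).
SAMPLE = [
    "Ca21chr2_C_albicans_SC5314\t2159343\t2228327\tCa21chr2_C_albicans_SC5314\t636587\t638608",
    "Ca21chr2_C_albicans_SC5314\t5286\t50509\tCa21chr2_C_albicans_SC5314\t634021\t636276",
    "Ca21chr2_C_albicans_SC5314\t618550\t627903\tCa21chr2_C_albicans_SC5314\t7510\t8043",
]


def find_contained(lines):
    """For each column 5-6 range, list the column 2-3 ranges that contain it.

    Returns one (name, start, end, hits) tuple per input row, in input order,
    where hits is a list of (start, end) outer ranges. Assumes inclusive ends.
    """
    outer = defaultdict(list)  # sequence name -> list of (start, end)
    inner = []                 # (sequence name, start, end), in input order

    # Pass 1: read every row once and store both sets of ranges.
    for line in lines:
        fields = line.split()
        if len(fields) < 6:
            continue  # skip blank or malformed lines
        outer[fields[0]].append((int(fields[1]), int(fields[2])))
        inner.append((fields[3], int(fields[4]), int(fields[5])))

    # Pass 2: compare each inner range against all outer ranges
    # that share the same sequence name.
    results = []
    for name, start, end in inner:
        hits = [(s, e) for s, e in outer.get(name, []) if s <= start and end <= e]
        results.append((name, start, end, hits))
    return results


for name, start, end, hits in find_contained(SAMPLE):
    if hits:
        where = ",".join(f"{s}-{e}" for s, e in hits)
    else:
        where = "not contained"
    print(name, start, end, where, sep="\t")
```

With these three rows, only 7510-8043 falls inside a column 2 and 3 range (5286-50509); the other two are reported as not contained. The comparison is quadratic in the number of rows, which is fine for a few thousand lines. For much larger files, sorting the outer ranges or using a dedicated interval tool would likely be a better fit.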
 

Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.