Matching position and output neighbors within 500 distant
Post 302884169 by fat on Saturday 18th of January 2014 03:12:38 AM
Quote:
Originally Posted by migurus
awk based solution
Code:
 
$ cat test.sh
awk '
{
        if (NR == FNR)  ### reading 1st file
                        ### accumulate 2nd column values in array of key values
        {
                arr[NR] = $2;
        }
        else            ### reading 2nd file and check for value of 2nd column
                        ### to be within +/- 500 of any accumulated key values
        {
                maxval = $2 + 500;
                minval = $2 - 500;
                for ( x in arr )
                {
                        if(minval <= arr[x] && arr[x] <= maxval)
                                print $0;
                }
        }
}
' "$1" "$2"

Here is how I ran it
Code:
 
$ cat a
1       11567687        snpid20
1       153881  snpid1
2       56768799        snpid7
3       3156760 snpid4
3       1567687 snpid7
$ cat b
1       11567600        snpid20
3       1000000 snpid7
$ test.sh b a
1       11567687        snpid20

Hope this is what you were looking for.

Thanks
This almost works, except that it outputs every row within +/- 500 of any key, regardless of column 1. For example, when searching for +/- 500 of the key 11567687 in "1 11567687 snpid20", it should output only the rows from the second file that are within +/- 500 and have 1 in column 1. When searching for +/- 500 of the key 3156760 in "3 3156760 snpid4", it should output only the rows from the second file that are within +/- 500 and have 3 in column 1.
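
A minimal sketch of how the column 1 check described above might be added to the quoted script. The function name `match_neighbors`, the `range` variable, and the choice to print a row only once when several keys are in range are assumptions made for illustration, not part of the original script. The argument order follows the quoted `test.sh`: key file first, then the file to search.

```shell
#!/bin/sh
# Sketch: print rows of the second file whose column 2 lies within +/- range
# of a key from the first file AND whose column 1 equals that key's column 1.
# match_neighbors is a hypothetical name; usage: match_neighbors keys_file data_file
match_neighbors() {
    awk -v range=500 '
    NR == FNR {                     # 1st file: store key positions grouped by column 1
        n[$1]++
        key[$1, n[$1]] = $2
        next
    }
    {                               # 2nd file: compare only against keys sharing column 1
        for (i = 1; i <= n[$1]; i++) {
            d = $2 - key[$1, i]
            if (d < 0) d = -d       # absolute distance between the two positions
            if (d <= range) {
                print
                break               # assumption: report each row once, not once per key
            }
        }
    }' "$1" "$2"
}
```

With the sample files from the quoted post, `match_neighbors b a` prints the row `1 11567687 snpid20`, the same as before, because that key and that row both have 1 in column 1. A row at a nearby position but with a different column 1 would no longer be printed. Like the quoted script, this relies on `NR == FNR`, so it assumes the first file is not empty.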
 
