Post 302556575 by underscore on Monday 19th of September 2011 06:18:29 AM
How to match 2 columns where one column has data as a range - extended

Dear all,

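An earlier thread has a nice solution for merging two text files where the second file gives each key a single numeric range (sorry, I cannot post the URL, and that thread is closed). The real world, however, is more complicated than that earlier example: a key can appear several times in the second file, each time with a different range.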

file1
Code:
A 1
A 2
A 3
B 1
B 2
B 3
B 4
C 1
C 2
C 3
C 4

file2
Code:
A Gene1 1 2
A Gene2 3 4
A Gene3 5 6
B Gene4 1 2
C Gene5 3 4

output file required
Code:
A 1 Gene1
A 2 Gene1
A 3 Gene2
B 1 Gene4
B 2 Gene4
B 3 -
B 4 -
C 1 -
C 2 -
C 3 Gene5
C 4 Gene5

The earlier code

Code:
awk 'NR==FNR{a[$1]=$2;b[$1]=$3;c[$1]=$4;next}
a[$1] && $2 >= b[$1] && $2 <= c[$1]{print $0 FS a[$1];next}
{print $0 FS " -"}' file2 file1

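doesn't handle a key that occurs more than once in file2, such as A: each new file2 line overwrites the stored values for its key, so only the last range per key is kept. Can you help with an update?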

Last edited by underscore; 09-19-2011 at 07:20 AM.. Reason: mixed code and quote
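
One possible way to handle repeated keys, offered as a minimal sketch rather than a definitive answer: keep every range for a key, indexed by the key plus a per-key counter, and scan that key's ranges for each line of file1. The function name merge_ranges and the array names are made up for illustration. The sketch assumes whitespace-separated fields, numeric range bounds, a non-empty file2, and that the first matching range wins if ranges overlap.

```shell
# Hypothetical helper: merge_ranges RANGE_FILE POSITION_FILE
# RANGE_FILE    lines look like: key name start end   (file2 above)
# POSITION_FILE lines look like: key position         (file1 above)
merge_ranges() {
  awk '
    NR == FNR {                      # first file: remember every range per key
      n[$1]++                        # how many ranges this key has so far
      name[$1, n[$1]] = $2
      lo[$1, n[$1]]   = $3 + 0       # +0 forces numeric comparison later
      hi[$1, n[$1]]   = $4 + 0
      next
    }
    {                                # second file: look for a containing range
      hit = "-"                      # default when no range matches
      for (i = 1; i <= n[$1]; i++)
        if ($2 + 0 >= lo[$1, i] && $2 + 0 <= hi[$1, i]) {
          hit = name[$1, i]          # first matching range wins
          break
        }
      print $0, hit
    }
  ' "$1" "$2"
}
```

Called as merge_ranges file2 file1 on the sample files above, this should print the requested output, including the "-" lines for B 3, B 4, C 1 and C 2. The range file goes first because it has to be loaded completely before any position can be looked up.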
 
