help! script to select line with greatest value 2 between columns
Post 302566702 by wolf_blue on Thursday 20th of October 2011 11:59:10 PM
Old 10-21-2011
not working

I'm still getting a file with duplicate genes.

When I run this code:
Code:
nawk 'NR<2{next}{c=($NF-$(NF-1))}!($1 in A)||(c>m[$1]&&($1 in A)){m[$1]=c;A[$1]=$0 FS m[$1]}END{for(i in A) print A[i]}' yourfile

on a file like this:
Code:
gene  accession  chr    chr_st  begin  end    length
NN1   NC_024540  chr3   -       14000  14020  20
NN1   NC_024543  chr3   -       14050  14060  10
ATG   NC_01      chr12  +       12000  12100  100
ATG   NC_02      chr12  +       12100  12300  200

the output still contains each gene more than once.
I only want one line per gene, the one with the greatest length, written to the output file.
I'm really grateful for your help.
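
For reference, here is a minimal sketch of one way to get that result. It is not taken from the thread's replies. It assumes the file has exactly the seven columns shown above, so the last field ($NF) is already the length. In that layout the expression $NF-$(NF-1) in the one-liner evaluates to length minus end, not end minus begin, so this version compares $NF directly. It also assumes the gene names in column 1 match exactly, with no stray characters. The file names genes.txt and longest.txt and the array names best and max are placeholders. It uses awk; nawk should accept the same program.

```shell
# Sample input in the same seven-column layout as the post (header plus four records).
cat > genes.txt <<'EOF'
gene  accession  chr    chr_st  begin  end    length
NN1   NC_024540  chr3   -       14000  14020  20
NN1   NC_024543  chr3   -       14050  14060  10
ATG   NC_01      chr12  +       12000  12100  100
ATG   NC_02      chr12  +       12100  12300  200
EOF

# For each gene name in column 1, keep the line whose last column (length) is largest.
awk '
NR == 1 { next }                       # skip the header line
!($1 in best) || $NF + 0 > max[$1] {   # first line for this gene, or longer than the stored one
    max[$1]  = $NF + 0                 # "+ 0" forces a numeric comparison
    best[$1] = $0                      # remember the whole line
}
END { for (g in best) print best[g] }  # for-in order is unspecified, hence the sort below
' genes.txt | sort > longest.txt

cat longest.txt
```

If two lines for the same gene have equal length, this sketch keeps the first one it reads.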

Last edited by Franklin52; 10-21-2011 at 09:21 AM.. Reason: Please use code tags, thank you
 
