UNIX command to select the best edge values from a network file


 
# 1  
Old 03-10-2020

I have tab-delimited data representing an undirected network. Among the duplicated edges, I want to keep the edge with the highest absolute log value.
I have written code in Python, but it is taking a lot of time. I would be grateful if someone could help me with an awk command. Kindly note that the network is undirected, i.e. A--B and B--A are duplicate edges. My original file has a large number of columns; I have given simplified test data below.

Test data



Code:
     Gene1    Gene2    Log
    AT1G01020    AT1G01010    1.682708
    AT1G01020    AT1G01010    -1.90043
    AT1G01020    AT1G01010    -1.832192
    AT1G01070    AT1G01060    -0.591932
    AT1G01070    AT1G01060    -1.204241
    AT1G01073    AT1G01070    0.790549
    AT1G01060    AT1G01070    1.214972

Expected Output

Code:
    AT1G01020    AT1G01010    -1.90043
    AT1G01070    AT1G01060    1.214972
    AT1G01073    AT1G01070    0.790549

Code:
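# file1 is the already-opened input file.
# In the real file, j[12]=Gene1, j[13]=Gene2 and j[27]=log value; j[0] is assumed to be a unique row ID.
gene_table = file1.readlines()
for j in gene_table:
    j = j.split("\t")
    j[12] = j[12].strip()
    j[13] = j[13].strip()
    lfc = [float(j[27])]   # log values of this edge and all its duplicates
    int_list = [j[0]]      # row IDs, in the same order as lfc
    for k in gene_table:   # nested scan over the whole file: this is what makes it slow
        k = k.split("\t")
        k[12] = k[12].strip()
        k[13] = k[13].strip()
        # same edge in either orientation (A--B or B--A), excluding the row itself
        if (j[0] != k[0]) and ((j[12] == k[12] and j[13] == k[13]) or (j[12] == k[13] and j[13] == k[12])):
            lfc.append(float(k[27]))
            int_list.append(k[0])
    dict_int = dict(zip(int_list, lfc))
    x = max(lfc, key=abs)  # value with the largest absolute value
    listOfKeys = [key for (key, value) in dict_int.items() if value == x]
    print(listOfKeys)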


Last edited by Scrutinizer; 03-11-2020 at 01:29 AM..
# 2  
Old 03-11-2020
Hi, @Sanchari
Could you check whether there is an error in your expected output?
Quote:
Originally Posted by Sanchari
Test data
Code:
     Gene1    Gene2    Log
    AT1G01020    AT1G01010    1.682708
    AT1G01020    AT1G01010    -1.90043
    AT1G01020    AT1G01010    -1.832192
    AT1G01070    AT1G01060    -0.591932
    AT1G01070    AT1G01060    -1.204241
    AT1G01073    AT1G01070    0.790549
    AT1G01060    AT1G01070    1.214972

Expected Output

Code:
    AT1G01020    AT1G01010    -1.90043
    AT1G01070    AT1G01060    1.214972
    AT1G01073    AT1G01070    0.790549

If edges that occur only once should also be displayed,
then the result should be
Code:
AT1G01070 AT1G01060 -1.204241
AT1G01060 AT1G01070 1.214972
AT1G01020 AT1G01010 -1.90043
AT1G01073 AT1G01070 0.790549

and if not:
Code:
AT1G01070 AT1G01060 -1.204241
AT1G01020 AT1G01010 -1.90043

Would a solution using awk suit you?

--- Post updated at 17:40 ---

Code:
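# Only edges that occur more than once (GNU uniq: -D prints all adjacent
# duplicate lines, -w 26 compares only the first 26 characters of each line):
uniq -Dw 26 file |
awk '
# no header to skip here: uniq -D has already dropped the unique header line
{if(abs(A[$1 FS $2]) < abs($3)) A[$1 FS $2] = $3}
END {for(i in A) print i, A[i]}
function abs(x) { return (x<0) ? x*-1 : x }'

# All edges, including those that occur only once:
awk '
NR==1 {next}
{if(abs(A[$1 FS $2]) < abs($3)) A[$1 FS $2] = $3}
END {for(i in A) print i, A[i]}
function abs(x) { return (x<0) ? x*-1 : x }' file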
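
A note on the key used above: `$1 FS $2` depends on column order, so A--B and B--A end up in separate groups, which is why these results list four edges where the expected output has three. The following Python sketch is only an illustration of that difference on the test data; the function name `strongest` and the `undirected` flag are made up for this example.

```python
# Test data from the first post (header omitted).
ROWS = [
    ("AT1G01020", "AT1G01010", 1.682708),
    ("AT1G01020", "AT1G01010", -1.90043),
    ("AT1G01020", "AT1G01010", -1.832192),
    ("AT1G01070", "AT1G01060", -0.591932),
    ("AT1G01070", "AT1G01060", -1.204241),
    ("AT1G01073", "AT1G01070", 0.790549),
    ("AT1G01060", "AT1G01070", 1.214972),
]


def strongest(rows, undirected):
    """Keep, per edge key, the value with the largest absolute value."""
    best = {}
    for a, b, value in rows:
        # Sorting the pair makes A--B and B--A share one key.
        key = tuple(sorted((a, b))) if undirected else (a, b)
        if key not in best or abs(value) > abs(best[key]):
            best[key] = value
    return best


directed = strongest(ROWS, undirected=False)
merged = strongest(ROWS, undirected=True)
print(len(directed), len(merged))  # 4 3
```

With the order-dependent key, AT1G01070/AT1G01060 keeps -1.204241 and AT1G01060/AT1G01070 keeps 1.214972 as separate entries; with the sorted key they merge and 1.214972 wins.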

# 3  
Old 03-11-2020
How about this (a bit verbose):
awk -f san.awk myInputFile, where san.awk is:
Code:
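BEGIN {
  FS=OFS="\t"
  i1=1   # column of the first gene
  i2=2   # column of the second gene
  v=3    # column of the log value
}
function abs(x)    { return x < 0 ? -x : x }

FNR>1 {   # skip the header line
   # order the pair so that A--B and B--A produce the same key
   idx=($i1 > $i2)? $i1 OFS $i2 : $i2 OFS $i1
   if (abs(a[idx])<abs($v))   # keep the value with the larger absolute value
      a[idx]=$v
}
END {
  for (i in a)
    print i,a[i]
}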

results in:
Code:
AT1G01070       AT1G01060       1.214972
AT1G01020       AT1G01010       -1.90043
AT1G01073       AT1G01070       0.790549
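
Since the original attempt was in Python, the same single-pass idea can be sketched there too. This is only a rough equivalent under some assumptions: the function name `best_edges` and its parameters are invented for this example, the input is tab-delimited with one header line, and the column indices default to the test data (for the real file they would be 12, 13 and 27 according to the first post). Unlike the awk script, it keeps the whole winning row, so the gene order of that row is printed as it appears in the input.

```python
import csv

SAMPLE = (
    "Gene1\tGene2\tLog\n"
    "AT1G01020\tAT1G01010\t1.682708\n"
    "AT1G01020\tAT1G01010\t-1.90043\n"
    "AT1G01020\tAT1G01010\t-1.832192\n"
    "AT1G01070\tAT1G01060\t-0.591932\n"
    "AT1G01070\tAT1G01060\t-1.204241\n"
    "AT1G01073\tAT1G01070\t0.790549\n"
    "AT1G01060\tAT1G01070\t1.214972\n"
)


def best_edges(lines, col_a=0, col_b=1, col_val=2, has_header=True):
    """Return, for each undirected edge, the row with the largest |value|."""
    reader = csv.reader(lines, delimiter="\t")
    if has_header:
        next(reader, None)  # skip the header row
    best = {}  # unordered gene pair -> winning row
    for row in reader:
        if len(row) <= max(col_a, col_b, col_val):
            continue  # skip blank or short rows
        # frozenset ignores order, so A--B and B--A share one key
        key = frozenset((row[col_a].strip(), row[col_b].strip()))
        kept = best.get(key)
        if kept is None or abs(float(row[col_val])) > abs(float(kept[col_val])):
            best[key] = row
    return list(best.values())


for row in best_edges(SAMPLE.splitlines()):
    print("\t".join(row))
```

Each line is read once and looked up in a dictionary, so the run time grows linearly with the number of rows instead of quadratically as with the nested loops. For a real file, something like `best_edges(open("myInputFile"), 12, 13, 27)` should work, with the file name taken from the awk call above.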


Last edited by vgersh99; 03-11-2020 at 12:02 PM..