Post 303045006 by Sanchari on Tuesday 10th of March 2020 06:16:47 PM
UNIX command to select the best edge values from a network file

I have tab-delimited data representing an undirected network. Among the duplicated edges, I want to keep only the edge with the highest absolute log value.
I have written code in Python, but it is taking a lot of time. I would be grateful if someone could help me with an awk command. Kindly note that the network is undirected, i.e. A--B and B--A are duplicate edges. My original file has a large number of columns; I have given simplified test data below.

Test data



Code:
     Gene1    Gene2    Log
    AT1G01020    AT1G01010    1.682708
    AT1G01020    AT1G01010    -1.90043
    AT1G01020    AT1G01010    -1.832192
    AT1G01070    AT1G01060    -0.591932
    AT1G01070    AT1G01060    -1.204241
    AT1G01073    AT1G01070    0.790549
    AT1G01060    AT1G01070    1.214972

Expected Output

Code:
    AT1G01020    AT1G01010    -1.90043
    AT1G01070    AT1G01060    1.214972
    AT1G01073    AT1G01070    0.790549

Code:
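# file1 is an open handle to the tab-delimited input file.
# In the real file, j[12]=Gene1, j[13]=Gene2 and j[27]=log value; j[0] is assumed to be a unique row ID.
gene_table=file1.readlines()
for j in gene_table:
    j=j.split("\t")
    j[12]=j[12].strip()
    j[13]=j[13].strip()
    # Log values for this edge: this row's value first, then every duplicate's
    lfc=[float(j[27])]
    int_list=[j[0]]
    # Scan all rows again for duplicates of this edge (quadratic in the number of rows)
    for k in gene_table:
        k=k.split("\t")
        k[12]=k[12].strip()
        k[13]=k[13].strip()
        # Duplicate edge in either orientation: A--B or B--A
        if (j[0]!=k[0]) and ((j[12]==k[12] and j[13]==k[13]) or (j[12]==k[13] and j[13]==k[12])):
            lfc.append(float(k[27]))
    # zip stops at the shorter list, so this maps only the current row ID to its own log value
    dict_int=dict(zip(int_list, lfc))
    x=max(lfc, key=abs)
    # Non-empty only when the current row holds the largest absolute log value
    listOfKeys = [key for (key, value) in dict_int.items() if value == x]
    print(listOfKeys)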


Last edited by Scrutinizer; 03-11-2020 at 12:29 AM..
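
The code above is slow because it rescans the whole file for every row. A single pass with a dictionary keyed on the sorted gene pair should avoid that. Below is a minimal sketch in Python (not the awk command asked for), assuming the file fits in memory and that the first-seen orientation of each edge is the one to report, as the expected output suggests. The function name `best_edges`, its column parameters and the `has_header` flag are hypothetical names chosen for illustration.

```python
import csv

def best_edges(lines, gene1_col=0, gene2_col=1, log_col=2, has_header=True):
    """For each undirected edge, keep the row with the largest absolute log value."""
    reader = csv.reader(lines, delimiter="\t")
    if has_header:
        next(reader, None)  # skip the header row
    needed = max(gene1_col, gene2_col, log_col)
    best = {}  # sorted gene pair -> [gene1, gene2, log text, log value]
    for row in reader:
        if len(row) <= needed:
            continue  # skip blank or short lines
        a, b = row[gene1_col].strip(), row[gene2_col].strip()
        text = row[log_col].strip()
        value = float(text)
        key = tuple(sorted((a, b)))  # A--B and B--A share one key
        if key not in best:
            best[key] = [a, b, text, value]  # keep the first-seen orientation
        elif abs(value) > abs(best[key][3]):
            best[key][2:] = [text, value]  # larger |log|: replace the stored value
    # Dictionaries preserve insertion order (Python 3.7+), so edges come out
    # in the order they were first seen.
    return [(a, b, text) for a, b, text, _ in best.values()]

# The test data from the post, with tabs as separators
sample = [
    "Gene1\tGene2\tLog\n",
    "AT1G01020\tAT1G01010\t1.682708\n",
    "AT1G01020\tAT1G01010\t-1.90043\n",
    "AT1G01020\tAT1G01010\t-1.832192\n",
    "AT1G01070\tAT1G01060\t-0.591932\n",
    "AT1G01070\tAT1G01060\t-1.204241\n",
    "AT1G01073\tAT1G01070\t0.790549\n",
    "AT1G01060\tAT1G01070\t1.214972\n",
]

for edge in best_edges(sample):
    print("\t".join(edge))
```

On the test data this prints the three rows of the expected output. For the real file, the function can be given an open file handle with `gene1_col=12, gene2_col=13, log_col=27`; whether a header row is present there is not stated in the post, so `has_header` would need to be set accordingly.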
 

mcx erdos(1)							  USER COMMANDS 						      mcx erdos(1)

  NAME
      mcx erdos - compute shortest paths in a graph

  SYNOPSIS
      mcx erdos [options]

      mcxerdos	is  not  in  actual  fact  a program. This manual page documents the behaviour and options of the mcx program when invoked in mode
      erdos. The options -h, --apropos, --version, -set, --nop, -progress <num> are accessible in all mcx modes. They are  described  in  the  mcx
      manual page.

      mcx erdos [-query <fname> (query input stream)] [-abc <fname> (specify label input)] [-imx <fname> (specify matrix input)] [-tab <fname>
      (use tab file)] [-o <fname> (output file name)] [--is-directed (input graph is directed)] [--is-undirected (input graph is undirected)]
      [-write-path <fname> (path matrix file)] [-write-step <fname> (step matrix file)] [-h (print synopsis, exit)] [--apropos (print synopsis,
      exit)] [--version (print version, exit)]

  DESCRIPTION
      mcx erdos computes shortest paths in graphs.  It can read a graph either in label format with -abc or in native format with -imx.  It  reads
      pairs of node indices from an input stream, and for each pair outputs a data structure describing the full set of shortest paths between the
      two nodes.  Edge weights are not taken into account, so an edge always represents a unit step size between two  nodes  irrespective  of  its
      weight. A mode to compute shortest paths while taking into account edge weights will be implemented later as mcx dijkstra.

      Note that the full set of shortest paths between two nodes in a graph can be described as a directed acyclic graph (DAG), and this is how
      mcx erdos operates. It is easy to construct graphs and node pairs for which the number of shortest paths between the two nodes becomes
      exponential in the size of the graph, whereas the lattice description is always guaranteed to map to a subset of the graph edge set.
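
      The idea of describing all shortest paths as a DAG can be sketched in Python as below. This is an illustration only and not
      the mcx implementation: it assumes an undirected graph given as a symmetric adjacency dictionary, uses unit step sizes, and
      the function name shortest_path_dag is hypothetical.

```python
from collections import deque

def shortest_path_dag(adj, source, target):
    """Return the directed edges (u, v) that lie on some shortest source->target path."""
    # Forward breadth-first search: unit-step distance of every reachable node from source
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    if target not in dist:
        return set()  # target is unreachable, so there is no shortest path
    # Walk back from target, keeping only edges that step one level closer to source
    edges = set()
    seen = {target}
    stack = [target]
    while stack:
        v = stack.pop()
        for u in adj.get(v, ()):  # undirected: neighbours of v are its possible predecessors
            if dist.get(u) == dist[v] - 1:
                edges.add((u, v))
                if u not in seen:
                    seen.add(u)
                    stack.append(u)
    return edges

# Two shortest paths from A to E: A-B-D-E and A-C-D-E
adj = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
}
print(sorted(shortest_path_dag(adj, "A", "E")))
```

      Every edge returned is an edge of the input graph, so the description stays within the graph edge set even when the number of
      individual shortest paths is very large.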

      By  default  it is assumed that the input graph should be treated as undirected. To this end a transformation step is applied to ensure that
      the graph in memory is undirected. It is possible to compute shortest paths in directed graphs by using --is-directed, and it is possible to
      omit  the  transformation  step  by using --is-undirected.  If the latter is specified while the input graph is in native format and in fact
      directed, results will be erroneous. This could in theory be mitigated by checking that the input graph is undirected. However,  the  reason
      to use --is-undirected is simply to increase speed of operation, whereas such a check would be as expensive as the transformation step
      that is omitted with --is-undirected.

      The input graph/matrix, if specified with the -imx option, has to be in mcl matrix/graph format. You can use label input	instead  by  using
      the  -abc option.  Refer to mcxio(5) for a description of these two input formats.  By default mcx erdos reads from STDIN and expects matrix
      format.  To specify label input from STDIN use -abc -.

  OPTIONS
      -query <fname> (query input)
	The name for the file from which queries are read.  A query consists of two white-space separated node indices or two white-space
	separated labels. Labels can only be used if either -abc or -tab is specified.

      -abc <fname> (label input)
	The file name for input that is in label format.

      -imx <fname> (input matrix)
	The file name for input that is in mcl native matrix format.

      -o <fname> (output file name)
	The name of the file to write output to.

      -tab <fname> (use tab file)
	This  option causes the output to be printed with the labels found in the tab file.  With -abc this option will, additionally, construct a
	graph only on the labels found in the tab file.  If this option is used in conjunction with -imx the tab domain and the matrix domain  are
	required to be identical.

      --is-directed (compute directed shortest paths)
	The input graph is not transformed and assumed to be directed.	Shortest paths are computed taking this into account.

      --is-undirected (skip symmetrification step)
	The input graph is not transformed and assumed to be undirected.  Shortest paths are computed on the assumption that the input is
	undirected. Use this option only if you are sure the input is undirected and need to have faster execution.

      -write-path <fname> (path matrix file)
      -write-step <fname> (step matrix file)
	The path matrix enumerates the nodes that take part in all shortest paths. The first list contains those nodes that lie at distance  1	of
	the  source  node, the second list contains nodes lying at distance 2, and so on.  The step matrix contains all the edges that make up the
	lattice of shortest paths between the two query nodes.

  SEE ALSO
      mcxio(5), and mclfamily(7) for an overview of all the documentation and the utilities in the mcl family.

  mcx erdos 12-068						      8 Mar 2012							mcx erdos(1)