Post 302543104 by rbredereck on Friday 29th of July 2011 01:18:02 PM
Computing average values from multiple text files

Hi,
First, I have searched the forum for this, but I could not find the right answer. (There were some similar threads, but I was not sure how to adapt the ideas.)

Anyway, I have a fairly natural problem: I am given several text files. All files contain the same number of lines and the same number of columns. Interpreting the files as tables, I want to compute a file in which each cell contains the average of the corresponding cells from my files.

There is one complication: the cells may contain numbers or the string "n/a", which means something like "not computed". When some files have an "n/a" at some cell, the result file should contain the average of the non-"n/a" values followed by (separated by some unique symbol like "~") the number of "n/a" values.

For example:
File 1
Code:
1 2   3
1 n/a 3

File 2
Code:
3 2   1
3 n/a 1

File 3
Code:
2 2 n/a
5 2 5

Now I want to compute the following:

Resultfile
Code:
2 2   2~1
3 2~2 3

Of course, I could implement this in some high-level programming language, but having it as a script would be much more convenient in my application.

I think this should be easy for experts in awk or similar tools. Unfortunately, I don't see an easy solution.

Thanks in advance.
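
One possible approach, as a minimal sketch rather than a definitive solution: awk resets FNR to 1 for each input file, so the pair (FNR, column) identifies the same cell in every file and can be used as an array key. The function name avg_cells is made up for illustration. The sketch assumes whitespace-separated fields, and it assumes that a cell that is "n/a" in every file should be printed as "n/a" followed by "~" and the count, which the description above does not specify.

```shell
#!/bin/sh
# avg_cells: hypothetical helper name, not from the original post.
# Usage: avg_cells file1 file2 ...
avg_cells() {
    awk '
    {
        # FNR restarts at 1 for every input file, so (FNR, i) names the
        # same table cell in each file.
        for (i = 1; i <= NF; i++) {
            if ($i == "n/a")
                na[FNR, i]++          # count the missing entries
            else {
                sum[FNR, i] += $i     # accumulate the numeric entries
                cnt[FNR, i]++
            }
        }
        if (FNR > rows) rows = FNR
        if (NF  > cols) cols = NF
    }
    END {
        for (r = 1; r <= rows; r++) {
            line = ""
            for (c = 1; c <= cols; c++) {
                # Average of the numeric values, or "n/a" if there were none
                # (the all-"n/a" case is an assumption of this sketch).
                cell = ((r, c) in cnt) ? sum[r, c] / cnt[r, c] : "n/a"
                if ((r, c) in na) cell = cell "~" na[r, c]
                line = line (c > 1 ? " " : "") cell
            }
            print line
        }
    }' "$@"
}
```

Called as avg_cells file1 file2 file3 > resultfile on the three example files, this should write the lines "2 2 2~1" and "3 2~2 3", with single spaces between cells instead of the aligned columns shown in the example.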
 
