Sponsored Content
Top Forums UNIX for Beginners Questions & Answers awk to average matching lines in file Post 303041704 by cmccabe on Monday 2nd of December 2019 09:22:44 AM
Old 12-02-2019
awk to average matching lines in file

The awk below executes and is close (producing the first 4 columns in desired). However, when I add the sum of $7, I get nothing returned. Basically, I am trying to combine all the matching $4 in f1 and output them with the average of $7 in each match. Thank you Smilie.

f1
Code:
chr9	135804143	135804145	chr9:135804143-135804145	TSC1	1	128
chr9	135804143	135804145	chr9:135804143-135804145	TSC1	2	125
chr9	135819919	135819922	chr9:135819919-135819922	TSC1	1	0
chr9	135819919	135819922	chr9:135819919-135819922	TSC1	2	0
chr9	135819919	135819922	chr9:135819919-135819922	TSC1	3	0
chr16	2097885	2097890	chr16:2097885-2097890	TSC2	1	249
chr16	2097885	2097890	chr16:2097885-2097890	TSC2	2	245
chr16	2097885	2097890	chr16:2097885-2097890	TSC2	3	243
chr16	2097885	2097890	chr16:2097885-2097890	TSC2	4	237
chr16	2097885	2097890	chr16:2097885-2097890	TSC2	5	237

desired tab-delimeted
Code:
chr9	135804143	135804145 	TSC1	126
chr9	135819919	135819922 	TSC1	0
chr16	2097885	2097890 	TSC2	242

awk
Code:
awk '
# print from stored values
function prt(){
  print p1 ":" (p6start==1 ? p2 : p2+p6start) "-" p2+p6, "\t" p5
}
($4!=p4 || $6!=p6+1) {
# new sequence, print the previous sequence
  if (NR>1) prt()
  p6start=$6  
}
{
# store the values that for later
  p1=$1
  p2=$2
  p4=$4
  p5=$5
  p6=$6
}
END { prt() }
' f1 | awk -F"[:-]" ' { print $1 "\t" $2 "\t" $3 "\t" $4}' | awk -v N=7 '{ sum += $N } END { if (NR > 0) print sum / NR }' > out

 

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Print lines matching value(s) in other file using awk

Hi, I have two comma separated files. I would like to see field 1 value of File1 exact match in field 2 of File2. If the value matches, then it should print matched lines from File2. I have achieved the results using cut, paste and egrep -f but I would like to use awk as it is efficient way and... (7 Replies)
Discussion started by: SBC
7 Replies

2. Shell Programming and Scripting

Finding lines matching the Pattern and their previous lines in a file

Hi, I am trying to locate the occurences of certain pattern like 'Possible network disconnect' in a text file. I can get the actual lines matching the pttern using: grep -w 'Possible network disconnect' file_name. But I am more interested in getting the timing of these events which are... (7 Replies)
Discussion started by: sagarparadkar
7 Replies

3. Shell Programming and Scripting

awk print non matching lines based on column

My item was not answered on previous thread as code given did not work I wanted to print records from file2 where comparing column 1 and 16 for both files find rows where column 16 in file 1 does not match column 16 in file 2 Here was CODE give to issue ~/unix.com$ cat f1... (0 Replies)
Discussion started by: sigh2010
0 Replies

4. Emergency UNIX and Linux Support

[Solved] AWK to parse adjacent matching lines

Hi, I have an input file like F : 0.1 : 0.002 P : 0.3 : 0.004 P : 0.5 : 0.008 P : 0.1 : 0.005 L : 0.05 : 0.02 P: 0.1 : 0.006 P : 0.01 : 0.08 F : 0.02 : 0.08 Expected output: (2 Replies)
Discussion started by: vasanth.vadalur
2 Replies

5. Shell Programming and Scripting

awk file comparison, x lines after matching as output

Hello, I couldn't find anything on the Forum that would help me to solve this problem. Could any body help me process below data using awk? I have got two files: file1: Worker1: Thomas Position: Manager Department: Sales Salary: $5,000 Worker2: Jason Position: ... (5 Replies)
Discussion started by: killerbee
5 Replies

6. Shell Programming and Scripting

How to add lines of a file and average them

I'm reading in numbers from a file and trying to add them together. Here is the code so far. I know the 1+2+3.... part is wrong. The file has five numbers in it with each number on its own line. The numbers are decimals if that matters. Thanks. while read EachLine do echo $EachLine done <... (6 Replies)
Discussion started by: AxlVanDamme
6 Replies

7. Shell Programming and Scripting

Help With AWK Matching and Re-printing Lines

Hi All, I'm looking to use AWK to pattern match lines in XML file - Example patten for below sample would be /^<apple>/ The sample I wrote out is very basic compared to what I am actually working with but it will get me started I would like to keep the matched line(s) unchanged but have them... (4 Replies)
Discussion started by: rhoderidge
4 Replies

8. Shell Programming and Scripting

awk - matching on 2 columns for differents lines

Given this file (I separated them in block to make my explanation clearer): 92157768877;Sof_deme_Fort_Email_am_%yyyy%%mm%%dd%;EMAIL;20/02/2015;1;0;0 92157768877;Sof_trav_Fort_Email_am_%yyyy%%mm%%dd%;EMAIL;20/02/2015;1;0;0 91231838895;Sof_deme_faible_Email_am;EMAIL;26/01/2015;1 0;0... (1 Reply)
Discussion started by: Andy_K
1 Replies

9. Shell Programming and Scripting

awk to combine matching lines in file

I am trying to combine all matching lines in the tab-delimited using awk. The below runs but no output results. Thank you :). input chrX 110925349 110925532 ALG13 chrX 110925349 110925532 ALG13 chrX 110925349 110925532 ALG13 chrX 47433390 47433999 SYN1... (3 Replies)
Discussion started by: cmccabe
3 Replies

10. Shell Programming and Scripting

awk to average field if matching string in another

In the awk below I am trying to get the average of the sum of $7 if the string in $4 matches in the line below it. The --- in the desired out is not needed, it is just to illustrate the calculation. The awk executes and produces the current out. I am not sure why the middle line is skipped and the... (2 Replies)
Discussion started by: cmccabe
2 Replies
NUMSUM(1)						User Contributed Perl Documentation						 NUMSUM(1)

NAME
numsum - numsum program file SYNOPSIS
numsum [-iIcdhrsvxy] <FILE> | numsum [-iIcdhrsvxy] (Input on STDIN from pipeline.) numsum [-iIcdhrsvxy] (Input on STDIN. Use Ctrl-D to stop.) DESCRIPTION
numsum will take all the numbers on stdin and return the sum of those numbers. Currently it only processes the first number on each line. Besides positive numbers, it also handles negative numbers and numbers with decimals. OPTIONS
-i Only return the integer portion of the final sum. -I Only return the decimal portion of the final sum. -c Print out the sum of each column. -r Print out the sum of each row. -x <n> Specify a comma seperated list of columns to print. -y <n> Specify a comma seperated list of rows to print. -s <string> Specify a string to use as a seperator for columns. This defaults to be consecutive whitespace (s+). -h Help: You're looking at it. -V Increase verbosity. -d Debug mode. For developers -q Quiet mode, don't print any warnings. EXAMPLES
Simply add up the numbers in a file. $ numsum numbers.txt 4315 Enter your own numbers on STDIN. The last number is the answer. $ numsum 4 21 98 100 223 Use it in a command pipeline. $ ls -1s | grep .mp3 | numsum -c -x 5 72288 Add up the total byte count in a http log file. $ cat access_log | awk {'print $10'} numsum or numsum -c -x 10 access_log Add up the columns of numbers of a file. $ cat columns 1 6 11 16 21 2 7 12 17 22 3 8 13 18 23 4 9 14 19 24 5 10 15 20 25 $ numsum -c columns 15 40 65 90 115 Add up the 1st, 2nd and 5th columns only. $ numsum -c -x 1,2,5 columns 15 40 115 Add up the rows of numbers of a file. $ numsum -r columns 55 60 65 70 75 Add up the 2nd and 4th rows. $ numsum -r -y 2,4 columns 60 70 SEE ALSO
numaverage(1), numbound(1), numinterval(1), numnormalize(1), numgrep(1), numprocess(1), numrandom(1), numrange(1), numround(1) COPYRIGHT
numsum is part of the num-utils package, which is copyrighted by Suso Banderas and released under the GPL license. Please read the COPYING and LICENSE files that came with the num-utils package Developers can read the GOALS file and contact me about providing submitions or help for the project. MORE INFO
More info on numsum can be found at: http://suso.suso.org/programs/num-utils/ perl v5.10.1 2009-10-31 NUMSUM(1)
All times are GMT -4. The time now is 06:46 AM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy