Sorry for not providing sample input. I'm using Cygwin. The data values range from 0 to 100,000 and should be output to 2 decimal places.
The data is 83,000 lines, so not very big.
Yes, the code calculates the average correctly, but only for the 7th column. I should also
populate arr as arr[$1" "$2" "$3" "$4" "$5] so that all the key variables are delimited.
The sample input has 4 data columns; the original data has many more, starting at column 7 and running to NF ($22).
Output should be
The code you showed us in your 1st post in this thread skips data in the 1st line of your file (which I assumed was intended to skip over a header line). But I don't see any headers in this sample. Is there a header, or not? If there is a header, should it be copied to the output?
Is the number of fields constant in an input file, or can it vary from line to line?
It looks like there is a leading space in your sample input and output. Is a leading space required in your output?
Do you want 2 decimal places in all computed output fields, or do you want values to be printed without decimal places (as in your sample output) in cases where the computed result is an integral value?
You say you want to calculate averages for fields 7 through NF, but your sample data also calculates the average for field 6. Is field 6 supposed to be ignored in calculations and removed from the output, or is field 6 to be averaged as well as fields 7 through NF?
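Until those questions are answered, here is only a minimal sketch of one possible approach, not a final solution. It assumes there is no header line, that every line has the same number of fields, that field 6 is neither averaged nor printed, and that 2 decimal places are wanted everywhere. The function name avg_by_key is made up for illustration.

```shell
# avg_by_key: average fields 7..NF for each distinct combination of fields 1-5.
# Assumptions (not confirmed in the thread): no header line, a constant number
# of fields per line, field 6 is skipped, groups print in first-seen order.
avg_by_key() {
  awk '
  {
    key = $1 " " $2 " " $3 " " $4 " " $5
    if (!(key in count)) order[++n] = key   # remember first-seen order
    count[key]++
    for (i = 7; i <= NF; i++) sum[key, i] += $i
    if (NF > nf) nf = NF                    # widest line seen so far
  }
  END {
    for (k = 1; k <= n; k++) {
      key = order[k]
      printf "%s", key
      for (i = 7; i <= nf; i++) printf " %.2f", sum[key, i] / count[key]
      printf "\n"
    }
  }' "$@"
}
```

It can be called as avg_by_key input.txt > output.txt, where both file names are placeholders. If field 6 should be averaged too, the two loops would start at 6 instead of 7.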
Hi,
I have a space delimited text file that looks like the following:
Aa 100 200
Bb 300 100
Cc X 500
Dd 600 X
Basically, I want to take the average of columns 2 and 3 and print it in column 4. However if there is an X in either column 2 or 3, I want to print the non-X value. Therefore... (11 Replies)
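A minimal sketch of the rule described above (average of columns 2 and 3, or the non-X value when one of them is X). The preview is cut off, so anything beyond that rule is an assumption: here a line with X in both columns gets X in column 4, and the function name avg_or_value is hypothetical.

```shell
# avg_or_value: append a 4th column holding the average of columns 2 and 3,
# or the non-X value when exactly one of them is "X".
# Assumption: if both columns are "X", the 4th column is "X" as well.
avg_or_value() {
  awk '
  {
    if ($2 == "X" && $3 == "X") res = "X"
    else if ($2 == "X")         res = $3
    else if ($3 == "X")         res = $2
    else                        res = ($2 + $3) / 2
    print $1, $2, $3, res
  }' "$@"
}
```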
Hi, I need help with the awk command.
I have a folder with approx. 500 files, each one with two columns, and I want to print to a new file the average of column 1, the average of column 2, and the name of each file.
Input files are:
File-1:
100 99
20 99
50 99
50 99
File-2:
200 85... (3 Replies)
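One way this kind of per-file average could be sketched in awk. The output layout (average of column 1, average of column 2, file name, with 2 decimal places) is an assumption, since the preview does not show the expected output, and the function name file_averages is made up.

```shell
# file_averages: for each input file, print the average of column 1,
# the average of column 2, and the file name.
# Assumptions: two numeric columns per line; output format is a guess.
file_averages() {
  awk '
  FNR == 1 && NR > 1 { report() }          # a new file starts: flush the previous one
  { file = FILENAME; s1 += $1; s2 += $2; n++ }
  END { if (n) report() }                  # flush the last file
  function report() {
    printf "%.2f %.2f %s\n", s1 / n, s2 / n, file
    s1 = s2 = n = 0
  }' "$@"
}
```

Run from inside the folder, it could be called as file_averages File-* > ../averages.txt, where the output path is a placeholder chosen so that the result file is not picked up as input.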
My File looks like:
"|" -> Field separator
A|B|C|100|1000
D|E|F|1|2
G|H|I|0|7
D|E|F|1|2
A|B|C|10|10000
G|H|I|0|7
A|B|C|1|100
D|E|F|1|2
I need to do a SUM on Col. 4 and Col. 5 by grouping on Col. 1, 2 & 3
My expected output is:
A|B|C|111|11100 (2 Replies)
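A minimal sketch of a grouped sum for this layout. The sample rows have only five fields, so the sketch sums fields 4 and 5 per group of fields 1 to 3, which reproduces the quoted expected line A|B|C|111|11100. Printing groups in first-seen order and the function name sum_by_group are assumptions.

```shell
# sum_by_group: sum fields 4 and 5 for each distinct combination of
# fields 1-3 in a "|"-separated file.
# Assumption: groups are printed in the order they first appear.
sum_by_group() {
  awk -F"|" -v OFS="|" '
  {
    key = $1 OFS $2 OFS $3
    if (!(key in s4)) order[++n] = key     # remember first-seen order
    s4[key] += $4
    s5[key] += $5
  }
  END {
    for (k = 1; k <= n; k++) print order[k], s4[order[k]], s5[order[k]]
  }' "$@"
}
```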
I have the following format of input from multiple files
File 1
24.01 -81.01 1.0
24.02 -81.02 5.0
24.03 -81.03 0.0
File 2
24.01 -81.01 2.0
24.02 -81.02 -5.0
24.03 -81.03 10.0
I need to scan through the files and when the first 2 columns match I... (18 Replies)
Hi Friends,
I have files with columns like this. This sample input below is partial.
Please check below for main file link. Each file will have only two rows.
... (8 Replies)
Hi forum members,
I'm trying to get an average of multiple columns in a csv file using awk. A small example of my input data is as follows:
cu,u3o8,au,ag
-9,20,-9,3.6
0.005,30,-9,-9
0.005,50,10,3.44
0.021,-9,8,3.35
The following code seems to do most of what I want
gawk -F","... (6 Replies)
Dear Experts,
I have an input file that is comma separated and has 4 columns, like below:
BRAND,COUNTRY,MODEL,COUNT
NIKE,USA,DUMMY,5
NIKE,USA,ORIGINAL,10
PUMA,FRANCE,DUMMY,20
PUMA,FRANCE,ORIGINAL,15
ADIDAS,ITALY,DUMMY,50
ADIDAS,ITALY,ORIGINAL,50
SPIKE,CHINA,DUMMY,1O
And expected output add... (2 Replies)
hello, I have three files in the following order
==> File1 <==
1 20977000 20977000 A C 1.00 0,15 15 45
1 115829313 115829313 G A 0.500 6,7 13 99
==> File2 <==
1 20977000 20977000 A C 1.00 0,13 13 39
1 115829313 ... (5 Replies)
I have files that have the following columns
chr pos ref alt sample 1 sample 2 sample 3
chr2 179644035 G A 1,107 0,1 58,67
chr7 151945167 G T 142,101 100,200 500,700
chr13 31789169 CTT CT,C 6,37,8 0,0,0 15,46,89
chr22 ... (3 Replies)