Hi all, I'm looking for an awk solution for taking bins of a data set.
For example, if I have two columns of data that I wish to use for a scatter plot, and the file contains 5 million lines, how can I take averages of every 100, 1,000, 10,000, etc. points?
The idea is to take bins of the 5,000,000 points and reduce the density.
input
output example 1 (using bins of 3 - average every third point)
output example 2 (using bins of 2 - average every second point)
Not sure what the best tool is!
Thanks for the response. Here is an example; I'll focus on just the second column.
input
The purpose is to reduce the data for an x,y scatterplot, because the file is millions of lines long. Instead of plotting every point, I want to take an average of every "n" points and plot that one number. "Bin" might not be the correct word; perhaps "rolling average"? For example, a bin of 3 would break the data down like so:
Output would then be:
For the case where bin is 2
Finally, doing this for both the x and y axis (the original file), for bin of 3:
Input:
Many thanks, I hope this is more clear
Torch
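A minimal awk sketch of the non-overlapping binning described above: sum both columns, and every n-th record print the averages and reset. The sample file and bin size here are stand-ins for the real 5M-line data.

```shell
# Toy stand-in for the real x,y file
printf '%s\n' "1 10" "2 20" "3 30" "4 40" "5 50" "6 60" > data.txt

# Average every n consecutive rows of both columns (bin size set with -v n=...)
awk -v n=3 '{ sx += $1; sy += $2 }
    NR % n == 0 { print sx/n, sy/n; sx = sy = 0 }' data.txt
```

With n=3 this prints `2 20` and `5 50`. Note that a trailing partial bin (when the line count isn't a multiple of n) is silently dropped; an END block could emit it if you want it.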
I would like to create bins to get a histogram with totals and percentages, e.g. starting from 0.
If possible, I'd like to set the minimum and maximum values of the bins (in my case min=0 and max=20).
Input file
8 5
10 1
11 4
12 4
12 4
13 5
16 7
18 9
16 9
17 7
18 5
19 5
20 1
21 7 (10 Replies)
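One way to sketch this in awk, assuming the first column holds the value to bin, the bin width is 1, and values outside [min, max] are clamped into the end bins (all assumptions, since the thread doesn't say how out-of-range values like 21 should be treated):

```shell
# Sample input from the thread; binning on column 1
printf '%s\n' "8 5" "10 1" "11 4" "12 4" "12 4" "13 5" "16 7" "18 9" \
              "16 9" "17 7" "18 5" "19 5" "20 1" "21 7" > hist_in.txt

# One bin per integer value from min to max; out-of-range values are clamped
awk -v min=0 -v max=20 '
    { v = int($1); if (v < min) v = min; if (v > max) v = max
      cnt[v]++; total++ }
    END { for (i = min; i <= max; i++)
              printf "%d\t%d\t%.1f%%\n", i, cnt[i], 100 * cnt[i] / total }' hist_in.txt
```

Empty bins print as count 0 with 0.0%, since an unset awk array element is numerically zero.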
I wish to use awk to do something akin to: select all 2D data with 1<$1<2 and -7.5<$2<-6.5.
But it's not working:
awk 'END {print ($1<=2&&$1>=1&&$2<=-6.5&&$2>=-7.5)}' bla
Data:
-1.06897 -8.04482 -61.469
-1.13613 -8.04482 -61.2271
-1.00182 -8.04482 -61.2081
-1.06897 -8.13518 -60.8544... (2 Replies)
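The problem with the command above is the END block: it runs once, after the last line has been read, so the test is only applied to the final record, and printing the expression prints the boolean 0/1 rather than the data. Putting the condition in the main pattern applies it to every record, and a true pattern prints the whole line by default. A small demo (with made-up rows, since none of the sample rows actually fall in the stated ranges):

```shell
# Demo input: only the first row falls inside both ranges
printf '%s\n' "1.5 -7.0 10" "3.0 -7.0 10" "1.5 -9.0 10" > bla

# Test the ranges on every record; a true pattern prints the line
awk '$1>=1 && $1<=2 && $2>=-7.5 && $2<=-6.5' bla
```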
# more minusf.awk
#!/bin/awk -f
BEGIN {
    FS = ":"
}
{
    if ($2 == "") {
        print $1 ": no password!"
    }
}
# ./minusf.awk aa aa aa aa
awk: can't open aa (6 Replies)
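The script itself is fine; the error means awk was handed input files named `aa` that don't exist in the current directory. Everything after the script name on the command line is treated as an input file. A sketch with a file that does exist (using `awk -f` so the `/bin/awk` shebang path doesn't matter; sample data is made up):

```shell
# Same logic as minusf.awk, invoked portably with -f
cat > minusf.awk <<'EOF'
BEGIN { FS = ":" }
$2 == "" { print $1 ": no password!" }
EOF

printf '%s\n' "root:x" "guest:" > users.txt   # sample passwd-style input
awk -f minusf.awk users.txt
```

This prints `guest: no password!` for the entry whose second field is empty.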
Hi, I want to print the 2nd column value with the below script. I need to take as input the string to search for and the file name. How can I take these two as inputs? Using the read command? I'm getting an error with the below script.
echo "enter SID"
read SID
echo "enter filename"
read filename... (8 Replies)
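`read` is the right tool for the prompts; the usual cause of errors here (the failing script is cut off, so this is a guess) is referencing the shell variables inside the single-quoted awk program, where they don't expand. Passing them with `-v` avoids that. A sketch with made-up sample data and field layout:

```shell
# Sample file: SID in field 1, the value we want in field 2 (assumed layout)
printf 'ORCL 11g\nTEST 12c\n' > db.txt

SID=ORCL          # in the real script these come from: read SID; read filename
filename=db.txt
awk -v sid="$SID" '$1 == sid { print $2 }' "$filename"
```

This prints `11g`; quoting `"$filename"` keeps file names with spaces intact.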
Hi all,
I think the result I'm getting is wrong when using the following awk command:
colval=$(awk 'FNR>1 && NR==FNR{a=$4;next;} FNR>1 {a+=$4; print $2"\t"a/3}'
filename_f.tsv filename_f2.tsv filename_f3.tsv)
echo $colval >> Result.tsv
it's applying the condition twice: the first result... (5 Replies)
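The command above prints once per data line of the second and third files, because the `print` sits in the per-line action. If the goal is one averaged value per key across all three files (my reading of the intent), accumulating per key and printing once in END avoids the repeats. A sketch with toy stand-ins for the real files:

```shell
# Toy stand-ins: header line, then key in column 2 and value in column 4
printf 'h1 h2 h3 h4\nr A c 3\n' > filename_f.tsv
printf 'h1 h2 h3 h4\nr A c 6\n' > filename_f2.tsv
printf 'h1 h2 h3 h4\nr A c 9\n' > filename_f3.tsv

# Skip each header (FNR>1), sum column 4 per key, print each average once
awk 'FNR > 1 { sum[$2] += $4; n[$2]++ }
     END     { for (k in sum) print k "\t" sum[k]/n[k] }' \
    filename_f.tsv filename_f2.tsv filename_f3.tsv
```

Dividing by the observed count `n[$2]` instead of a hard-coded 3 also stays correct if a key is missing from one file.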
cat doc | nawk -v da="${date}" '$23>199 {print $0 > "doc"+da+".txt"}'
Every time I run this (it needs to run every day), I want it to create a new file "doc_01 Aug.txt".
Basically, i want to create a new file with date appended in it.
The above command is creating a file with name "0".... (4 Replies)
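The file named "0" comes from `+`: in awk, `+` forces numeric addition, so `"doc"+da+".txt"` evaluates to 0 and that becomes the file name. Strings concatenate by juxtaposition, and the filename expression after `>` should be parenthesized. A sketch (using awk; the original nawk on Solaris behaves the same here, and the 23-field sample input is made up):

```shell
# Sample input: one line whose 23rd field exceeds 199
awk 'BEGIN { for (i = 1; i <= 22; i++) printf "x "; print 200 }' > doc

da=$(date +%d_%b)        # e.g. 01_Aug
awk -v da="$da" '$23 > 199 { print $0 > ("doc_" da ".txt") }' doc
ls doc_*.txt
```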
Hi, I am working with files containing 2 columns, in which I need to come up with the frequency/count of values in col. 2 falling within specific binned values of col. 1. The contents of a sample file are shown below:
15 12.5
15 11.2
16 0.2
16 1.4
17 1.6
18 4.5
17 5.6
12 8.6
11 7.2
9 ... (13 Replies)
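A sketch of the per-bin count, assuming a bin width of 5 on column 1 with bins labelled by their range (the width is my assumption, and the last sample line's second value is truncated in the post, so a placeholder is used):

```shell
# Sample of the two-column input; the final "9" line's value is a placeholder
printf '%s\n' "15 12.5" "15 11.2" "16 0.2" "16 1.4" "17 1.6" \
              "18 4.5" "17 5.6" "12 8.6" "11 7.2" "9 3.0" > pairs.txt

# Count column-2 entries per column-1 bin (assumed width 5, labelled lo-hi)
awk -v w=5 '{ b = int($1 / w) * w; cnt[b]++ }
    END { for (lo in cnt) printf "%d-%d\t%d\n", lo, lo + w - 1, cnt[lo] }' pairs.txt |
sort -n
```

For this sample the output is `5-9 1`, `10-14 2`, `15-19 7` (tab-separated); `sort -n` orders the bins, since awk's `for (k in arr)` iteration order is unspecified.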
Hello! Well, I searched and wasn't able to find a specific example of my dilemma, so hopefully someone could assist? Or maybe there was an example but I missed it?
I have two files:
file1 = order data file
file2 = list of 65,000+ order numbers
I would like to extract from 'file1' any... (5 Replies)
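The standard awk idiom for this: load the order-number list into an array on the first pass (`NR==FNR`), then print any line of the data file whose key is in the array. This assumes the order number is the first field in both files; sample data below is made up:

```shell
# file1: order data (order number assumed in field 1); file2: wanted orders
printf '%s\n' "1001 widget 5" "1002 gadget 3" "1003 widget 1" > file1
printf '%s\n' "1001" "1003" > file2

# First pass (file2) fills the lookup array; second pass (file1) filters
awk 'NR == FNR { want[$1]; next } $1 in want' file2 file1
```

The array lookup is hash-based, so this stays fast even with 65,000+ order numbers.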
I am writing an awk script that gathers certain data from certain fields. I needed an awk solution for this, because it will later become a function in the script.
I have the following data that I need output on a single line, but records span multiple lines and records are not... (7 Replies)
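The record layout is cut off above, so this assumes the common case of records separated by blank lines. Paragraph mode (`RS=""`) reads one whole record per block, and reassigning `$1=$1` rebuilds the record with `OFS` so all its fields land on one line:

```shell
# Two blank-line-separated records (made-up stand-ins for the real data)
printf 'a b\nc d\n\ne f\ng\n' > recs.txt

# RS="" reads each blank-line-separated block as one record;
# $1=$1 forces awk to rejoin the fields with OFS on a single line
awk 'BEGIN { RS = ""; OFS = " " } { $1 = $1; print }' recs.txt
```

If the real records use some other delimiter, RS (and FS) would need to match it instead.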
A few questions: I'm trying to use Bash (although I'm not against using awk) to accomplish a few things, but I'm stumped on a few points.
I'm learning most of the basics quickly, but there are a few things I can't figure out.
1. I'm trying to count the number of .txt files in a... (3 Replies)
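For the counting part, a sketch using `find` rather than parsing `ls` output, which breaks on file names containing newlines (`-maxdepth` is not strictly POSIX but is supported by GNU and BSD find):

```shell
# Demo directory with two .txt files and one other file
mkdir -p txt_demo && cd txt_demo && touch a.txt b.txt notes.log

# Count regular .txt files in the current directory only
find . -maxdepth 1 -type f -name '*.txt' | wc -l
```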