Post 302772780 by torchij on Tuesday 26th of February 2013 08:34:10 PM
awk solution for taking bins

Hi all, I'm looking for an awk solution for binning a data set.
For example, if I have two columns of data that I want to use for a scatter plot, and the file contains 5 million lines, how can I take averages of every 100 points, 1,000 points, 10,000 points, and so on?
The idea is to bin the 5,000,000 points and so reduce the density.

Code:
$ cat largefile.txt
x        y
1       45
2       46
3       87
4       34
5       36
6       36
7       23
...     ...
5mil    228

How do I take bins of y every "n" points?

Thanks in advance.
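
One possible approach, sketched under a few assumptions rather than taken from the thread: accumulate running sums in awk and print an average each time "n" data rows have been read. The function name bin_average and the variable names are made up for illustration. The sketch assumes the first line of each file is a header (as in the sample above), that columns are whitespace-separated, and that averaging x along with y is wanted so each bin has a position on the scatter plot.

```shell
# bin_average N [FILE...]: print the mean x and mean y of each run of N data rows.
# The function name and the variable names are illustrative, not from the thread.
bin_average() {
    n=$1
    shift
    awk -v n="$n" '
        FNR == 1 { next }                 # assumption: line 1 is the "x y" header
        {
            sumx += $1; sumy += $2; count++
            if (count == n) {             # bin is full: emit its averages
                print sumx / n, sumy / n
                sumx = sumy = count = 0
            }
        }
        END {                             # leftover rows form a smaller last bin
            if (count > 0) print sumx / count, sumy / count
        }
    ' "$@"
}
```

It could then be called as bin_average 1000 largefile.txt > binned.txt, changing 1000 to whatever bin size is needed. Because only three running totals are kept, memory use stays constant no matter how many lines the file has. If a partial final bin is not wanted, the END block can simply be dropped.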
 
