Post 302772780 by torchij on Tuesday 26th of February 2013 08:34:10 PM
awk solution for taking bins

Hi all, I'm looking for an awk solution for binning a data set.
For example, suppose I have two columns of data for a scatter plot, about 5 million lines long. How can I take the average of every 100 points, or every 1,000, 10,000, etc.?
The idea is to bin the 5,000,000 points to reduce the density of the plot.

Code:
$ cat largefile.txt
x        y
1       45
2       46
3       87
4       34
5       36
6       36
7       23
...     ...
5mil    228

In short: how can I bin y every "n" points?

Thanks in advance.
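
One possible approach, offered as a rough sketch and not a definitive answer: accumulate x and y over each run of n consecutive rows and print the two averages when the run is full. The function name `bin_avg` is made up for illustration. The sketch assumes a single header line, whitespace-separated numeric columns, and that averaging x along with y is what the plot needs; any rows left over at the end are averaged as a smaller final bin.

```shell
# bin_avg N FILE: average x and y over consecutive groups of N data rows.
# Hypothetical helper; assumes one header line and two numeric columns.
bin_avg() {
    awk -v n="$1" '
        NR == 1 { print; next }                  # pass the header through unchanged
        { sx += $1; sy += $2; c++ }              # accumulate the current bin
        c == n  { print sx / c, sy / c; sx = sy = c = 0 }   # bin full: emit and reset
        END     { if (c > 0) print sx / c, sy / c }         # flush a partial last bin
    ' "$2"
}
```

It could then be called as `bin_avg 1000 largefile.txt > binned.txt`, changing the first argument to try other bin sizes. awk reads the file one line at a time, so memory use should not grow with the file length.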
 
