awk to find maximum and minimum from column and store in other column
Post 303016957 by RudiC on Monday 7th of May 2018 05:39:33 AM
Did you read and consider the comments in your other recent thread(s)? A specification that doesn't need reading thrice or even more often helps people help you.
Why the leading white space in the input, and why is that removed in the output?
Are the key values ($1) in contiguous order, or are they scattered through the file? Is that order to be retained?
Which $2 value should be retained if they differ?
How would you define a minimum and / or maximum of the last 8 chars of IN546474DGDGD00, or their difference?
What to do with the values that have just one record (the last two in the sample)?
Why assign values to fields 7 and 8, and then remove fields 5 and 6 resulting in the new fields being 5 and 6?
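To make the question about IN546474DGDGD00 concrete, here is a minimal sketch of what awk does with the last 8 characters of that string. The function name `last8_as_number` is hypothetical and only serves the illustration; it is not taken from the thread.

```shell
# Hypothetical helper: print the last 8 characters of each input line,
# followed by the number awk derives from them.
last8_as_number() {
    awk '{
        s = substr($0, length($0) - 7)   # last 8 characters of the line
        print s, s + 0                   # s + 0 forces a numeric conversion
    }'
}

echo 'IN546474DGDGD00' | last8_as_number
# prints: 4DGDGD00 4
```

awk converts only the leading numeric prefix, so the substring silently becomes 4. A minimum, maximum, or difference computed on such values is unlikely to be what was intended, which is why the specification needs to say how these strings should be compared.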

Last edited by RudiC; 05-07-2018 at 07:00 AM.
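Several of the questions above (leading white space, key order, keys with a single record) have to be settled before any script can be written. The following is only a sketch under assumed answers, not the thread's solution. It assumes white-space separated input with the key in $1 and a numeric value in $2, keeps keys in first-seen order, and reports a difference of 0 for keys that occur once. The function name `minmax_by_key` and the sample data are hypothetical.

```shell
# Assumed layout (not from the thread): key in $1, numeric value in $2.
minmax_by_key() {
    awk '
        !($1 in min) {                 # first record seen for this key
            order[++n] = $1            # remember first-seen key order
            min[$1] = max[$1] = $2 + 0
            next
        }
        { v = $2 + 0 }
        v < min[$1] { min[$1] = v }
        v > max[$1] { max[$1] = v }
        END {
            for (i = 1; i <= n; i++) {
                k = order[i]
                print k, min[k], max[k], max[k] - min[k]
            }
        }
    '
}

# Made-up sample: scattered keys, leading white space, one single-record key.
printf '%s\n' '  b 5' 'a 2' 'b 9' 'a 7' 'c 4' | minmax_by_key
# prints:
# b 5 9 4
# a 2 7 5
# c 4 4 0
```

With the default field separator awk ignores leading white space, which is one possible explanation for it disappearing in the output.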
 

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

find expression with awk in only one column, and if it fits, print whole column

Hi. How do I find an expression with awk in only one column, and if it fits, then print that whole column. 1 apple oranges 2 bannanas pears 3 cats dogs 4 hesaid shesaid echo "which number:" read NUMBER (user inputs number 2 for this example) awk " /$NUMBER/ {field to search is field... (2 Replies)
Discussion started by: glev2005
2 Replies

2. Shell Programming and Scripting

for each different entry in column 1 extract maximum values from column 2 in unix/awk

Hello, I have 2 columns (1st column has multiple entries but the corresponding values in the column 2 may be the same or different.) however I want to extract unique values for each entry in column 1 by assigning the max value from column 2 SDF4 -0.211654 SDF4 0.978068 ... (1 Reply)
Discussion started by: Diya123
1 Replies

3. UNIX for Dummies Questions & Answers

[Solved] Using awk to obtain minimum of each column (ignoring zeros)

Hi, I have a wide and long dataset which looks as follows:
    0 3 4 2 3 0 2 2 ...
    3 2 4 0 2 2 2 3 ...
    0 3 4 2 0 4 4 4 ...
    3 0 4 2 2 4 2 4 ...
    ....
I would like to obtain the minimum of each column (ignoring zero values) so the output would look like:
    3 2 4 2 2 2 2 2
I have the... (3 Replies)
Discussion started by: kasan0
3 Replies
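One way the request in this excerpt could be approached is sketched below. It uses only the eight values per row that are visible in the excerpt, and the function name `col_min_nonzero` is hypothetical. A column that contains nothing but zeros is reported as 0, which is an assumption.

```shell
# Print the smallest non-zero value of every column on one line.
col_min_nonzero() {
    awk '
        {
            if (NF > ncol) ncol = NF
            for (i = 1; i <= NF; i++) {
                v = $i + 0
                # keep v if it is non-zero and smaller than the current minimum
                if (v != 0 && (!(i in min) || v < min[i]))
                    min[i] = v
            }
        }
        END {
            for (i = 1; i <= ncol; i++)
                printf "%s%s", ((i in min) ? min[i] : 0), (i < ncol ? OFS : ORS)
        }
    '
}

printf '%s\n' '0 3 4 2 3 0 2 2' '3 2 4 0 2 2 2 3' \
              '0 3 4 2 0 4 4 4' '3 0 4 2 2 4 2 4' | col_min_nonzero
# prints: 3 2 4 2 2 2 2 2
```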

4. Homework & Coursework Questions

Find the Maximum value and average of a column

Use and complete the template provided. The entire template must be completed. If you don't, your post may be deleted! 1. The problem statement, all variables and given/known data: I am trying to complete a script which will allow me to find: a) reads a value from the keyboard. (ask the... (4 Replies)
Discussion started by: dstewie
4 Replies

5. Shell Programming and Scripting

Extract minimum/maximum using awk

From the below table I want to print highest value and lowest value using awk script.
    aaa 55 66 96 77
    ggg 22 96 77 23
    ddd 74 58 18 3
    kkk 45 89 47 92
    zzz 34 58 89 92
Thanks, Green
edit by bakunin: it sure is not news to you that you should use CODE-tags, no? And that we do not want such... (3 Replies)
Discussion started by: gwgreen1
3 Replies
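A minimal sketch for this excerpt follows. It assumes that "highest" and "lowest" mean the overall extremes across all numeric columns (fields 2 onward), which the excerpt does not state. The function name `table_min_max` is hypothetical.

```shell
# Scan fields 2..NF of every row and report the overall maximum and minimum.
table_min_max() {
    awk '
        {
            for (i = 2; i <= NF; i++) {
                v = $i + 0
                if (!seen || v > max) max = v
                if (!seen || v < min) min = v
                seen = 1               # at least one value has been read
            }
        }
        END { if (seen) print "max", max, "min", min }
    '
}

printf '%s\n' 'aaa 55 66 96 77' 'ggg 22 96 77 23' 'ddd 74 58 18 3' \
              'kkk 45 89 47 92' 'zzz 34 58 89 92' | table_min_max
# prints: max 96 min 3
```

Initialising from the first value read, rather than from 0, keeps the result correct for tables that contain only negative numbers.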

6. UNIX for Dummies Questions & Answers

Using awk to find and use the maximum value in column of data

Dear Unix Gurus, I have a text file with multiple columns, for example, see sample.txt below
    0 1 301
    1 4 250
    2 6 140
    3 2 610
    7 1 180
I want to find the maximum in, say, column 3, normalise all the values to this maximum value (to 4 decimal places) and spit everything into a new... (2 Replies)
Discussion started by: tintin72
2 Replies
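The normalisation described in this excerpt could be sketched as follows. The rows are buffered in memory so the maximum is known before anything is printed, and the result goes to standard output because the excerpt is cut off before it names a destination. The function name `normalise_col3` is hypothetical.

```shell
# Divide column 3 by its maximum and print it with 4 decimal places.
normalise_col3() {
    awk '
        {
            line[NR] = $0              # buffer the row for the second pass
            val[NR] = $3 + 0
            if (NR == 1 || val[NR] > max) max = val[NR]
        }
        END {
            if (max == 0) exit         # nothing sensible to normalise against
            for (r = 1; r <= NR; r++) {
                n = split(line[r], f)
                f[3] = sprintf("%.4f", val[r] / max)
                out = f[1]
                for (i = 2; i <= n; i++) out = out OFS f[i]
                print out
            }
        }
    '
}

printf '%s\n' '0 1 301' '1 4 250' '2 6 140' '3 2 610' '7 1 180' | normalise_col3
# prints:
# 0 1 0.4934
# 1 4 0.4098
# 2 6 0.2295
# 3 2 1.0000
# 7 1 0.2951
```

For files too large to buffer, reading the file twice (once for the maximum, once to print) is a common alternative.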

7. Shell Programming and Scripting

Find minimum and maximum values based on column with associative array

Hello, I need to find out the minimum and maximum values based on specific column, and then print out the entire row with the max value. Infile.txt: scf6 290173 290416 . + X_047241 T_00113118-1 scf6 290491 290957 . + X_047241 T_00113118-2 scf6 290898 290957 . + X_047241 T_00113119-3 scf6... (2 Replies)
Discussion started by: yifangt
2 Replies

8. Shell Programming and Scripting

awk to select lines with maximum value of each record based on column value

Hello, I want to get the maximum value of each record separated by empty line based on the 3rd column of each row within each record? Input: A1 chr5D 634 7 82 707 A2 chr5D 637 6 82 713 A3 chr5D 637 5 82 713 A4 chr5D 626 1 82 704... (4 Replies)
Discussion started by: yifangt
4 Replies

9. Shell Programming and Scripting

Get maximum per column from CSV file, based on date column

Hello everyone, I am using ksh on Solaris 10 and I'm gathering data in a CSV file that looks like this: 20170628-23:25:01,1,0,0,1,1,1,1,55,55,1 20170628-23:30:01,1,0,0,1,1,1,1,56,56,1 20170628-23:35:00,1,0,0,1,1,2,1,57,57,2 20170628-23:40:00,1,0,0,1,1,1,1,58,58,2... (6 Replies)
Discussion started by: ejianu
6 Replies

10. Programming

Find the minimum value of the column with respect to other column

Hi All, I would like get the minimum value in the certain column with respect to other column. For example, I have a text file like this. ATOM 1 QSS SPH S 0 -2.790 -1.180 -2.282 2.28 2.28 ATOM 1 QSS SPH S 1 -2.915 -1.024 -2.032 2.31 2.31 ATOM 1 ... (4 Replies)
Discussion started by: bala06
4 Replies
RCALC(1)						      General Commands Manual							  RCALC(1)

NAME
rcalc - record calculator

SYNOPSIS
rcalc [ -b ][ -l ][ -p ][ -n ][ -w ][ -u ][ -tS ][ -i format ][ -o format ][ -f source ][ -e expr ][ -s svar=sval ] file ..

DESCRIPTION
Rcalc transforms ``records'' from each file according to the given set of literal and relational information. By default, records are separated by newlines, and contain numeric fields separated by tabs. The -tS option is used to specify an alternate tab character.

A -i format option specifies a template for an alternate input record format. Format is interpreted as a specification string if it contains a dollar sign '$'. Otherwise, it is interpreted as the name of the file containing the format specification. In either case, if the format does not end with a newline, one will be added automatically. A special form of the -i option may be followed immediately by a 'd' or an 'f' and an optional count, which defaults to 1, indicating the number of double or float binary values to read per record on the input file. If the input is byte-swapped, the -iD or -iF options may be substituted. If binary input is specified, no format string or file is needed.

A -o format option specifies an alternate output record format. It is interpreted the same as an input specification, except that the special -od or -of options do not require a count, as this will be determined by the number of output channels in the given expressions. If byte-swapped output is desired, the -oD or -oF options may be substituted.

The -p option specifies "passive mode," where characters that do not match the input format are passed unaltered to the output. This option has no effect unless -i is also specified, and does not make much sense unless -o is also given. With both input and output formats, the passive mode can effectively substitute information in the middle of a file or stream without affecting the rest of the data.

The variable and function definitions in each -f source file are read and compiled. The -e expr option can be used to define variables on the command line.
Since many of the characters in an expression have special meaning to the shell, it should usually be enclosed in single quotes. The -s svar=sval option can be used to assign a string variable a string value. If this string variable appears in an input format, only records with the specified value will be processed.

The -b option instructs the program to accept only exact matches. By default, tabs and spaces are ignored except as field separators. The -l option instructs the program to ignore newlines in the input, basically treating them the same as tabs and spaces. Normally, the beginning of the input format matches the beginning of a line, and the end of the format matches the end of a line. With the -l option, the input format can match anywhere on a line.

The -w option causes non-fatal error messages (such as division by zero) to be suppressed. The -u option causes output to be flushed after each record. The -n option tells the program not to get any input, but to produce a single output record. Otherwise, if no files are given, the standard input is read.

Format files associate names with string and numeric fields separated by literal information in a record. A numeric field is given in a format file as a dollar sign, followed by curly braces enclosing a variable name:

    This is a numeric field: ${vname}

A string variable is enclosed in parentheses:

    This is a string field: $(sname)

The program attempts to match literal information in the input format to its input and assign string and numeric fields accordingly. If a string or numeric field variable appears more than once in the input format, input values for the corresponding fields must match (i.e. have the same value) for the whole record to match. Numeric values are allowed some deviation, on the order of 0.1%, but string variables must match exactly. Thus, dummy variables for "don't care" fields should be given unique names so that they are not all required to take on the same value.
For each valid input record, an output record is produced in its corresponding format. Output field widths are given implicitly by the space occupied in the format file, including the dollar sign and braces. This makes it impossible to produce fields with fewer than four characters. If the -b option is specified, input records must exactly match the template. By default, the character following each input field is used as a delimiter. This implies that string fields that are followed by white space cannot contain strings with white space. Also, numeric fields followed but not preceded by white space will not accept numbers preceded by white space. Adjacent input fields are advisable only with the -b option. Numeric output fields may contain expressions as well as variables. A dollar sign may appear in a literal as two dollar signs ($$).

The definitions specified in -e and -f options relate numeric output fields to numeric input fields. For the default record format, a field is a variable of the form $N, where N is the column number, beginning with 1. Output columns appear on the left-hand side of assignments, input columns appear on the right-hand side.

A variable definition has the form:

    var = expression ;

Any instance of the variable in an expression will be replaced with its definition.

An expression contains real numbers, variable names, function calls, and the following operators:

    + - * / ^

Operators are evaluated left to right. Powers have the highest precedence; multiplication and division are evaluated before addition and subtraction. Expressions can be grouped with parentheses. All values are double precision real.

A function definition has the form:

    func(a1, a2, ..) = expression ;

The expression can contain instances of the function arguments as well as other variables and functions. Function names can be passed as arguments. Recursive functions can be defined using calls to the defined function or other functions calling the defined function.
The variable cond, if defined, will determine whether the current input record produces an output record. If cond is positive, output is produced. If cond is less than or equal to zero, the record is skipped and no other expressions are evaluated. This provides a convenient method for avoiding inappropriate calculations.

The following library of pre-defined functions and variables is provided:

in(n)
    Return the value for input column n, or the number of columns available in this record if n is 0. This is an alternate way to get a column value instead of using the $N notation, and is more flexible since it is programmable. This function is disabled if an input format is used.

if(cond, then, else)
    If cond is greater than zero, then is evaluated, otherwise else is evaluated. This function is necessary for recursive definitions.

select(N, a1, a2, ..)
    Return aN (N is rounded to the nearest integer). This function provides array capabilities. If N is zero, the number of available arguments is returned.

rand(x)
    Compute a random number between 0 and 1 based on x.

floor(x)
    Return largest integer not greater than x.

ceil(x)
    Return smallest integer not less than x.

sqrt(x)
    Return square root of x.

exp(x)
    Compute e to the power of x (e approx = 2.718281828).

log(x)
    Compute the logarithm of x to the base e.

log10(x)
    Compute the logarithm of x to the base 10.

PI
    The ratio of a circle's circumference to its diameter.

recno
    The number of records recognized thus far.

outno
    The number of records output thus far (including this one).

sin(x), cos(x), tan(x)
    Trigonometric functions.

asin(x), acos(x), atan(x)
    Inverse trigonometric functions.

atan2(y, x)
    Inverse tangent of y/x (range -pi to pi).

EXAMPLE
To print the square root of column two in column one, and column one times column three in column two:

    rcalc -e '$1=sqrt($2);$2=$1*$3' inputfile > outputfile

AUTHOR
Greg Ward

BUGS
String variables can only be used in input and output formats and -s options, not in definitions. Tabs count as single spaces inside fields.

SEE ALSO
cnt(1), ev(1), getinfo(1), icalc(1), rlam(1), tabfunc(1), total(1)

RADIANCE 4/6/99 RCALC(1)
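For readers coming from the awk thread above, the rcalc EXAMPLE can be approximated in awk. This is an analogue written for comparison, not part of rcalc. It assumes tab-separated input, matching rcalc's default record format, and the function name `rcalc_example_in_awk` is hypothetical. Number formatting may differ from rcalc's, since awk's print uses its own OFMT setting.

```shell
# awk analogue of:  rcalc -e '$1=sqrt($2);$2=$1*$3'
# Output column 1 = sqrt(input column 2)
# Output column 2 = input column 1 * input column 3
rcalc_example_in_awk() {
    awk -F '\t' -v OFS='\t' '{ print sqrt($2), $1 * $3 }'
}

printf '2\t9\t5\n' | rcalc_example_in_awk
# prints 3 and 10, separated by a tab
```

As in rcalc, the right-hand sides refer to input columns, so the $1 used in the product is the original input value and not the freshly computed square root.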