Problem with rounding using lrint
Post 302985096 by Don Cragun on Friday 4th of November 2016 05:29:41 PM
You need to determine what rounding rules you want to use for tie-breaking cases (see Wikipedia's discussion on rounding), and you need to realize that even in double-precision floating point, the result of a calculation is not always exact (even when the same calculation done in decimal arithmetic would be exact). For example, the awk program (which uses double-precision floating point for its calculations):
Code:
printf '.1 .5\n.3 .5\n' | awk '{printf("%.40f\n", $1 * $2)}'

produces the output:
Code:
0.0500000000000000027755575615628913510591
0.1499999999999999944488848768742172978818

not the output you might expect:
Code:
0.0500000000000000000000000000000000000000
0.1500000000000000000000000000000000000000

This shows two cases where decimal arithmetic produces exact results, but binary arithmetic produces one result that is a little bit high and one that is a little bit low.
 

FLOAT(3)                 BSD Library Functions Manual                 FLOAT(3)

NAME
     float -- description of floating-point types available on OS X and iOS

DESCRIPTION
     This page describes the available C floating-point types. For a list of math library functions that operate on these types, see the page on the math library, "man math".

TERMINOLOGY
     Floating point numbers are represented in three parts: a sign, a mantissa (or significand), and an exponent. Given such a representation with sign s, mantissa m, and exponent e, the corresponding numerical value is s*m*2**e.

     Floating-point types differ in the number of bits of accuracy in the mantissa (called the precision), and set of available exponents (the exponent range).

     Floating-point numbers with the maximum available exponent are reserved operands, denoting an infinity if the significand is precisely zero, and a Not-a-Number, or NaN, otherwise.

     Floating-point numbers with the minimum available exponent are zero if the significand is precisely zero, and denormal otherwise. Note that zero is signed: +0 and -0 are distinct floating point numbers.

     Floating-point numbers with exponents other than the maximum and minimum available are called normal numbers.

PROPERTIES OF IEEE-754 FLOATING-POINT
     Basic arithmetic operations in IEEE-754 floating-point are correctly rounded: this means that the result delivered is the same as the result that would be achieved by computing the exact real-number operation on the operands, then rounding the real-number result to a floating-point value.

     Overflow occurs when the value of the exact result is too large in magnitude to be represented in the floating-point type in which the computation is being performed; doing so would require an exponent outside of the exponent range of the type. By default, computations that result in overflow return a signed infinity.

     Underflow occurs when the value of the exact result is too small in magnitude to be represented as a normal number in the floating-point type in which the computation is being performed. By default, underflow is gradual, and produces a denormal number or a zero.

     All floating-point numbers of a given type are integer multiples of the smallest non-zero floating-point number of that type; however, the converse is not true. This means that, in the default mode, (x-y) = 0 only if x = y.

     The sign of zero transforms correctly through multiplication and division, and is preserved by addition of zeros with like signs, but x - x yields +0 for every finite floating-point number x. The only operations that reveal the sign of a zero are x/(+-0) and copysign(x,+-0). In particular, comparisons (x > y, x != y, etc) are not affected by the sign of zero.

     The sign of infinity transforms correctly through multiplication and division, and infinities are unaffected by addition or subtraction of any finite floating-point number. But Inf-Inf, Inf*0, and Inf/Inf are, like 0/0 or sqrt(-3), invalid operations that produce NaN.

     NaNs are the default results of invalid operations, and they propagate through subsequent arithmetic operations. If x is a NaN, then x != x is TRUE, and every other comparison predicate (x > y, x = y, x <= y, etc) evaluates to FALSE, regardless of the value of y. Additionally, predicates that entail an ordered comparison (rather than mere equality or inequality) signal Invalid Operation when one of the arguments is NaN.

     IEEE-754 provides five kinds of floating-point exceptions, listed below:

     Exception                 Default Result
     __________________________________________
     Invalid Operation         NaN or FALSE
     Overflow                  +-Infinity
     Divide by Zero            +-Infinity
     Underflow                 Gradual Underflow
     Inexact                   Rounded Value

     NOTE: An exception is not an error unless it is handled incorrectly. What makes a class of exceptions exceptional is that no single default response can be satisfactory in every instance. On the other hand, because a default response will serve most instances of the exception satisfactorily, simply aborting the computation cannot be justified.

     For each kind of floating-point exception, IEEE-754 provides a flag that is raised each time its exception is signaled, and remains raised until the program resets it. Programs may test, save, and restore the flags, or a subset thereof.

PRECISION AND EXPONENT RANGE OF SPECIFIC FLOATING-POINT TYPES
     On both OS X and iOS, the type float corresponds to IEEE-754 single precision. A single-precision number is represented in 32 bits, and has a precision of 24 significant bits, roughly like 7 significant decimal digits. 8 bits are used to encode the exponent, which gives an exponent range from -126 to 127, inclusive.

     The header <float.h> defines several useful constants for the float type:
     FLT_MANT_DIG - The number of binary digits in the significand of a float.
     FLT_MIN_EXP - One more than the smallest exponent available in the float type.
     FLT_MAX_EXP - One more than the largest exponent available in the float type.
     FLT_DIG - the precision in decimal digits of a float. A decimal value with this many digits, stored as a float, always yields the same value up to this many digits when converted back to decimal notation.
     FLT_MIN_10_EXP - the smallest n such that 10**n is a non-zero normal number as a float.
     FLT_MAX_10_EXP - the largest n such that 10**n is finite as a float.
     FLT_MIN - the smallest positive normal float.
     FLT_MAX - the largest finite float.
     FLT_EPSILON - the difference between 1.0 and the smallest float bigger than 1.0.

     On both OS X and iOS, the type double corresponds to IEEE-754 double precision. A double-precision number is represented in 64 bits, and has a precision of 53 significant bits, roughly like 16 significant decimal digits. 11 bits are used to encode the exponent, which gives an exponent range from -1022 to 1023, inclusive.

     The header <float.h> defines several useful constants for the double type:
     DBL_MANT_DIG - The number of binary digits in the significand of a double.
     DBL_MIN_EXP - One more than the smallest exponent available in the double type.
     DBL_MAX_EXP - One more than the largest exponent available in the double type.
     DBL_DIG - the precision in decimal digits of a double. A decimal value with this many digits, stored as a double, always yields the same value up to this many digits when converted back to decimal notation.
     DBL_MIN_10_EXP - the smallest n such that 10**n is a non-zero normal number as a double.
     DBL_MAX_10_EXP - the largest n such that 10**n is finite as a double.
     DBL_MIN - the smallest positive normal double.
     DBL_MAX - the largest finite double.
     DBL_EPSILON - the difference between 1.0 and the smallest double bigger than 1.0.

     On Intel Macs, the type long double corresponds to IEEE-754 double extended precision. A double extended number is represented in 80 bits, and has a precision of 64 significant bits, roughly like 19 significant decimal digits. 15 bits are used to encode the exponent, which gives an exponent range from -16382 to 16383, inclusive.

     The header <float.h> defines several useful constants for the long double type:
     LDBL_MANT_DIG - The number of binary digits in the significand of a long double.
     LDBL_MIN_EXP - One more than the smallest exponent available in the long double type.
     LDBL_MAX_EXP - One more than the largest exponent available in the long double type.
     LDBL_DIG - the precision in decimal digits of a long double. A decimal value with this many digits, stored as a long double, always yields the same value up to this many digits when converted back to decimal notation.
     LDBL_MIN_10_EXP - the smallest n such that 10**n is a non-zero normal number as a long double.
     LDBL_MAX_10_EXP - the largest n such that 10**n is finite as a long double.
     LDBL_MIN - the smallest positive normal long double.
     LDBL_MAX - the largest finite long double.
     LDBL_EPSILON - the difference between 1.0 and the smallest long double bigger than 1.0.

     On ARM iOS devices, the type long double corresponds to IEEE-754 double precision. Thus, the values of the LDBL_* macros are identical to those of the corresponding DBL_* macros.

SEE ALSO
     math(3), complex(3)

STANDARDS
     Floating-point arithmetic conforms to the ISO/IEC 9899:2011 standard.

BSD                             March 28, 2007                             BSD
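
The <float.h> constants described in the manual page can be checked on whatever machine you are using. The sketch below is only an illustration under the assumption of binary (FLT_RADIX == 2) floating point; the function names are hypothetical. It recomputes the two epsilon values from their definition ("the difference between 1.0 and the smallest value bigger than 1.0") using nextafter() and nextafterf(), and prints a few of the limits:

```c
#include <assert.h>
#include <float.h>
#include <math.h>
#include <stdio.h>

/* Difference between 1.0 and the next larger double; this should
   match DBL_EPSILON. */
double double_epsilon_by_nextafter(void)
{
    return nextafter(1.0, 2.0) - 1.0;
}

/* Same idea for float; this should match FLT_EPSILON. */
float float_epsilon_by_nextafter(void)
{
    return nextafterf(1.0f, 2.0f) - 1.0f;
}

/* Print some of the limits for float, double, and long double. */
void print_limits(void)
{
    assert(FLT_RADIX == 2); /* the man page assumes binary floating point */

    printf("float:       %d bits, exponents %d..%d, %d decimal digits\n",
           FLT_MANT_DIG, FLT_MIN_EXP - 1, FLT_MAX_EXP - 1, FLT_DIG);
    printf("double:      %d bits, exponents %d..%d, %d decimal digits\n",
           DBL_MANT_DIG, DBL_MIN_EXP - 1, DBL_MAX_EXP - 1, DBL_DIG);
    printf("long double: %d bits, exponents %d..%d, %d decimal digits\n",
           LDBL_MANT_DIG, LDBL_MIN_EXP - 1, LDBL_MAX_EXP - 1, LDBL_DIG);
    printf("FLT_EPSILON = %g, DBL_EPSILON = %g\n",
           (double)FLT_EPSILON, DBL_EPSILON);
}
```

The long double line is the one most likely to differ between machines, since that type is 80-bit extended precision on some platforms and plain double precision on others.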