Mawk printf %d maxes out at 2147483647

01-10-2014

So, I do some file processing that generates very large numbers, such as total amount GETted from a busy web cluster in a month, etc. Mawk is awesome-- fast and easy. It's awk! But, there's a fatal flaw that I'd like to overcome. Apparently, %d maxes out at 2147483647. Here's sample output, with the first line sprintf'd and the second just print'd:

total_size, total_count, average:       2147483647 50586 2493242
total_size, total_count, average:       1.26123e+11 50586 2.49324e+06

Is there any way I'm not thinking of to achieve the same result as %d? I'm using the very latest, bright and shiny mawk:

# mawk -W version
mawk 1.3.4 20131226

There are bug reports out there about this (google "mawk 2147483647") but no solutions so far. Many thanks in advance, and please pardon me if I've overlooked some solution. I was a bit fatigued when I went looking.
01-10-2014
Try using "gawk".
01-10-2014
Originally Posted by bartus11
Try using "gawk".
That is a solution, of course, but I'm using mawk because of the size of the datasets we're working with. Depending on the task, we have anywhere from 3 to 7 times better performance. So, it's been an ongoing project to convert those tasks that we can to mawk. I'd like to do so for this one as well, as it is a very time-consuming job.
01-10-2014
Try this then:
printf "%.0f\n", 2147483648
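
For a totals line like the one in the first post, the same idea might look like the sketch below. The function name sum_bytes and the sample input values are made up for illustration; the only assumption about the data is that the byte count is in the first field. The count stays on %d because it is small, while the total and the average go through %.0f (which rounds the average to the nearest whole number).

```shell
# Hypothetical helper: sum the first field of stdin and print
# "total count average" as plain integers, without using %d for large values.
sum_bytes() {
    awk '{ total += $1; count++ }
         END { if (count) printf "%.0f %d %.0f\n", total, count, total / count }'
}

# Made-up sample input: two byte counts well above 2147483647.
printf '%s\n' 63061500000 63061500000 | sum_bytes
# prints: 126123000000 2 63061500000
```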

01-10-2014
Originally Posted by bartus11
Try this then:
printf "%.0f\n", 2147483648

Ah, of course. That works. :-) Don't know why I assumed that would also be broken. Many thanks.
01-10-2014
I've dug into mawk's code a bit and switching it to a 64-bit integer isn't quite as easy as it seems. It's a sticky problem, because of the mutability of numbers in awk. The developers are quite careful to pair a 32-bit int with a 64-bit double, since every 32-bit integer can be represented exactly by a 64-bit float. But what happens when your int is 64-bit? Not all 64-bit integers can be represented exactly in the 53 bits of precision that a 64-bit float provides.
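
The 53-bit limit is easy to see from awk itself. This is only a sketch, assuming the awk in use stores numbers as IEEE 754 doubles (the usual case); the function name show_double_limit is made up. It also means the %.0f workaround is exact only up to 2^53, which is still far beyond what %d was giving.

```shell
# Hypothetical demo: integers above 2^53 are no longer all representable
# when numbers are stored as IEEE 754 doubles.
show_double_limit() {
    awk 'BEGIN {
        big = 2^53                   # 9007199254740992, the last point where every integer fits
        printf "%.0f\n", big         # exact
        printf "%.0f\n", big + 1     # not representable, rounds back down to 2^53
        printf "%.0f\n", big + 2     # representable again, only even values fit here
    }'
}

show_double_limit
# prints:
# 9007199254740992
# 9007199254740992
# 9007199254740994
```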

It also passes its printf options through to the system printf almost completely faithfully, except for a weird case they added in 1995 for a system that only had 16-bit ints. I suspect another such weird case would be needed for 64 bits.

Last edited by Corona688; 01-10-2014 at 07:49 PM..
01-10-2014
In fact I'd go so far as to say... The mawk developers might do better, in readability and performance, to write their own printf. Cooperating with every awkward printf of the last 20 years has made it very strange internally.
