... which did a good job of confusing me to your point. I still don't understand what your benchmarks are supposed to prove. Luckily it doesn't matter.
Re-reading my post (#11), I see how it can be misunderstood. I did not intend for the first paragraph to have anything to do with the rest of the post. I should have indicated that clearly (either with language or formatting) or I should have made it a separate post.
When the second paragraph begins, "Even so, memset() is largely irrelevant in this scenario," the scenario I'm referring to has nothing to do with the memset "benchmark" in the immediately preceding, opening paragraph. I was referring instead to what had been the topic of the thread at that point, a singularly large allocation, and how it's handled by calloc() in today's open source systems. (I'm curious if the proprietary unices behave similarly. I assume so, but I have no specific information.)
The memset benchmark isn't intended to prove anything except that memset-ing 2 GB takes on the order of a fraction of a second rather than a minute or an hour. Nothing more. As far as benchmarks go, it wasn't a particularly ambitious one.
Thanks for the explanation, and sorry for being dense in my initial reading of it.
Quote:
Originally Posted by alister
The memset benchmark isn't intended to prove anything except that memset-ing 2 GB takes on the order of a fraction of a second rather than a minute or an hour. Nothing more.
Memsetting one gig of RAM takes 1.4 seconds for me. Imagine the amount of actual work that computer could've done in that time instead. Congratulations on your fast computer though.
Yeah man, honestly I don't know what he's talking about with his benchmark. I just ran my own really quick test comparing calloc against the malloc -> memset sequence and they are darn near identical. Which makes sense, because calloc has to bring in pages, and once you touch a malloc'd page it has to be pulled in too. Worse, my laptop only has a gig of RAM, so deity forbid I attempt to calloc or malloc and memset a gig, I'll start swapping! But, unsurprisingly, I can call malloc for a gig of RAM and it'll return immediately. So long as I don't touch the pages, it'll never slow down.
Point is, calloc is different than malloc and shouldn't be used unless you need all your memory zero'd. And really, what application does? Most malloc's would be followed by something "useful" like a memcpy or filling in the malloc'd memory with useful data. Further, you most certainly wouldn't use calloc for a sparse array, that'd just be crazy.
P.S. here's uname -a
edit: I just ran it on a work machine, malloc followed by memset for 1 GB and calloc for 1 GB were also identical and about 2 seconds. This is on a P570 frame with 12 GB of memory and 2 CPUs allocated. So...I'd love to know what computer does it in "fractions of a second".
edit2: I suppose I also stirred this up by saying "I'll wait" as if to imply it'd be ages. But, in computer terms, 2 seconds is "ages". Plus, if you had to swap, it'd really be "I'll wait", because on my poor laptop with 1 GB of RAM, asking it to memset (or calloc) a gig started it swapping; my music in the background was starting to skip and the hard drive started to spin as memory was being paged to disk. It was BAD, lol. After I killed the process, it still took about 10 seconds for the poor thing to normalize, and my music player hung and wouldn't come back, so I had to kill it, lol. Fortunately, the same program without the memset (and just the malloc) ran and ended immediately, because it just pulled address space into the process, never a physical page, and so never did any actual work. Hence the BIG difference between malloc and calloc that started this whole off-topic thread of communication.
Last edited by DreamWarrior; 10-20-2011 at 09:26 PM..
Quote:
Point is, calloc is different than malloc and shouldn't be used unless you need all your memory zero'd.
alister explained why this is irrelevant for large amounts of memory: 1) it doesn't bother, because 2) it maps it in with mmap instead, meaning 3) the kernel does it for you at the time of paging in and not before.
Quote:
alister explained why this is irrelevant for large amounts of memory: 1) it doesn't bother, because 2) it maps it in with mmap instead, meaning 3) the kernel does it for you at the time of paging in and not before.
Seems that's not true for both the kernels I tested with. They both took a performance hit identical to malloc+memset (which means they both bring in pages). In fact, I'd bet the actual zero'ing itself is not the problem, it's the creating of physical pages that is. Either way, on every machine (three thus far) where I've run a quick calloc(1gb, 1) or malloc(1gb) -> memset(p, 0, 1gb) comparison, they are indistinguishable so far as performance is concerned. Both heartily lose out to a plain malloc(1gb), which is instantaneous.
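The comparison described above can be reproduced with something like the following. This is only a minimal sketch, not the program that produced the timings in this thread; `now_seconds`, `time_allocation`, and the `alloc_mode` names are made up for illustration. It assumes a POSIX system with `clock_gettime`, and it should be built with optimization off (for example `-O0`), since an optimizing compiler may fold malloc + memset into calloc or drop an unused allocation altogether.

```c
#define _POSIX_C_SOURCE 200809L /* for clock_gettime under strict -std modes */
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Illustrative names, not taken from any program posted in this thread. */
enum alloc_mode { MODE_MALLOC, MODE_MALLOC_MEMSET, MODE_CALLOC };

/* Seconds on the monotonic clock, as a double. */
double now_seconds(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/*
 * Allocate `size` bytes using the given strategy and return the time the
 * allocation (plus memset, if any) took, in seconds. The free() is not
 * included in the timing. Returns -1.0 if the allocation failed.
 */
double time_allocation(enum alloc_mode mode, size_t size)
{
    double start = now_seconds();
    char *p;

    if (mode == MODE_CALLOC) {
        p = calloc(size, 1);
    } else {
        p = malloc(size);
        if (p != NULL && mode == MODE_MALLOC_MEMSET)
            memset(p, 0, size); /* zero, not the character '0' */
    }

    double elapsed = now_seconds() - start;

    if (p == NULL)
        return -1.0;
    free(p);
    return elapsed;
}
```

Calling `time_allocation` with each of the three modes and a size such as `(size_t)1 << 30` gives the three figures being argued about here. Whether calloc lands closer to plain malloc or to malloc + memset depends on the C library and kernel, which is exactly the point of disagreement.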
Regardless...I'll stick to my guns, calloc is pointless; use malloc and initialize the memory in-situ as appropriate afterwards.
No. It does not. It may, but nothing requires it. Small allocations may be handled by already resident pages. Large allocations are mmap'd and since those pages will be zeroed by the kernel, calloc doesn't need to touch them. None of the callocs in any of the standard C libraries used by the popular open source unix flavors (I looked at Linux/glibc, FreeBSD, NetBSD, and OpenBSD) will call memset to zero a page which will already be zeroed by the kernel before being made available to the process.
The most obvious explanation for why your system shows no difference between malloc+memset and calloc is that your C library's calloc is naive. Or perhaps your code is flawed. Or perhaps your kernel VM subsystem is prefaulting for some reason. Or perhaps your system's environment has enabled malloc/calloc options which affect their behavior (such as filling the allocation with "junk" or zeroes). Perhaps one of the bazillion Linux kernel compile options is to blame. If it were my system, I'd look into it just to satisfy my curiosity.
Quote:
Point is, calloc is different than malloc and shouldn't be used unless you need all your memory zero'd.
Obviously. My point is only that under certain conditions malloc and calloc are practically identical (both will return zeroed memory without calling memset). See for yourself in the malloc.c source links I provided in an earlier post. You'll find that both are implemented using the same internal routines. Further, if you follow the code path for a large allocation, you'll see that a calloc never memsets (unless certain options which are disabled by default are enabled).
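To see the kernel side of this in isolation, here is a small sketch, assuming a Linux or BSD system where `MAP_ANONYMOUS` is available (some older systems spell it `MAP_ANON`). The function name `anonymous_mapping_is_zeroed` is hypothetical. It maps fresh anonymous memory, much as an allocator would for a large request, and reads it back without ever writing to it.

```c
#define _DEFAULT_SOURCE /* exposes MAP_ANONYMOUS on glibc under strict -std modes */
#include <stddef.h>
#include <sys/mman.h>

/*
 * Map `size` bytes of anonymous memory and check that every byte reads
 * as zero, without writing to the mapping first.
 * Returns 1 if all bytes are zero, 0 if a nonzero byte was found,
 * and -1 if the mapping could not be created.
 */
int anonymous_mapping_is_zeroed(size_t size)
{
    unsigned char *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    int all_zero = 1;
    for (size_t i = 0; i < size; i++) {
        if (p[i] != 0) {
            all_zero = 0;
            break;
        }
    }

    munmap(p, size);
    return all_zero;
}
```

On such systems this should return 1 for any size that can be mapped, which is the property that lets a calloc skip the memset for freshly mapped memory. It says nothing about how a particular C library's calloc is actually implemented; for that, the malloc.c sources are the place to look.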
Quote:
I just ran it on a work machine, malloc followed by memset for 1 GB and calloc for 1 GB were also identical and about 2 seconds. This is on a P570 frame with 12 GB of memory and 2 CPUs allocated. So...I'd love to know what computer does it in "fractions of a second".
I must retract my earlier quarter of a second figure. I cannot reproduce it. I must have misread the value. Perhaps it was 2.50s instead of 0.25s.
Here's some code and timings from OS X running on a 2.16 GHz Core2Duo Macbook with 2 GB of 667 MHz DDR2 (similar results were observed using NetBSD on similar hardware). Without any command line arguments, the executable will attempt to calloc 1 GiB. With command line arguments, it will malloc and memset 1 GiB:
Quote:
I'll stick to my guns, calloc is pointless; use malloc and initialize the memory in-situ as appropriate afterwards.
That's your prerogative, but, for a large allocation with a reasonably recent C library, you're choosing to use memset to zero malloc'd memory that is probably already zeroed, instead of using calloc, which knows whether the memory is already zeroed and can avoid the overhead of a redundant memset.
So long as they're not aimed at my code, use your guns as you see fit.
Quote:
Originally Posted by alister
No. It does not. It may, but nothing requires it. Small allocations may be handled by already resident pages. Large allocations are mmap'd and since those pages will be zeroed by the kernel, calloc doesn't need to touch them. None of the callocs in any of the standard C libraries used by the popular open source unix flavors (I looked at Linux/glibc, FreeBSD, NetBSD, and OpenBSD) will call memset to zero a page which will already be zeroed by the kernel before being made available to the process.
And that's fine. It still seems to, on my systems, require the pages to be backed immediately and that's time consuming. It certainly still performs very similarly to malloc + memset. In fact, allocating 700 MB (all my poor laptop can handle without swapping) it is only 40ms faster than malloc + memset. While this is an eternity in computer time, given the total time for the calloc is about 700 ms, that meager 40 ms savings tells me that the bulk of the time is spent backing the pages, a job which both the first memset after a malloc and, apparently on my system, calloc need to do.
Quote:
Originally Posted by alister
The most obvious explanation for why your system shows no difference between malloc+memset and calloc is that your C library's calloc is naive. Or perhaps your code is flawed. Or perhaps your kernel VM subsystem is prefaulting for some reason. Or perhaps your system's environment has enabled malloc/calloc options which affect their behavior (such as filling the allocation with "junk" or zeroes). Perhaps one of the bazillion Linux kernel compile options is to blame. If it were my system, I'd look into it just to satisfy my curiosity.
I'm sure there are many reasons, but at work I'm not the sys admin, so I don't configure the systems. I just code effectively for the systems as configured. Further, my personal system is a stock Ubuntu system, an arguably popular *nix choice. So, any code I wrote for that would fall victim to calloc's performance.
My point is that, to some extent, you can code either for the system you're running on or for an expected worst case. In my case, I happen to be running on the worst case. That means you may want to consider turning away from calloc.
Quote:
Originally Posted by alister
Obviously. My point is only that under certain conditions malloc and calloc are practically identical (both will return zeroed memory without calling memset). See for yourself in the malloc.c source links I provided in an earlier post. You'll find that both are implemented using the same internal routines. Further, if you follow the code path for a large allocation, you'll see that a calloc never memsets (unless certain options which are disabled by default are enabled).
Splendid, but you're 0 for 3 on the systems I have available to me. Those are two AIX machines at different O/S levels (5.3 and 6) and Ubuntu, for which I provided the uname output earlier; here's the libc version:
While I know it's not your job, nor am I asking you, to figure out why these systems don't perform as you say, I'm simply supplying information to show you that it's not like I'm running some obscure setup. If I wrote code that used calloc for large allocations on any of these I'd be doing myself a disservice over regular malloc, if I didn't need zero'd memory. Time and time again I've been making that point -- is it falling on deaf ears?
Quote:
Originally Posted by alister
I must retract my earlier quarter of a second figure. I cannot reproduce it. I must have misread the value. Perhaps it was 2.50s instead of 0.25s.
Here's some code and timings from OS X running on a 2.16 GHz Core2Duo Macbook with 2 GB of 667 MHz DDR2 (similar results were observed using NetBSD on similar hardware). Without any command line arguments, the executable will attempt to calloc 1 GiB. With command line arguments, it will malloc and memset 1 GiB:
Ok, well here's some code I wrote:
So, let's see, the first calloc took FOREVER (but, for the record, the first malloc+memset would have too for the O/S to put some pages together). Subsequent calloc and malloc+memset perform within 40-50ms as I alternate testing them.
How about a straight malloc? Although it's represented and timed above, here's the program doing just that:
Better:
I don't think anyone's surprised though, right?
Now, let's see after we have all those pages put together, backed, and ready to go; what's a "memset" really take:
Humm...so, about a third the time of the original allocation. Surprised? Well, I mean, it still sucks, so I certainly wouldn't go ahead and advocate calloc followed by a memset for 0, that'd just be dumb...oh wait, I never did!
Quote:
Originally Posted by alister
That's your prerogative, but, for a large allocation with a reasonably recent C library, you're choosing to use memset to zero malloc'd memory that is probably already zeroed, instead of using calloc, which knows whether the memory is already zeroed and can avoid the overhead of a redundant memset.
So long as they're not aimed at my code, use your guns as you see fit.
Regards,
Alister
HAHA, my friend, I believe I can code just fine. I know what I'm talking about, and I believe you do as well.
At the end of the day, I'm the one developing for my systems whose integrity I have to maintain. A 2 second calloc call would be disastrous in my environment. I can spread that over the duration of the code's runtime by calling malloc and gracefully filling in the memory as needed with useful values (which are almost never all zeros) and allow the O/S to more gracefully back the pages as each one is touched rather than immediately (which appears to be what calloc is doing on my systems).
Furthermore, if I haven't already driven home my point that calloc isn't good, we'll run the final test I've created in the suite. It simulates touching only part of each allocated page, something akin to what a sparse array may do. I'm mallocing some memory and then only using some of it. The key is to do it on a machine that is overwhelmed; in my case, that means allocating a gig when that's all the RAM I have. Here are some "normal" results:
What you'll notice here is that the entire program time is consistently longer for calloc than malloc. If you figure that I'm using the system page size (and I am) I should be touching each page once, forcing each page to be physically created.
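For reference, the "touch each page once" pattern looks roughly like this. It is a sketch rather than the test program used for the timings here; `touch_each_page` is an illustrative name, and the 4096-byte fallback is an assumption for the case where `sysconf` cannot report the page size.

```c
#include <stddef.h>
#include <unistd.h>

/*
 * Write a single byte into each page-sized stride of `buf`, which is
 * enough to make the kernel back that page with physical memory.
 * Returns the number of pages touched.
 */
size_t touch_each_page(char *buf, size_t size)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        page = 4096; /* assumed fallback, not a value from this thread */

    size_t touched = 0;
    for (size_t off = 0; off < size; off += (size_t)page) {
        buf[off] = 1; /* one write per page; the rest stays untouched */
        touched++;
    }
    return touched;
}
```

Run over a malloc'd buffer, it makes the kernel back each page on first touch. Running it over a calloc'd buffer of the same size shows whether calloc has already paid that cost up front on a given system.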
So, now let's thrash my poor laptop to death. It only has a gig, so this will do it, lots of swapping, and disk work....
Now...wait for it.... I'll run the same code with the only difference being calloc was called to allocate the memory (you can see the code, right):
Hummm...204 seconds, 10 times the malloc version! WHAT?! Now, what say you? Oh, and not to mention the fact that the calloc itself took about as long as the entire longest program run above!
Still convinced? Maybe your machines are different, I'm sure they are, but mine (and probably the rest of the Linux world running stock Ubuntu) work badly with calloc.
Oh, and the one that should arguably do the absolute worst, malloc + memset + the subsequent reset:
Not even 2 seconds behind, and there's about that variation run to run with this stuff. Yep, calloc...certainly worth it on my box.