Memory release latency issue

# 1  
Old 09-04-2014
Memory release latency issue

I have an application that routinely malloc()s and realloc()s gigabyte-sized blocks of memory for image processing; specifically, rotating huge images, or creating/deleting huge image buffers that hold multiple images. Immediately upon completing an operation I call free() to release the memory.

I've noticed dramatic performance disparities depending on the sequence in which operations are performed. The first call to a function completes quickly, but subsequent calls can take up to 5x as long; exact same code. All terminate normally; the issue is performance, or lack of it.

It appears that after I free() a block of memory, the system, for reasons unknown to me, does not make that resource immediately available again for some indeterminate period. I free the memory, but the system behaves as if it is still in use. There is no logic error in the freeing; the only path back out of the function is through the free() call.

I'm a coder, not a systems expert. Any ideas out there? What is going on? Language is C/C++.

Many thanks in advance.

Imagtek
imagtek.com


The system is CentOS/64 bit, release 2.6.32-358.14.1.el6.x86_64

# 2  
Old 09-04-2014
Likely it's the allocator: free() is handled by the C library, not directly by the kernel, and releasing huge blocks back to the OS is a complex task that may block new allocations.
You can try an OS upgrade, with a higher kernel version.
Or try to optimize your code: call free() less often.
# 3  
Old 09-04-2014
In short, malloc() is the wrong tool for throwing around entire gigabytes of memory at once. You should cut out the middleman and use mmap().

The first time you request an entire gigabyte of memory, malloc() probably has to call brk() (the system call, related to mmap(), that extends the heap segment). This adds a vast new region of unused memory to the heap, memory the kernel guarantees contains nothing but zero bytes. It gets handed straight to you with no extra cleaning needed.

Then you free() it and malloc() it again. Because the memory has been used, that entire gigabyte has to be zeroed again before it can be handed back "clean".

By using mmap() instead, you let the OS zero pages as they are touched, rather than in one gigabyte-sized write. mmap() also has other useful features, like file backing: if all you're doing is dumping 5 GB of a file into memory, mmap() can save you a ton of trouble, time, and RAM.

Or, going the other direction, you could just keep reusing the same block of memory all the time without free()ing it.

# 4  
Old 09-04-2014
That's because free() does nothing other than make the memory you freed available again for your next malloc() call.

Why are you using malloc() and free() over and over, anyway? Just malloc() (or mmap()) a few chunks that you know will be big enough and use the same ones over and over.
# 5  
Old 09-04-2014
Quote:
Originally Posted by imagtek
It appears that after I free() a block of memory, the system, for reasons unknown to me, does not make that resource immediately available again for some indeterminate period. I free the memory, but the system behaves as if it is still in use. There is no logic error in the freeing; the only path back out of the function is through the free() call.
Perhaps I misunderstood you before. So the problem isn't the speed of free() itself, but what happens to the memory afterwards?

It's like achenle says: it is still in use. malloc() assumes that if you've allocated memory once, you're going to allocate it again, and keeps it in its pool for later. If you want control over exactly when it's released to the OS, you need mmap().
# 6  
Old 09-04-2014
Plus, if you repeatedly call malloc()/free() for very large chunks of varying size, malloc() will gladly fragment the heap to the point where it becomes less efficient. This is due in part to the fact that some OS flavors may reclaim memory after a free() call, especially if other processes are asking for memory chunks. NUMA also plays into big-chunk operations.

Several years ago we ran a test on a non-production Solaris 10 box with 64 GB of memory. We malloc()ed one single giant chunk, never called malloc() again, and reused the chunk over and over with varying-sized buffers. When we added the malloc()/free() calls back in between every operation on a "new" chunk, the same test code ran about 15% slower and spent most of that extra time in kernel mode.

NUMA really slows down access to large memory allocations because of locality issues: the system cannot relocate gigantic memory chunks to more convenient locations. Since you have a commodity CPU (multicore x86), NUMA is a concern.
You need to look into CPU affinity for threads.

If you are reading from and then writing to widely separated memory regions, you need to be aware of the order in which you access neighboring memory, rather than doing something like copying the contents of arr[0] to arr[2000000], then reading in arr[1000000]. Each one of those example actions can mean reloading the L2 cache. As it is nowadays, memory is about an order of magnitude or more slower than your CPUs.

Edit: You really should consider this article (Ulrich Drepper's "What Every Programmer Should Know About Memory"):

http://www.akkadia.org/drepper/cpumemory.pdf

It is somewhat old, but still completely applicable.

# 7  
Old 09-05-2014
Thanks all for the very informative replies. Memory allocation at the system level is more complex than I thought. I'll dig into the mmap() possibility. Part of my design-for-performance strategy when working with huge images is to code low-level and as close to the system as possible, so it looks like there's more work to do there. As I said, the first time through, these algorithms fly; then it's like they get stuck in the mud. Sometimes simply painting the screen hangs for seconds at a time, always immediately after using/freeing massive blocks of memory.

I'll play around with some of these ideas and let you know what I find. I'm pushing my old 8 GB machine to its limits, maybe a bit past them, but that's what it's for.

Thanks again for the valuable information.
imagtek