Questions about mmap and shm_open
# 1  
Old 12-02-2009

Hello everyone! I have some questions about mmap and shm_open, and thanks in advance for helping!

I'm involved in a project in which several independent processes need to access a huge file (about 3.5 GB) in a random pattern. The access frequency is high, so fseek and fread might be a bad choice for performance reasons. My requirements are: 1) All file data should be in memory before any process accesses it. 2) Also, since the file is immutable, there should ideally be only one copy of the data in memory at all times.

I can come up with two different approaches. In Approach One, every process mmaps the file with the MAP_SHARED flag. Since mmap turns the file data into a memory mapping, file accesses can be treated as memory accesses. However, the first time a specific location of the mapping is read, the kernel still has to read the file from disk, which causes a performance hit.

In Approach Two, a "host" process starts before all the others, creates a shared memory object using shm_open, and copies the file data into it. Subsequent processes mmap the shm object with MAP_SHARED. Since all the data is already in memory, this approach should have better performance and scalability.

I did some comparisons between these two plus direct file operations (call it Approach Three, without the O_SYNC flag), and the results confuse me a lot (sorry for omitting the result data; I want to keep the thread easy to read). Note that the Approach Two result does not include the object creation time.

For sequential reads, Approach Three is usually the fastest and Approach Two the slowest. Although the differences among the three are tiny, it shows that access through mmap + shm has some overhead.

For random reads (192 bytes each), Approach Two is often the best, and Approach Three is very bad, as expected. However, Approach One is never far behind, and is sometimes even better than Approach Two.

Now my questions:
1) Are my experiments enough to show that shm_open has high overhead, which cancels out its advantage of copying from memory rather than disk?

2) If 1) is true, is there ever a reason to use a shared memory object, since it requires initialization before use while direct mmap does not? What's the benefit of shm?

3) Obviously, plain file reading does not require a large buffer (this is application specific), which gives it some scalability. Do mmap and mmap + shm_open use a lot of memory? If process A mmaps the file and reads some bytes in the mapped area, does the kernel copy those bytes into process A's private memory space? Does each process reserve some amount of memory for caching?

4) If the answer to 3) is no, what is the cost that mmap pays in exchange for much better performance? AFAIK, there is no free lunch.

5) If the answer to 3) is yes, then each process still maintains a private copy of its own, which threatens scalability, and the shared memory object is not really a singleton. Is there a true singleton implementation in Linux? I know that in Windows the ReadProcessMemory and WriteProcessMemory APIs allow unrelated processes to read each other's memory space.

Please correct me if I'm wrong. I really want to know how to implement this properly. Thanks!
# 2  
Old 12-03-2009
How much RAM do you have? If you don't have enough physical RAM, even if you copy the data into a shared memory buffer it will just get paged out anyway. So make sure you have enough RAM. Depending on what you're doing, that probably means you need at least 8 GB. At least.

Another option is to use open()/pread() if you're reading 192-byte chunks. Using fseek()/fread() is going to be horrible because each pair of calls actually causes about 8 kB to be read (most likely) because of stdio's buffering. You can change the buffer size using (IIRC...) setbuf() and/or setvbuf() to better match your I/O pattern, though. If you have enough RAM on the box, the data in the file will tend to be cached anyway, so any steps you take to copy the data into memory won't really help much.

And do NOT use O_SYNC. That gets you nothing.

Another thing to consider is that mmap() will NOT actually put data into memory until you try to read it and the data is page-faulted in from disk. You can use mlock() to lock the mmap()'d file into memory, and to really see how fast you can go, you can read at least one byte from each mapped page right after you mmap() the file, forcing every page to be read in from disk before you start processing.

And FWIW, the best mmap() options are probably to open() the file in O_RDONLY mode, and use read-only permissions on the memory mapping itself, along with the MAP_SHARED option. Hopefully in that case each process will then be able to map the same physical RAM into its virtual address space and you'll only wind up with one copy of the file in physical RAM. You can observe the mmap() flags used when processes map shared object executable code to see how this works.
# 3  
Old 12-03-2009
Quote:
Originally Posted by CrendKing
I'm involved in a project in which several independent processes need to access a huge file (about 3.5 GB) in a random pattern. The access frequency is high, so fseek and fread might be a bad choice for performance reasons. My requirements are: 1) All file data should be in memory before any process accesses it. 2) Also, since the file is immutable, there should ideally be only one copy of the data in memory at all times.

I can come up with two different approaches. In Approach One, every process mmaps the file with the MAP_SHARED flag. Since mmap turns the file data into a memory mapping, file accesses can be treated as memory accesses. However, the first time a specific location of the mapping is read, the kernel still has to read the file from disk, which causes a performance hit.
There's no avoiding the fact that data needs to be read from disk sometime. If you know certain specific bits of it that will be accessed more frequently than the rest, you might streamline that a bit by locking those into memory in advance; you could also warn the kernel before you use it with posix_madvise if your system has it.
Quote:
Approach Two, start a "host" process before all other processes, which creates a shared memory objects using shm_open and copy the file data to it. Subsequent process mmap the shm with MAP_SHARED flag. Since all data are available in memory, this approach should have better performance and scalability.
I don't see how this would be an improvement. If you have enough memory, the kernel can cache the file in the first place without all this duplication -- triplication, in this case, since the kernel will be trying to cache it despite you creating your own copies.
Quote:
For sequential reads, Approach Three is usually the fastest and Approach Two the slowest. Although the differences among the three are tiny, it shows that access through mmap + shm has some overhead.
Indeed, since you're using double the memory to do more or less the same thing.
Quote:
1) Are my experiments enough to show that shm_open has high overhead, which cancels out its advantage of copying from memory rather than disk?
It's not that shm_open has high overhead, it's that you're doing more work for little to no gain -- copying from disk cache to memory instead of using cached disk through mmap.
Quote:
4) If the answer to 3) is no, what is the cost that mmap pays in exchange for much better performance? AFAIK, there is no free lunch.
mmap can only deal with memory in blocks of the page size -- that is to say, 4K on many systems. The minimum read you can do is therefore 4K. I think this limitation applies to the disk cache in general, though, so it may not be relevant.

Get enough memory and most of what you need will stay in disk cache.
# 4  
Old 12-03-2009
First of all, thank you for the answers!

Quote:
Originally Posted by achenle
How much RAM do you have?
16 GB.

Quote:
Originally Posted by achenle
And FWIW, the best mmap() options are probably to open() the file in O_RDONLY mode, and use read-only permissions on the memory mapping itself, along with the MAP_SHARED option. Hopefully in that case each process will then be able to map the same physical RAM into its virtual address space and you'll only wind up with one copy of the file in physical RAM. You can observe the mmap() flags used when processes map shared object executable code to see how this works.
This is exactly what I used. So you mean mmap makes this happen: multiple unrelated processes share the same physical memory regions. Is there any way I can measure this kind of sharing? Obviously I cannot tell from the decrease in free memory size. Each process seems to reserve some 191 megabytes for Data+Stack size.

Quote:
Originally Posted by Corona688
It's not that shm_open has high overhead, it's that you're doing more work for little to no gain -- copying from disk cache to memory instead of using cached disk through mmap.
For Approach One, data comes from the mapped memory region, which ultimately comes from disk or the disk cache. For Approach Two, data also comes from a mapped memory region, which ultimately comes from the shared memory object created by the "host" process. Therefore I think the only question is whether the disk/disk cache or the shared memory object is faster. My results say they are equal, which I did not expect.

-----

My conclusion is: if unrelated processes need to share existing files, always use direct mmap with MAP_SHARED. If unrelated processes need to share memory, say for IPC purposes, shm_open + mmap could have slightly better performance. Am I correct?
# 5  
Old 12-03-2009
Quote:
Originally Posted by CrendKing
Therefore I think the only question is whether the disk/disk cache or the shared memory object is faster. My results say they are equal, which I did not expect.
Well, why shouldn't they be equal? Ignoring the slight overhead of loading your memory segments in the first place, they're both RAM.
# 6  
Old 12-03-2009
I totally understand why disk CACHE = shared memory object. What I do not understand is why disk seek time + disk read time = shared memory object speed on the first access, since the workload is random reads. What does mmap do internally to get this significant gain?

Thanks.
# 7  
Old 12-05-2009
I have no idea what you're saying now, except that, assuming enough memory, the data gets read only once. If you want it all read in advance before the process begins operating on it, mmap() under Linux 2.6.x has the MAP_POPULATE flag.