Memory fragmentation in a Linux settop box


 
# 15  
Old 09-17-2014
This:
lowmem_reserve[]: 0 0

and this:
active_anon:71912kB inactive_anon:192kB active_file:217020kB

These two lines tell me the following. (I use ZFS as my example below because it's what I'm most familiar with, not because I think you have ZFS filesystems.)

anon is heap in use. active_file is file caching.

New anon allocations require contiguous blocks of a certain minimum size. But file caching ate it all, by chewing up small chunks all over the place. File cache is counted as 'available space', but actually making it available can take eons in computer time. Think of having to sort a scrambled deck of cards every time you want to make a play in your card game. Game play slows down. A lot.

Or.

Think of a half-full parking lot. You have 100 spaces in total. 50 are used. 50 are free. What is the largest block of free spaces? Knowing that 50 are free does not mean that 50 spaces all next to each other are available. The parking lot is owned by Mr Kernel, who thinks he has lots of room for more parking. But you, by some arbitrary dictate, require that the 40 more cars you want to park all sit in contiguous spaces.
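To see the "largest block of free spaces" on a real box, you can look at /proc/buddyinfo, where column n after the zone name is the number of free blocks of 2^n contiguous pages. The following is only a minimal sketch of reading it; the function names largest_free_order() and print_largest_free_blocks() are made up for this illustration, and it assumes the usual "Node N, zone NAME counts..." line layout.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Parse one /proc/buddyinfo line, e.g.
 *   "Node 0, zone   Normal   12   5   3   0   0   0"
 * Returns the highest order n that still has at least one free block
 * of 2^n pages, or -1 if there is none or the line is not recognized. */
int largest_free_order(const char *line)
{
    assert(line != NULL);

    const char *p = strstr(line, "zone");
    if (p == NULL)
        return -1;
    p += strlen("zone");

    /* skip whitespace, then the zone name itself (DMA, Normal, ...) */
    while (*p == ' ' || *p == '\t')
        p++;
    while (*p != '\0' && *p != ' ' && *p != '\t')
        p++;

    int best = -1;
    for (int order = 0; ; order++) {
        char *end;
        unsigned long count = strtoul(p, &end, 10);
        if (end == p)           /* no more numbers on this line */
            break;
        if (count > 0)
            best = order;
        p = end;
    }
    return best;
}

/* Print every zone line followed by its largest free contiguous block. */
int print_largest_free_blocks(void)
{
    FILE *f = fopen("/proc/buddyinfo", "r");
    if (f == NULL)
        return -1;

    long page_kb = sysconf(_SC_PAGESIZE) / 1024;
    char line[512];
    while (fgets(line, sizeof line, f) != NULL) {
        int order = largest_free_order(line);
        fputs(line, stdout);
        if (order >= 0)
            printf("  -> largest free block: order %d (%ld kB contiguous)\n",
                   order, page_kb << order);
        else
            printf("  -> no free blocks listed\n");
    }
    fclose(f);
    return 0;
}
```

Calling print_largest_free_blocks() before and after dropping the cache should show whether higher-order blocks come back. A zone can have plenty of free memory in total and still report nothing above the lowest orders, which is the half-full parking lot with no 40 adjacent spaces.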

A lot of applications require allocating contiguous pages. But there can be small pieces between them. What happens when the process that has the small chunk goes away? You get a memory fragment. So stingy memory programming in one app and major land grabs of memory in another often lead to memory fragmentation.

As achenle mentioned, some OS implementations of ZFS do the stingy thing for caching, while Oracle does the county-wide land grab memory hog thing. The two do not play well together. Counterintuitively, a large amount of RAM is more likely to make fragmentation worse, because caching "thinks" the whole world is open and pollutes larger areas with small chunks. Cleanup becomes more and more costly time-wise.

In your case it seems that the player is directly reading huge files (in large chunks) into RAM. Then file caching may be polluting free space by littering the landscape with small file cache chunks.

Last edited by jim mcnamara; 09-17-2014 at 10:03 AM..
# 16  
Old 09-17-2014
Thank you, that sounds very reasonable.

I would think that the main process for playing back a file, or the timeshift function, are caching video data in large chunks as you describe. So if these processes were running alone, memory could probably be freed up quickly when needed, right?

Now my problem is identifying which process(es) are allocating the "small chunks" that cause the problem. Any idea? Is there maybe some tool for displaying which memory chunk has been allocated by which process?
# 17  
Old 09-17-2014
That depends on how you're doing your IO operations. How are your apps coded to read and/or stream your files? Do you have control over your application IO? What filesystem(s) are you using for data?

Assuming you're streaming video files without much random searching, you should be using direct IO and bypassing the cache since it's extremely unlikely that the proper file data will be cached when you do any searching.

Direct IO will be faster and it won't fragment memory because you won't be using the page cache.

Caching of file data only helps when data can be held in memory long enough to allow multiple reads of the same data, or when write operations are small and/or slow enough to be effectively coalesced into a smaller number of write operations. Streaming or copying large files fits neither of those criteria.
# 18  
Old 09-17-2014
I agree that caching a video stream does not make sense during pure playback (as long as there is a buffer for "reading ahead", guaranteeing a continuous data stream if IO is briefly interrupted), however there could be situations like rewinding or jumping backwards where the cache may be useful, and I guess it has always been like this in enigma2 and may be difficult to change for practical or political reasons.

The filesystem is ext4. Your questions about how the apps are coded are justified of course, but there are so many system tasks and plugins active that analyzing each of them is unfortunately not practicable. So I was looking for a way of dumping information on which process has allocated which memory blocks, hoping to identify one worst process that could be rewritten to use larger segments for lower fragmentation.
# 19  
Old 09-17-2014
Enigma2? Isn't that written all in Python?

You're trying to run a service that pushes the limits of the hardware you're running on that's written in a scripting language?

You can't run hardware at its design limits and ignore the design. Script languages ignore underlying details.
# 20  
Old 09-18-2014
The GUI of enigma2 as well as the plugins ('apps') are written in Python, but all the timing critical core stuff including video playback and timeshift are written in C++ so that, running on new chips, the CPU load is usually extremely low during such basic tasks, even when recording multiple services in parallel.

But you may be right that even a single Python script running on top of this may be allocating memory in small chunks, causing the problems.
# 21  
Old 09-18-2014
Do you know where the source code for those portions can be found? I've looked through the Enigma 2 source and it's pretty much all Python - as expected. Where is the source code for the core C++ elements?

I'm interested in seeing what it's doing.

Because I don't think the Python code would fragment physical memory like that unless you were swapping, which I don't think you are. What I strongly suspect is happening is akin to the Oracle/ZFS ARC wars on Solaris I described earlier in this thread, where Oracle would request large pages, release them, the ZFS ARC would fragment the pages...

In your case, I suspect the C++ core is probably mmap()ing anonymous memory, requesting large pages, and using them for video processing buffers. And, unfortunately, then releasing them, allowing the page cache to fragment the large pages. Eventually you get to the point where there are no large pages available because of fragmentation.

Unfortunately, Linux currently doesn't provide any direct way to limit page cache size, so all you can do is try to limit its growth, either by not using it (preferred, since you know your IO pattern) or through kernel parameter tuning.

You can try upping the value of vm.vfs_cache_pressure and see if that helps your issues.
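As a rough sketch of what "upping the value" looks like from a program: the setting lives in /proc/sys/vm/vfs_cache_pressure and defaults to 100. The helper names below are invented for this example, the value 200 in the usage note is an arbitrary illustration rather than a recommendation, and writing to the real file needs root. Also note that, as far as the kernel documentation describes it, this knob biases reclaim of the directory and inode caches, so any effect on the page cache itself is indirect and worth measuring rather than assuming.

```c
#include <assert.h>
#include <stdio.h>

#define VFS_CACHE_PRESSURE_PATH "/proc/sys/vm/vfs_cache_pressure"

/* Read a single integer setting from a /proc/sys style file.
 * Returns 0 on success, -1 on failure. */
int read_int_setting(const char *path, int *value)
{
    assert(path != NULL && value != NULL);

    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;
    int ok = (fscanf(f, "%d", value) == 1);
    fclose(f);
    return ok ? 0 : -1;
}

/* Write a single integer setting to a /proc/sys style file.
 * Returns 0 on success, -1 on failure (e.g. not running as root). */
int write_int_setting(const char *path, int value)
{
    assert(path != NULL);

    FILE *f = fopen(path, "w");
    if (f == NULL)
        return -1;
    int ok = (fprintf(f, "%d\n", value) > 0);
    if (fclose(f) != 0)         /* the kernel may reject the value on flush */
        ok = 0;
    return ok ? 0 : -1;
}
```

A caller would read the current value first, then try something like write_int_setting(VFS_CACHE_PRESSURE_PATH, 200) and watch whether the dropped frames become less frequent. From a shell, sysctl -w vm.vfs_cache_pressure=200 does the same thing.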

Or you can figure out how to do the video processing with direct IO. Even if you take a few milliseconds or even tenths of a second hit in performance when searching or stepping through a video, that won't matter much because the person doing the searching or stepping likely won't notice that. They do notice the dropped frames from memory fragmentation, though.

In your opening post you said that dropping the cache improved performance. So don't try to keep the cache around now.

You can use an LD_PRELOAD library to intercept all open() calls and turn on direct IO for, say, *.mpg files, for example:

Code:
#include <strings.h>
#include <string.h>
#include <sys/types.h>
#include <sys/stat.h>

/* can't include fcntl.h since that will conflict with the
   redefinition of open() (open() really isn't varargs - it's
   three args and uses the old C calling conventions and I
   am just too lazy to play around with varargs in this...) */
#ifdef __linux
/* weak symbol in libc that's another name for open() */
int __open( const char *file, int flags, mode_t mode );
/* we need O_DIRECT */
#define __USE_GNU
#define _FCNTL_H
#include <bits/fcntl.h>

/* Solaris version */
#else
int _open( const char *file, int flags, mode_t mode );
int directio( int fd, int flag );
#define DIRECTIO_ON (1)
#endif

static int useDirectIO( const char *file )
{
    char *pp;

    /* look for last "." char in filename */
    pp = strrchr( file, '.' );
    if ( NULL == pp )
    {
        return( 0 );
    }

    /* if there is a "." char, go to the next char */
    pp++;

    /* does it end with "mpg" (any case)? */
    if ( 0 == strcasecmp( pp, "mpg" ) )
    {
        return( 1 );
    }

    /* not an "mpg" file */
    return( 0 );
}

int open( const char *file, int flags, mode_t mode )
{
    int fd;

    /* check if direct IO needs to be used */
    int setFlag = useDirectIO( file );

#ifdef __linux
    if ( setFlag )
    {
        flags |= O_DIRECT;
    }
    fd = __open( file, flags, mode );

#else /* Solaris version */
    fd = _open( file, flags, mode );
    if ( setFlag )
    {
        directio( fd, DIRECTIO_ON );
    }
#endif

    return( fd );
}


Compile it into a shared library with:
Code:
gcc [-m64] -fPIC -shared -lc open.c -o libopen.so

When you start your process, just add
Code:
LD_PRELOAD=/path/to/lib/libopen.so

and every open() library call will be intercepted.

Then - if your application does IO in ways compatible with direct IO, you'll bypass the page cache. If your application does IO in ways that are not compatible with direct IO, it'll work very badly indeed.
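
For reference, "compatible with direct IO" on Linux mostly means that the buffer address, the file offset and the transfer size are all suitably aligned. Here is a minimal sketch of a sequential reader that satisfies this. The function name read_file_direct(), the 4096-byte alignment and the 256 kB chunk size are assumptions for illustration; the alignment actually required depends on the device and filesystem. The fallback to buffered IO when open() rejects O_DIRECT is a design choice of this sketch, not something the interposer above does.

```c
#define _GNU_SOURCE             /* exposes O_DIRECT in <fcntl.h> on glibc */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

#define DIO_ALIGN 4096              /* assumed alignment, see text */
#define DIO_CHUNK (256 * 1024)      /* arbitrary, a multiple of DIO_ALIGN */

/* Read a whole file sequentially, bypassing the page cache if possible.
 * Returns the number of bytes read, or -1 on error. */
long long read_file_direct(const char *path)
{
    assert(DIO_CHUNK % DIO_ALIGN == 0);

    int fd = open(path, O_RDONLY | O_DIRECT);
    if (fd < 0 && errno == EINVAL)  /* filesystem refuses O_DIRECT */
        fd = open(path, O_RDONLY);  /* fall back to buffered IO */
    if (fd < 0)
        return -1;

    /* direct IO needs an aligned buffer; plain malloc() is not enough */
    void *buf = NULL;
    if (posix_memalign(&buf, DIO_ALIGN, DIO_CHUNK) != 0) {
        close(fd);
        return -1;
    }

    long long total = 0;
    for (;;) {
        /* offset stays aligned because every full read is DIO_CHUNK bytes */
        ssize_t n = read(fd, buf, DIO_CHUNK);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            total = -1;
            break;
        }
        if (n == 0)                 /* end of file */
            break;
        total += n;
        /* a real player would hand buf[0..n) to the decoder here */
    }

    free(buf);
    close(fd);
    return total;
}
```

An application that reads odd-sized pieces into unaligned buffers will typically get EINVAL from read() once O_DIRECT is forced on it. Also, on a 32-bit box large video files are usually opened through open64() rather than open(), so check which call your player actually makes before relying on the interposer.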