Memory Trace Project


 
# 1
08-26-2011

Hi guys,

I'd appreciate your input on this pet project of mine; please feel free to share ideas, suggestions, and recommendations. This is a personal project with no academic supervision, so I am looking for guidance from your experience.

I've collected a large number of memory traces, almost 10 GB of data. They were gathered from a set of servers, desktops, and laptops in a university CS department. Each trace file contains a list of hashes representing the contents of the machine's memory, as well as some meta information about the running processes and OS type.

The traces have been grouped by type and date. Traces were recorded approximately every 30 minutes, although no traces were acquired while a machine was turned off or away from an internet connection for a long period. Each trace file is split into two portions. The top segment is ASCII text containing the system metadata: the operating system type and a list of running processes. This is followed by binary data containing the list of hashes, one generated for each page in the system. Hashes are stored as consecutive 32-bit values. There is a simple tool called "traceReader" for extracting the hashes from a trace file. It takes the file to be parsed as its argument and outputs the hash list as a series of integer values. If you would like to compare two traces to estimate the amount of sharing between them, you could run:

Code:
./traceReader trace-x.dat > trace-all
./traceReader trace-y.dat >> trace-all
sort trace-all | uniq -c

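This will tell you the number of times that each hash occurs across the two traces combined. Note that a count above 1 can come from duplicate pages within a single trace as well as from pages shared between the two.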
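To separate the two cases, the hash lists can be counted per trace before they are compared. The following is only a minimal sketch: `sharing_stats` and its result keys are hypothetical names, and it assumes each trace has already been loaded as a sequence of hash values (for example, one token per line of traceReader output, treated as opaque values).

```python
from collections import Counter

def sharing_stats(hashes_x, hashes_y):
    """Summarise sharing between two traces given their hash lists.

    Hypothetical helper: hashes are treated as opaque, hashable tokens.
    """
    count_x, count_y = Counter(hashes_x), Counter(hashes_y)
    # Distinct hashes that occur in both traces.
    shared = count_x.keys() & count_y.keys()
    return {
        "distinct_x": len(count_x),
        "distinct_y": len(count_y),
        "shared_hashes": len(shared),
        # Pages of one trace whose content hash also occurs in the other.
        "pages_x_in_y": sum(count_x[h] for h in shared),
        "pages_y_in_x": sum(count_y[h] for h in shared),
    }
```

Dividing `pages_x_in_y` by the total number of pages in trace x gives a rough fraction of that machine's memory that also appears on the other machine.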

My idea is to take the trace for every interval (every 30 minutes) on each system and compute the frequency of each memory hash. I then plan to collect the most frequently occurring hashes over a full hour (60 minutes) and divide the memory into 'k' patterns based on how many such hashes there are. For instance, if hashes 14F430C8, 1550068, 15AD480A, 161384B6, 16985213, 17CA274B, 18E5F038 and 1A3329 have the highest frequencies, I might divide the memory into 8 patterns (k=8).

I plan to use the approximate nearest neighbor (ANN) algorithm for this division. ANN requires a set of query points, a set of data points, and a dimension. In my case, I guess the query points can be all hashes other than the highest-frequency ones, the data points are all the hashes for the hour, and the dimension can be 1.

This lets me formulate the memory patterns for every hour; I then plan to do the same for every 3, 6, and 12 hours, and finally for the full 24 hours. Armed with these statistics, I plan to compare the patterns by time of day. I hope to show where the patterns overlap, creating what I call memory "heat zones" for each time of day, and finally to write up a report on the results.
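The top-k selection and the one-dimensional division described above could be sketched roughly as follows. This is an illustration under assumptions, not a definitive implementation: the function names are hypothetical, hashes are assumed to be already parsed into Python integers, and because the data is one-dimensional the sketch uses an exact nearest-neighbour lookup by binary search in place of an ANN library.

```python
from bisect import bisect_left
from collections import Counter

def top_k_centers(hashes, k):
    """Return the k most frequent hash values, sorted numerically."""
    return sorted(h for h, _ in Counter(hashes).most_common(k))

def nearest_center(centers, value):
    """Exact 1-D nearest neighbour among sorted centers (ties go to the lower one)."""
    i = bisect_left(centers, value)
    if i == 0:
        return centers[0]
    if i == len(centers):
        return centers[-1]
    lo, hi = centers[i - 1], centers[i]
    return lo if value - lo <= hi - value else hi

def group_by_center(hashes, k):
    """Assign every non-center hash to its numerically nearest top-k hash."""
    centers = top_k_centers(hashes, k)
    groups = {c: [] for c in centers}
    for h in set(hashes) - set(centers):
        groups[nearest_center(centers, h)].append(h)
    return {c: sorted(members) for c, members in groups.items()}
```

Calling `group_by_center(hour_hashes, 8)` on the combined hash list for one hour would give one group per high-frequency hash, matching the k=8 example. The sketch treats each hash as a point on the integer line, as the post proposes.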

The overall objective of this project is to establish a relationship between memory page access and the time of day, so that specific intervals have certain memory "heat zones". I understand that these heat zones might change and may not be consistent across systems and users. The study intends only to establish this relationship; it does not attempt any qualitative or quantitative analysis of the heat zones per system and user. That analysis can be considered a possible extension of this work.
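Grouping the 30-minute traces into time-of-day windows (1, 3, 6, 12 or 24 hours) could look something like the sketch below. It assumes each trace is available as a `(timestamp, hashes)` pair, with the timestamp recovered from however the trace files are dated; the function name and that pairing are hypothetical.

```python
from collections import Counter, defaultdict
from datetime import datetime

def bucket_by_time_of_day(traces, window_hours):
    """Merge per-trace hash counts into time-of-day windows.

    traces: iterable of (datetime, hashes) pairs (assumed input shape).
    window_hours: window width in hours; should divide 24.
    Returns {window_start_hour: Counter mapping hash -> frequency}.
    """
    buckets = defaultdict(Counter)
    for timestamp, hashes in traces:
        # Traces from different days fall into the same time-of-day window.
        start = (timestamp.hour // window_hours) * window_hours
        buckets[start].update(hashes)
    return dict(buckets)
```

The `most_common(n)` entries of each window's counter are then one candidate for that window's "heat zone".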

Please feel free to comment and suggest any new insights.

Last edited by radoulov; 08-26-2011 at 05:58 AM. Reason: Code tags.
# 2
08-30-2011
As pointed out, there are two aspects to this project.

1. Find out which processes run most frequently in a particular time interval on different systems (this may be the easier option).
2. Go deeper into the physical memory (PM) trace and find the relationship between PM addresses and the most frequent accesses, by clock time, on each system.

I understand that with randomized address-space mappings, and with different systems running different processes, it might be very hard for any suitable pattern to emerge from this study. But identical systems on the same network might end up accessing similar PM blocks during a given time frame (a block here being a group of pages). I intend to find out whether there is any correlation between the time frame and the access. According to the working set model, memory page accesses exhibit temporal and spatial locality, which is why we use appropriate page replacement algorithms. I now intend to see whether the same idea applies to access across the entire memory address space, that is, whether a pattern emerges in physical memory access based on time and space.
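One simple way to quantify how similar the machines are within a given time frame is a set-overlap score over their hash sets. Jaccard similarity is used here purely as an illustrative choice, not something the project prescribes; `jaccard` and `pairwise_overlap` are hypothetical helper names.

```python
from itertools import combinations

def jaccard(a, b):
    """Jaccard similarity of two hash collections: |A & B| / |A | B|."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0  # convention chosen here for two empty traces
    return len(a & b) / len(a | b)

def pairwise_overlap(machine_hashes):
    """Compare every pair of machines for one time frame.

    machine_hashes: {machine_name: iterable of hashes} (assumed input shape).
    Returns {(name_a, name_b): similarity}, with names in sorted order.
    """
    names = sorted(machine_hashes)
    return {
        (x, y): jaccard(machine_hashes[x], machine_hashes[y])
        for x, y in combinations(names, 2)
    }
```

Running `pairwise_overlap` once per time frame and comparing the scores across frames would show whether similarity between machines rises and falls with the time of day.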

I would like to know whether similar work has been done before with memory traces, and whether there are other areas I need to look into before I begin this study.