Script to find the top 100 most popular pages
Post 302517473 by jontjioe on Wednesday 27th of April 2011 12:37:39 AM

Ok, this is really beyond my scripting skill level so I'm hoping somebody can help me out with this. I have a trace file in the following format:

Code:
<timestamp> <devicenum> <sector address> <size in sectors> <0 or 1 (write or read)>

Here is what I need to do. I need to use the <sector address>, <size in sectors>, and the <0 or 1> fields.

I need to first check that the last field is a 0. If it is 0, I will need to check more fields on this line. If it is 1, I can skip it and go on to the next line.
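
A minimal sketch of this first check, assuming whitespace-separated fields. Python is used only for illustration (the same logic should carry over to awk or perl), and the function name `is_write` is hypothetical:

```python
def is_write(line: str) -> bool:
    """Return True when the last whitespace-separated field is "0" (a write)."""
    fields = line.split()
    # Empty lines have no fields at all and are treated as non-writes.
    return bool(fields) and fields[-1] == "0"


# Lines whose last field is 1 (reads) are skipped by the caller.
for sample in ["123.257 0 12 6 0", "458.780 0 2 1"]:
    print(sample, "->", "write" if is_write(sample) else "skip")
```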

So, if the last field is 0, I need to calculate the "pages" that are in this line. My requirements for pages are:
1) A page will be made up of 4 sectors.
2) A page must start at a <sector address> that is evenly divisible by 4. If it does not, the <sector address> should be rounded DOWN to the nearest sector number that is evenly divisible by 4.
3) Additionally, since a page is 4 sectors, if the <size in sectors> is not a multiple of 4, it will need to be rounded UP to the next multiple of 4. In other words, if there are only 3 sectors in the <size in sectors>, that still counts as 1 page.
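
The three rules above can be sketched as a small function. This is only an illustration in Python under the rules exactly as stated; the names `SECTORS_PER_PAGE` and `pages_for_request` are hypothetical:

```python
from typing import List

SECTORS_PER_PAGE = 4  # rule 1: a page is 4 sectors


def pages_for_request(sector: int, size: int) -> List[int]:
    """Return the starting sector of each page for one request."""
    # Rule 2: round the sector address DOWN to a multiple of 4.
    first_page = sector - sector % SECTORS_PER_PAGE
    # Rule 3: round the size UP to a multiple of 4, then count pages.
    page_count = -(-size // SECTORS_PER_PAGE)  # ceiling division
    return [first_page + i * SECTORS_PER_PAGE for i in range(page_count)]


print(pages_for_request(12, 6))
print(pages_for_request(5, 9))
```

Because the start and the size are rounded independently, a request such as sector 13 with size 8 yields the pages at 12 and 16 only, even though sectors 13 through 20 also reach into the page starting at 20. The sketch follows the stated rules rather than the physical span.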

What I want to do is find the top 100 pages that are the most popular in terms of writes (the last column is 0) in a trace file.

Here is a small example to illustrate:
Code:
123.257 0 12 6 0
456.579 0 13 8 0
458.780 0 2 1
500.579 0 5 9 0

For the 1st line, there will be 2 pages: the 1st page starts at 12 and the 2nd page starts at 16.

For the 2nd line, there will also be 2 pages: 1st page starts at 12 and 2nd page starts at 16. Note that both of these pages are actually the same pages from line 1.

The 3rd line is ignored because the last column is a 1 (read request).

For the 4th line, there will be 3 pages: the 1st page starts at 4, the 2nd page starts at 8, and the last page starts at 12. Note that the page starting at 12 is the same as the page in lines 1 and 2.

So for this small example, I want to have a printout similar to this. It should be sorted by the 2nd column in descending order so I can see the most popular pages.

Code:
Page (starting sector #)   |   # of Writes
---------------------------------------------
12 3
16 2
4 1
8 1
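
For reference, a rough end-to-end sketch of the counting and sorting described above, written in Python purely for illustration. The function name `top_written_pages`, the `limit` parameter, and the tie-break by page number are assumptions, not part of the requirements:

```python
from collections import Counter
from typing import Iterable, List, Tuple

SECTORS_PER_PAGE = 4


def top_written_pages(lines: Iterable[str], limit: int = 100) -> List[Tuple[int, int]]:
    """Count writes per page and return the `limit` most written pages."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        # Skip reads (last field is not 0) and lines without all 5 fields.
        if len(fields) < 5 or fields[-1] != "0":
            continue
        sector, size = int(fields[2]), int(fields[3])
        first_page = sector - sector % SECTORS_PER_PAGE  # round start down
        page_count = -(-size // SECTORS_PER_PAGE)        # round size up
        for i in range(page_count):
            counts[first_page + i * SECTORS_PER_PAGE] += 1
    # Most writes first; ties broken by lower page number (an assumption).
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:limit]


example = [
    "123.257 0 12 6 0",
    "456.579 0 13 8 0",
    "458.780 0 2 1",
    "500.579 0 5 9 0",
]
for page, writes in top_written_pages(example):
    print(page, writes)
```

On the small example this prints `12 3`, `16 2`, `4 1`, `8 1`. For a real trace, the lines could be streamed from an open file object instead of a list, so the whole file never has to sit in memory.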

And if I haven't already asked you for the world...the faster it runs, the better! I will have to run this on several million lines, so speed is important. I already have awk or perl installed so hopefully it will be one of those. Perl seems to be much faster.

Thank you so much in advance! You guys are awesome!

A longer example of the trace is below for testing:
Code:
5839.257 0 303884 7 0
5839.257 0 206070 6 0
5839.257 0 817773 6 0
5878.579 0 303891 7 0
5878.579 0 361650 6 0
5878.579 0 973353 6 0
5970.329 0 841315 24 0
6009.651 0 16601 1 0
6009.651 0 285602 1 0
6009.651 0 140952 6 0
6009.651 0 211173 6 0
6009.651 0 878233 2 0
6009.651 0 1002247 2 0
6009.651 0 725319 1 0
6016.204 0 206070 6 0
6016.204 0 817773 6 0
6016.204 0 760113 1 0
6022.758 0 303898 24 0
6042.419 0 303922 7 0

 
