"Elements per page"... seeking ideas...


 
# 1  
Old 02-18-2011

I work for a web hosting company that uses Apache. We like to come up with composite models of what our customers do so that we can tailor our servers to what they need. One question we like to answer is, "For a given page downloaded from our customer's virtual server, what is the mean number of elements on that page?" An "element", roughly defined, is a transfer that appears in the Apache log in order to populate a page requested by a customer. The most common "element" type, of course, is an image.

So, we'd like to have some reasonable way to determine the mean and standard deviation of the number of "elements" per page. Possibly of help is that we are just building a general model, so errors from simplifying assumptions may tend to even out over large numbers of log files. And I should mention that effectively our only source of information about this is the Apache logs from customers' virtual servers.

How would you approach this problem? We're certainly not helped by the fact that Apache logs really weren't designed for this. For that matter, neither was HTTP. Even so, without prejudicing your various brains toward one approach, here's a thought...

We know already that almost all pages served by our servers are transferred in less than 6 seconds. That's the HTML source page (or whatever dynamic page type it may be...) and the elements it calls. So, suppose we were to say that all log entries with a certain client IP address appearing within 6 seconds of each other are likely to be associated with a single customer page request. Then we could just record the IP and associated times and look for "clusters" of 6 seconds or less and count the number of elements in that grouping. But I'm uncertain of how to code that sort of "sliding window". I tend to do these things best in awk, and I'm not seeing how to do that.

Any thoughts? Like I said, that's just one approach, and certainly not necessarily the best. Thanks in advance.
# 2  
Old 02-20-2011
It looks like you need some "spider web code" or tool.

Search the web; there are some free tools already.
# 3  
Old 02-21-2011
Wouldn't that act on the original source files? For our purposes, we effectively don't have access to those. We're limited to analyzing log files.
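
Working from the logs alone, the 6-second clustering idea from the first post could be sketched in awk roughly as follows. This is only a sketch under several assumptions that are not from the thread: the logs are in Common or Combined Log Format (client IP in field 1, timestamp in field 4), entries are in chronological order, a file does not cross a month boundary (the clock is built from day-of-month only), and a "cluster" is anchored at its first request rather than sliding with each hit. The function name `elements_per_page` is made up for illustration.

```shell
# Hypothetical helper: reads an Apache access log on stdin and prints the
# number of clusters ("pages") plus the mean and population standard
# deviation of log entries per cluster.
# Usage: elements_per_page [window_seconds] < access_log
elements_per_page() {
    awk -v window="${1:-6}" '
    # Accumulate one finished cluster into the running totals.
    function record(c) { n++; sum += c; sumsq += c * c }
    {
        # $1 is the client IP, $4 looks like [18/Feb/2011:10:15:32
        split(substr($4, 2), t, "[/:]")   # t[1]=day t[4]=hh t[5]=mm t[6]=ss
        # Seconds since the start of the month (assumption: one month per file).
        now = ((t[1] * 24 + t[4]) * 60 + t[5]) * 60 + t[6]
        ip = $1
        # Open a new cluster if this IP is new or its window has expired.
        if (!(ip in start) || now - start[ip] > window) {
            if (ip in start) record(count[ip])
            start[ip] = now
            count[ip] = 0
        }
        count[ip]++
    }
    END {
        # Flush the cluster still open for each IP.
        for (ip in count) record(count[ip])
        if (n > 0) {
            mean = sum / n
            var = sumsq / n - mean * mean
            if (var < 0) var = 0          # guard against rounding error
            printf "pages=%d mean=%.2f sd=%.2f\n", n, mean, sqrt(var)
        }
    }'
}
```

Something like `elements_per_page 6 < access_log` would then give one summary line per file. Note that each count includes the page request itself, so subtract one per cluster if only the dependent transfers should count as "elements". A gap-based window (each hit within 6 seconds of the previous hit from that IP) would need `start[ip]` updated on every line instead of only when a cluster opens.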