A Novel Traffic Analysis for Identifying Search Fields in the Long Tail of Web Sites

 
Posted: 02-22-2010

HPL-2010-27 A Novel Traffic Analysis for Identifying Search Fields in the Long Tail of Web Sites - Forman, George; Kirshenbaum, Evan; Rajaram, Shyamsundar
Keyword(s): web data mining, clickstream analysis, machine learning classification, active learning
Abstract: Using a clickstream sample of 2 billion URLs from many thousand volunteer Web users, we wish to analyze typical usage of keyword searches across the Web. In order to do this, we need to be able to determine whether a given URL represents a keyword search and, if so, which field contains the query. A ...
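The abstract is cut off before it describes the authors' method, so the following is only a minimal sketch of the problem it states: given a URL, list its query fields and guess which one carries a keyword query. The heuristic (a field is a candidate when it takes several distinct values containing letters on the same host), the function names, and the sample URLs are all assumptions made for illustration; they are not taken from the report.

```python
from collections import defaultdict
from urllib.parse import urlsplit, parse_qsl


def query_fields(url):
    """Return the decoded (field, value) pairs from a URL's query string."""
    return parse_qsl(urlsplit(url).query)


def candidate_search_fields(urls, min_distinct=2):
    """Report, per host, the fields whose values look like free-text queries.

    Hypothetical heuristic (not the report's method): a field qualifies when
    it is seen with at least `min_distinct` different values that contain a
    letter, which filters out page numbers and numeric IDs.
    """
    seen = defaultdict(set)  # (host, field) -> distinct text-like values
    for url in urls:
        host = urlsplit(url).netloc.lower()
        for field, value in query_fields(url):
            if any(ch.isalpha() for ch in value):
                seen[(host, field)].add(value.lower())
    return {key: sorted(vals) for key, vals in seen.items()
            if len(vals) >= min_distinct}


# Made-up sample clickstream; example.com is a placeholder host.
sample = [
    "http://example.com/find?q=linux+kernel&page=2",
    "http://example.com/find?q=awk+duplicate+fields&page=1",
    "http://example.com/item?id=42",
]
print(candidate_search_fields(sample))
```

On this sample only the `q` field on `example.com` is reported, because `page` and `id` never carry alphabetic values. A real clickstream would need a far more robust signal than this, which is presumably why the report turns to machine learning classification and active learning.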


Bio::Tools::Run::Analysis::soap(3pm)    User Contributed Perl Documentation    Bio::Tools::Run::Analysis::soap(3pm)

NAME
    Bio::Tools::Run::Analysis::soap - A SOAP-based access to the analysis tools

SYNOPSIS
    Do not use this object directly; it is recommended to access it and use it through the "Bio::Tools::Run::Analysis" module:

        use Bio::Tools::Run::Analysis;
        my $tool = Bio::Tools::Run::Analysis->new(-access => 'soap',
                                                  -name   => 'seqret');

DESCRIPTION
    This object allows you to execute and control a remote analysis tool (an application, a program) using the SOAP middleware. All its public methods are documented in the interface module "Bio::AnalysisI" and explained in the tutorial available in the "analysis.pl" script.

FEEDBACK
  Mailing Lists
    User feedback is an integral part of the evolution of this and other Bioperl modules. Send your comments and suggestions preferably to the Bioperl mailing list. Your participation is much appreciated.

        bioperl-l@bioperl.org                  - General discussion
        http://bioperl.org/wiki/Mailing_lists  - About the mailing lists

  Support
    Please direct usage questions or support issues to the mailing list:

        bioperl-l@bioperl.org

    rather than to the module maintainer directly. Many experienced and responsive experts will be able to look at the problem and quickly address it. Please include a thorough description of the problem with code and data examples if at all possible.

  Reporting Bugs
    Report bugs to the Bioperl bug tracking system to help us keep track of the bugs and their resolution. Bug reports can be submitted via the web:

        http://redmine.open-bio.org/projects/bioperl/

AUTHOR
    Martin Senger (martin.senger@gmail.com)

COPYRIGHT
    Copyright (c) 2003, Martin Senger and EMBL-EBI. All Rights Reserved.

    This module is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

DISCLAIMER
    This software is provided "as is" without warranty of any kind.

SEE ALSO
    o   http://www.ebi.ac.uk/soaplab/Perl_Client.html

BUGS AND LIMITATIONS
    None known at the time of writing this.

APPENDIX
    Here are the rest of the object methods. Internal methods are preceded with an underscore _.

  _initialize
        Usage   : my $tool = Bio::Tools::Run::Analysis->new(-access => 'soap',
                                                            -name   => 'seqret',
                                                            ...);
                  (_initialize is internally called from the 'new()' method)
        Returns : nothing interesting
        Args    : This module recognises and uses the following arguments:
                    -location
                    -name
                    -httpproxy
                    -timeout
                  Additionally, the main module Bio::Tools::Run::Analysis
                  also recognises:
                    -access

    It populates the calling object with the given arguments, and then - for some attributes and only if they are not yet populated - it assigns some default values.

    This is an actual new() method (except for the real object creation and its blessing, which is done in the parent class Bio::Root::Root in method _create_object).

    Note that this method is always called as an object method (never as a class method) - and that the object which calls this method may already be partly initiated (from the Bio::Tools::Run::Analysis::new method); so if you need to do some tricks with the 'class invocation' you need to change the Bio::Analysis new method, not this one.

    -location
        A URL (also called an endpoint) defining where the Web Service representing this analysis tool is located.

        Default is "http://www.ebi.ac.uk/soaplab/services" (services running at the European Bioinformatics Institute on top of most of the EMBOSS analyses, and a few others).

        For example, if you run your own Web Service using the Java(TM) Apache Axis toolkit, the location might be something like "http://localhost:8080/axis/services".

    -name
        A name of a Web Service (also called a urn or a namespace). There is no default value (which usually means that this parameter is mandatory unless your -location parameter also includes a Web Service name).

    -destroy_on_exit => '0'
        Default value is '1', which means that all Bio::Tools::Run::Analysis::Job objects - when being finalised - will send a request to the remote Web Service to forget the results of these jobs.

        If you change it to '0', make sure that you know the job identification - otherwise you will not be able to re-establish a connection with the job later, when you use your script again. You can obtain the identification by calling the "id" method on the job object (such an object is returned by any of these methods: "create_job", "run", "wait_for").

    -httpproxy
        In addition to the location parameter, you may also need to specify a location/URL of an HTTP proxy server (if your site requires one). The expected format is "http://server:port". There is no default value.

    -timeout
        For long(er) running jobs the HTTP connection may time out. In order to avoid that (or, vice versa, to time out sooner) you may specify "timeout" with the number of seconds the connection will be kept alive. Zero means to keep it alive forever. The default value is two minutes.

  is_binary
        Usage   : if ($service->is_binary ('graph_result')) { ... }
        Returns : 1 or 0
        Args    : $name is a result name we are interested in

  VERSION and Revision
        Usage   : print $Bio::Tools::Run::Analysis::soap::VERSION;
                  print $Bio::Tools::Run::Analysis::soap::Revision;

  Defaults
        Usage   : print $Bio::Tools::Run::Analysis::soap::DEFAULT_LOCATION;

perl v5.12.3    2011-06-18    Bio::Tools::Run::Analysis::soap(3pm)