Quantifying Counts, Costs, and Trends Accurately via Machine Learning


 
# 1  
04-07-2008
Quantifying Counts, Costs, and Trends Accurately via Machine Learning

HPL-2007-164(R.1) Quantifying Counts, Costs, and Trends Accurately via Machine Learning - Forman, George
Keyword(s): supervised machine learning, classification, prevalence estimation, class distribution estimation, cost quantification, quantification research methodology, minimizing training effort, detecting and tracking trends, concept drift, class imbalance, text mining
Abstract: In many business and science applications, it is important to track trends over historical data, for example, measuring the monthly prevalence of influenza incidents at a hospital. In situations where a machine learning classifier is needed to identify the relevant incidents from among all cases in ...
Full Report
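The quantification task behind the abstract is easy to state: count how many cases of a class occurred per period, using a classifier that is known to make mistakes. Below is a minimal sketch of the standard "adjusted count" correction used as a baseline in this line of work; all the numbers are illustrative placeholders, and the report itself goes well beyond this baseline.

    # tpr/fpr would come from cross-validating the classifier on labeled
    # training data; pos/total from classifying one month of incoming cases.
    tpr=0.85      # classifier's true positive rate (assumed)
    fpr=0.05      # classifier's false positive rate (assumed)
    pos=312       # cases the classifier flagged this month (assumed)
    total=2400    # all cases seen this month (assumed)

    awk -v p="$pos" -v n="$total" -v tpr="$tpr" -v fpr="$fpr" 'BEGIN {
        raw = p / n                        # naive classify-and-count prevalence
        adj = (raw - fpr) / (tpr - fpr)    # adjusted-count correction
        if (adj < 0) adj = 0               # clip to a valid proportion
        if (adj > 1) adj = 1
        printf "raw prevalence %.3f   adjusted prevalence %.3f\n", raw, adj
    }'

The correction matters because the raw flag rate is biased whenever the classifier is imperfect, and that bias shifts as the true class prevalence drifts, which is exactly the trend-tracking setting the abstract describes.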

More...


6 More Discussions You Might Find Interesting

1. What is on Your Mind?

Google Trends: UNIX

Over the years I have gained a lot of experience with people and their opinions of technology, toolsets, programming languages, software architectures, and of course forums. These opinions come from all walks of life and range from "unix.com changed my life and got me through the university... thank you... (11 Replies)
Discussion started by: Neo
11 Replies

2. Web Development

Google Trends: react.js angular.js vue.js

While I'm on the subject of Google trends, here is a global trend since 2004 comparing react.js, angular.js, and vue.js. It's no secret I'm a vue.js fan and coder, but not because of the trend line (which I just saw for the first time a few minutes ago). My experience is that vue.js, a late arrival... (0 Replies)
Discussion started by: Neo
0 Replies

3. Infrastructure Monitoring

Event processing & machine learning in monitoring system

Hello! For a couple of years I have been developing an IT infrastructure monitoring system in a research group at my university, and now we would like to use some nontrivial methods in this area. So I decided to contact experienced users on the subject. My questions would be: Do existing... (3 Replies)
Discussion started by: pyalxx
3 Replies

4. Shell Programming and Scripting

How to get yesterday's date accurately using ksh?

At the time of running the following commands, my Unix system's time (Eastern daylight saving time) was --> 201104210003 (yyyymmddHHMM) # This seems to be more accurate: $> echo $(TZ=$(date +%Z)+28 date +%Y%m%d%H%M) 201104200003 # This was constructed from old posts on this forum to find... (14 Replies)
Discussion started by: kchinnam
14 Replies
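For reference, a few ways to get yesterday's date that avoid hand-rolled timezone offsets like the one quoted above; this is a sketch only, since which of these is available depends on the system:

    # GNU date (most Linux systems):
    date -d yesterday +%Y%m%d

    # ksh93's builtin printf can format computed dates:
    printf '%(%Y%m%d)T\n' yesterday

    # "epoch minus 86400" works almost everywhere, but can be an hour off
    # around a daylight-saving change, which is the pitfall discussed above:
    perl -e 'my @t = localtime(time - 86400);
             printf "%04d%02d%02d\n", $t[5] + 1900, $t[4] + 1, $t[3];'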

5. UNIX for Dummies Questions & Answers

learning UNIX on a Windows 2000 machine?

What is the best way to learn UNIX, shell, and Perl on a Windows 2000 machine? My place of employment uses Solaris and Perl and I would like to learn some UNIX skills on my home PC. I read about "dual boots", "Microsoft Windows Services for UNIX", and "cygwin". What other free options are... (9 Replies)
Discussion started by: wolfv
9 Replies

6. UNIX for Dummies Questions & Answers

How to accurately determine memory (RAM) information

I'm writing a shell script to display as much useful information on physical and virtual memory availability and usage as possible. I need a CLI tool to print these numbers for me. The utilities that I know give out some statistics are the following: free, top, vmstat, sysctl. In Linux... (1 Reply)
Discussion started by: fiori_musicali
1 Replies
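On Linux, most of those utilities ultimately read /proc/meminfo, so a script can go straight to the source. A small sketch (Linux-specific; the exact set of fields varies with kernel version):

    # Print the headline physical and swap memory figures in MiB.
    awk '/^(MemTotal|MemFree|MemAvailable|SwapTotal|SwapFree):/ {
        printf "%-14s %10.1f MiB\n", $1, $2 / 1024
    }' /proc/meminfo

    # Or let free(1) print the same summary:
    free -m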
VW(1)                              User Commands                              VW(1)

NAME
       vw - Vowpal Wabbit -- fast online learning tool

DESCRIPTION
       VW options:

       -h [ --help ]                       Look here: http://hunch.net/~vw/ and click on Tutorial.
       --active_learning                   active learning mode
       --active_simulation                 active learning simulation mode
       --active_mellowness arg (=8)        active learning mellowness parameter c_0. Default 8
       --adaptive                          use adaptive, individual learning rates.
       --exact_adaptive_norm               use a more expensive exact norm for adaptive learning rates.
       -a [ --audit ]                      print weights of features
       -b [ --bit_precision ] arg          number of bits in the feature table
       --bfgs                              use bfgs optimization
       -c [ --cache ]                      Use a cache. The default is <data>.cache
       --cache_file arg                    The location(s) of cache_file.
       --compressed                        use gzip format whenever possible. If a cache file is being created, this option creates a
                                           compressed cache file. A mixture of raw-text & compressed inputs are supported with autodetection.
       --conjugate_gradient                use conjugate gradient based optimization
       --nonormalize                       Do not normalize online updates
       --l1 arg (=0)                       l_1 lambda
       --l2 arg (=0)                       l_2 lambda
       -d [ --data ] arg                   Example Set
       --daemon                            persistent daemon mode on port 26542
       --num_children arg (=10)            number of children for persistent daemon mode
       --pid_file arg                      Write pid file in persistent daemon mode
       --decay_learning_rate arg (=1)      Set Decay factor for learning_rate between passes
       --input_feature_regularizer arg     Per feature regularization input file
       -f [ --final_regressor ] arg        Final regressor
       --readable_model arg                Output human-readable final regressor
       --hash arg                          how to hash the features. Available options: strings, all
       --hessian_on                        use second derivative in line search
       --version                           Version information
       --ignore arg                        ignore namespaces beginning with character <arg>
       --initial_weight arg (=0)           Set all weights to an initial value of 1.
       -i [ --initial_regressor ] arg      Initial regressor(s)
       --initial_pass_length arg (=18446744073709551615)  initial number of examples per pass
       --initial_t arg (=1)                initial t value
       --lda arg                           Run lda with <int> topics
       --lda_alpha arg (=0.100000001)      Prior on sparsity of per-document topic weights
       --lda_rho arg (=0.100000001)        Prior on sparsity of topic distributions
       --lda_D arg (=10000)                Number of documents
       --minibatch arg (=1)                Minibatch size, for LDA
       --span_server arg                   Location of server for setting up spanning tree
       --min_prediction arg                Smallest prediction to output
       --max_prediction arg                Largest prediction to output
       --mem arg (=15)                     memory in bfgs
       --noconstant                        Don't add a constant feature
       --noop                              do no learning
       --output_feature_regularizer_binary arg  Per feature regularization output file
       --output_feature_regularizer_text arg    Per feature regularization output file, in text
       --port arg                          port to listen on
       --power_t arg (=0.5)                t power value
       -l [ --learning_rate ] arg (=10)    Set Learning Rate
       --passes arg (=1)                   Number of Training Passes
       --termination arg (=0.00100000005)  Termination threshold
       -p [ --predictions ] arg            File to output predictions to
       -q [ --quadratic ] arg              Create and use quadratic features
       --quiet                             Don't output diagnostics
       --rank arg (=0)                     rank for matrix factorization.
       --random_weights arg                make initial weights random
       -r [ --raw_predictions ] arg        File to output unnormalized predictions to
       --save_per_pass                     Save the model after every pass over data
       --sendto arg                        send examples to <host>
       -t [ --testonly ]                   Ignore label information and just test
       --loss_function arg (=squared)      Specify the loss function to be used, uses squared by default. Currently available ones are
                                           squared, classic, hinge, logistic and quantile.
       --quantile_tau arg (=0.5)           Parameter tau associated with Quantile loss. Defaults to 0.5
       --unique_id arg (=0)                unique id used for cluster parallel jobs
       --total arg (=1)                    total number of nodes used in cluster parallel job
       --node arg (=0)                     node number in cluster parallel job
       --sort_features                     turn this on to disregard order in which features have been defined. This will lead to smaller
                                           cache sizes
       --ngram arg                         Generate N grams
       --skips arg                         Generate skips in N grams. This in conjunction with the ngram tag can be used to generate
                                           generalized n-skip-k-gram.

vw 6.1                                 June 2012                                VW(1)
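For orientation, a minimal usage sketch assembled from the options listed above; the file names are illustrative, not taken from the man page:

    # Train a logistic-loss model for two passes over the data, caching the
    # parsed input (multiple passes need a cache):
    vw -d train.dat -c --passes 2 --loss_function logistic -f model.vw

    # Score held-out data with the saved model, ignoring labels and writing
    # predictions to a file:
    vw -d test.dat -t -i model.vw -p predictions.txt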