I am experiencing an issue while downloading a few pages using wget. All of them work without a problem except one, which is a page that tails a log and so is constantly being updated.
wget seems to run endlessly on it and has to be killed manually. Is there something that can be done to prevent this? At the moment I simply let it run for a fixed number of seconds.
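One way to bound this, as a sketch: wget's own --timeout only covers DNS, connect, and read stalls, so a page that keeps streaming never trips it; an external limit such as GNU coreutils timeout(1) does the job (URL and limit are placeholders):

    # Kill wget after 30 seconds no matter what the server keeps sending
    timeout 30 wget -O logview.html 'http://example.com/logview'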
Hello friends,
I was reading some sysadmin notes when I came across the term "power cycling".
Can anybody please explain what this means?
Thank you. (1 Reply)
Hi,
I am trying to get the page size of a URL (e.g., www.example.com) using the wget command. Any thoughts on which parameters I need to pass to wget to get the size alone?
Regards,
Raj (1 Reply)
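One approach worth trying, as a sketch: ask the server for the headers only and read Content-Length (dynamic pages don't always send it):

    # --spider fetches no body; -S (--server-response) prints the headers
    wget --spider -S 'http://www.example.com' 2>&1 | grep -i 'content-length'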
Hello,
I have read and searched through this wonderful forum and tried different approaches, but it seems I lack some knowledge (and neurons ^^).
Here is what I'm trying to achieve:
file1:
test filea 3495;
test fileb 4578;
test filec 7689;
test filey 9978;
test filez 12300;
file2:
test filea... (11 Replies)
Hello all,
I am working in KSH on Solaris, where the default editor is VIM. So, per session, I run a small rc script which calls
export editor=emacs
This works for commands at the prompt. But if I cycle through command history (using the up arrow), the command-line editor defaults to VIM. How can I... (2 Replies)
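For reference, ksh's history editing mode is normally selected with set -o (or derived from the VISUAL variable) rather than an editor variable; a minimal sketch for a .kshrc (assuming ENV points at it):

    export VISUAL=emacs   # many ksh builds take the editing mode from VISUAL
    export EDITOR=emacs
    set -o emacs          # emacs-style editing at the prompt and for history recall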
So, I'd like to wget a webpage, as it's not going to stick around forever, but the problem is the URL has a semicolon in it.
wget http://example.com/stuff/asdf;asdf obviously doesn't get the right webpage.
Any good way around this? (2 Replies)
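The shell, not wget, is the problem here: an unquoted ';' ends the command, so everything after it is run as a second command. Quoting the URL fixes it:

    # Single quotes stop the shell from splitting at the semicolon
    wget 'http://example.com/stuff/asdf;asdf'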
Hi
I have a PDF file that is generated using the rwrun command in a shell script.
I then use the lp command in the shell script to print the same PDF file.
Suppose there are 4 pages in the PDF file; I need to print 2 copies of the first page, 2 copies of the second page, then 2... (7 Replies)
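A possible sketch, assuming a CUPS lp (which accepts -n for the number of copies and -P for a page list); the file name and page count here are placeholders:

    pages=4
    i=1
    while [ "$i" -le "$pages" ]; do
        lp -n 2 -P "$i" report.pdf   # two copies of page $i before the next page
        i=$((i + 1))
    done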
Good evening to all!!
I'm trying to become familiar with wget.
I would like to download a page from Wikipedia with all its images and CSS, but without following the links present in the page. It should be named index.html.
I would also like to save it to /mnt/us inside a new folder.
This is... (5 Replies)
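A sketch of the wget flags that usually cover this (article URL and folder name are placeholders; the saved file may still need renaming to index.html afterwards):

    mkdir -p /mnt/us/wikipage
    # -p page requisites (images, CSS), -k rewrite links for local viewing,
    # -E add .html extensions, -H span hosts (Wikipedia serves images from
    # upload.wikimedia.org), -nd flatten, -P target directory
    wget -p -k -E -H -nd -P /mnt/us/wikipage 'https://en.wikipedia.org/wiki/Example'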
Hi,
I've been attempting to create a script that downloads web pages at random intervals to mimic typical user usage. However, I'm struggling to link $url to the URL list, so wget complains about a missing URL. Any ideas?
Thanks
#!/bin/sh
#URL List
url1="http://www.bbc.co.uk"... (14 Replies)
I am having a problem with cycling USB bus power on the RPI B+ (3.18.7+).
Each time I power the USB bus off and on, a device plugged into it gets a higher device number, and eventually the bus crashes (it no longer enumerates new devices).
As a demonstration, I wrote the Python script... (4 Replies)
Hi,
I need to download a zip file from the US government link below.
https://www.sam.gov/SAMPortal/extractfiledownload?role=WW&version=SAM&filename=SAM_PUBLIC_MONTHLY_20160207.ZIP
I only have wget utility installed on the server.
When I use the command below, I get error 403... (2 Replies)
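Two things worth checking, as a sketch rather than a confirmed fix: the URL contains '&' characters, so it must be quoted or the shell will cut it short, and some servers answer 403 to wget's default User-Agent, in which case presenting a browser-like one can help:

    wget --user-agent='Mozilla/5.0' \
        'https://www.sam.gov/SAMPortal/extractfiledownload?role=WW&version=SAM&filename=SAM_PUBLIC_MONTHLY_20160207.ZIP'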
LEARN ABOUT DEBIAN
httpindex
httpindex(1)                    General Commands Manual                    httpindex(1)

NAME
       httpindex - HTTP front-end for SWISH++ indexer
SYNOPSIS
wget [ options ] URL... 2>&1 | httpindex [ options ]
DESCRIPTION
httpindex is a front-end for index++(1) to index files copied from remote servers using wget(1). The files (in a copy of the remote directory structure) can be kept, deleted, or replaced with their descriptions after indexing.
OPTIONS
wget Options
The wget(1) options that are required are: -A, -nv, -r, and -x; the ones that are highly recommended are: -l, -nh, -t, and -w. (See the EXAMPLE.)
httpindex Options
httpindex accepts the same short options as index++(1) except for -H, -I, -l, -r, -S, and -V.
The following options are unique to httpindex:
-d     Replace the text of local copies of retrieved files with their descriptions after they have been indexed. This is useful to display file descriptions in search results without having to keep complete copies of the remote files, thus saving filesystem space. (See the extract_description() function in WWW(3) for details about how descriptions are extracted.)
-D     Delete the local copies of retrieved files after they have been indexed. This prevents your local filesystem from filling up with copies of remote files.
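For instance, a sketch combining the required wget options with -D to index remote files and discard the local copies afterwards (a variation on the EXAMPLE below):

    wget -A html,txt -nv -r -x http://www.foo.com 2>&1 |
    httpindex -D -e'html:*.html,text:*.txt'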
EXAMPLE
To index all HTML and text files on a remote web server keeping descriptions locally:
wget -A html,txt -linf -t2 -rxnv -nh -w2 http://www.foo.com 2>&1 |
httpindex -d -e'html:*.html,text:*.txt'
Note that you need to redirect wget(1)'s output from standard error to standard output in order to pipe it to httpindex.
EXIT STATUS
Exits with a value of zero only if indexing completed successfully; non-zero otherwise.
CAVEATS
In addition to those for index++(1), httpindex does not correctly handle multiple -e, -E, -m, or -M options (because the Perl script uses the standard Getopt::Std package for processing command-line options, which doesn't support repetition). The last of any of those options "wins."
The work-around is to pass multiple values, separated by commas, to a single instance of the option. For example, if you want to do:
httpindex -e'html:*.html' -e'text:*.txt'
do this instead:
httpindex -e'html:*.html,text:*.txt'
SEE ALSO
index++(1), wget(1), WWW(3)

AUTHOR
Paul J. Lucas <pauljlucas@mac.com>
SWISH++ August 2, 2005 httpindex(1)