I am experiencing an issue while downloading a few pages using wget. All of them work without a problem except one, which is a page that tails a log and so is constantly being updated.
wget seems to run endlessly on it and has to be killed manually. Is there something that can be done to prevent this? At the moment I simply let it run for a fixed number of seconds.
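One way to bound this, as a sketch: wget's own --timeout only covers DNS, connect, and read stalls, so a page that keeps streaming never trips it; an external limit such as GNU coreutils timeout(1) does the job (URL and limit are placeholders):

    # Kill wget after 30 seconds no matter what the server keeps sending
    timeout 30 wget -O logview.html 'http://example.com/logview'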
Hello friends,
I was reading some sysadmin notes when I came across the term "power cycling".
Can anybody please explain what this means?
Thank you. (1 Reply)
Hi,
I am trying to get the page size of a URL (e.g., www.example.com) using the wget command. Any thoughts on which parameters I need to pass to wget to get the size alone?
Regards,
Raj (1 Reply)
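One approach worth trying, as a sketch: ask the server for the headers only and read Content-Length (dynamic pages don't always send it):

    # --spider fetches no body; -S (--server-response) prints the headers
    wget --spider -S 'http://www.example.com' 2>&1 | grep -i 'content-length'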
Hello,
I have read and searched through this wonderful forum and tried different approaches, but it seems I lack some knowledge (and neurons ^^).
Here is what I'm trying to achieve:
file1:
test filea 3495;
test fileb 4578;
test filec 7689;
test filey 9978;
test filez 12300;
file2:
test filea... (11 Replies)
Hello all,
I am working in KSH on Solaris, where the default editor is VIM. So, per session, I run a small rc script which calls
export editor=emacs
This works for commands at the prompt. But if I cycle through command history (using the up arrow), the command-line editor defaults to VIM. How can I... (2 Replies)
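For reference, ksh's history editing mode is normally selected with set -o (or derived from the VISUAL variable) rather than an editor variable; a minimal sketch for a .kshrc (assuming ENV points at it):

    export VISUAL=emacs   # many ksh builds take the editing mode from VISUAL
    export EDITOR=emacs
    set -o emacs          # emacs-style editing at the prompt and for history recall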
So, I'd like to wget a webpage, as it's not going to stick around forever, but the problem is the URL has a semicolon in it.
wget http://example.com/stuff/asdf;asdf obviously doesn't get the right webpage.
Any good way around this? (2 Replies)
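The shell, not wget, is the problem here: an unquoted ';' ends the command, so everything after it is run as a second command. Quoting the URL fixes it:

    # Single quotes stop the shell from splitting at the semicolon
    wget 'http://example.com/stuff/asdf;asdf'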
Hi
I have a PDF file that is generated using the rwrun command in a shell script.
I then use the lp command in the shell script to print the same PDF file.
Suppose there are 4 pages in the PDF file; I need to print 2 copies of the first page, 2 copies of the second page, then 2... (7 Replies)
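A possible sketch, assuming a CUPS lp (which accepts -n for the number of copies and -P for a page list); the file name and page count here are placeholders:

    pages=4
    i=1
    while [ "$i" -le "$pages" ]; do
        lp -n 2 -P "$i" report.pdf   # two copies of page $i before the next page
        i=$((i + 1))
    done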
Good evening to all!!
I'm trying to become familiar with wget.
I would like to download a page from Wikipedia with all its images and CSS, but without following the links present in the page. It should be named index.html.
I would also like to save it to /mnt/us inside a new folder.
This is... (5 Replies)
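A sketch of the wget flags that usually cover this (article URL and folder name are placeholders; the saved file may still need renaming to index.html afterwards):

    mkdir -p /mnt/us/wikipage
    # -p page requisites (images, CSS), -k rewrite links for local viewing,
    # -E add .html extensions, -H span hosts (Wikipedia serves images from
    # upload.wikimedia.org), -nd flatten, -P target directory
    wget -p -k -E -H -nd -P /mnt/us/wikipage 'https://en.wikipedia.org/wiki/Example'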
Hi,
I've been attempting to create a script that downloads web pages at random intervals to mimic typical user usage. However, I'm struggling to link $url to the URL list, so wget complains about a missing URL. Any ideas?
Thanks
#!/bin/sh
#URL List
url1="http://www.bbc.co.uk"... (14 Replies)
I am having a problem with cycling USB bus power on the RPI B+ (3.18.7+).
Each time I power the USB bus off and on, a device plugged into it gets a higher device number, and eventually the bus crashes (it no longer enumerates new devices).
As a demonstration, I wrote the Python script... (4 Replies)
Hi,
I need to download a zip file from the US government link below.
https://www.sam.gov/SAMPortal/extractfiledownload?role=WW&version=SAM&filename=SAM_PUBLIC_MONTHLY_20160207.ZIP
I only have wget utility installed on the server.
When I use the command below, I get error 403... (2 Replies)
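Two things worth checking, as a sketch rather than a confirmed fix: the URL contains '&' characters, so it must be quoted or the shell will cut it short, and some servers answer 403 to wget's default User-Agent, in which case presenting a browser-like one can help:

    wget --user-agent='Mozilla/5.0' \
        'https://www.sam.gov/SAMPortal/extractfiledownload?role=WW&version=SAM&filename=SAM_PUBLIC_MONTHLY_20160207.ZIP'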
LEARN ABOUT DEBIAN
httpindex
httpindex(1)                    General Commands Manual                    httpindex(1)

NAME
       httpindex - HTTP front-end for SWISH++ indexer
SYNOPSIS
wget [ options ] URL... 2>&1 | httpindex [ options ]
DESCRIPTION
httpindex is a front-end for index++(1) to index files copied from remote servers using wget(1). The files (in a copy of the remote directory structure) can be kept, deleted, or replaced with their descriptions after indexing.
OPTIONS
wget Options
The wget(1) options that are required are: -A, -nv, -r, and -x; the ones that are highly recommended are: -l, -nh, -t, and -w. (See the EXAMPLE.)
httpindex Options
httpindex accepts the same short options as index++(1) except for -H, -I, -l, -r, -S, and -V.
The following options are unique to httpindex:
-d     Replace the text of local copies of retrieved files with their descriptions after they have been indexed. This is useful to display file descriptions in search results without having to keep complete copies of the remote files, thus saving filesystem space. (See the extract_description() function in WWW(3) for details about how descriptions are extracted.)
-D     Delete the local copies of retrieved files after they have been indexed. This prevents your local filesystem from filling up with copies of remote files.
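For instance, a sketch combining the required wget options with -D to index remote files and discard the local copies afterwards (a variation on the EXAMPLE below):

    wget -A html,txt -nv -r -x http://www.foo.com 2>&1 |
    httpindex -D -e'html:*.html,text:*.txt'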
EXAMPLE
To index all HTML and text files on a remote web server keeping descriptions locally:
wget -A html,txt -linf -t2 -rxnv -nh -w2 http://www.foo.com 2>&1 |
httpindex -d -e'html:*.html,text:*.txt'
Note that you need to redirect wget(1)'s output from standard error to standard output in order to pipe it to httpindex.
EXIT STATUS
Exits with a value of zero only if indexing completed successfully; non-zero otherwise.
CAVEATS
In addition to those for index++(1), httpindex does not correctly handle multiple -e, -E, -m, or -M options (because the Perl script uses the standard Getopt::Std package for processing command-line options, which doesn't support repetition). The last of any of those options "wins."
The work-around is to pass multiple values, separated by commas, to a single instance of the option. For example, if you want to do:
httpindex -e'html:*.html' -e'text:*.txt'
do this instead:
httpindex -e'html:*.html,text:*.txt'
SEE ALSO
index++(1), wget(1), WWW(3)

AUTHOR
Paul J. Lucas <pauljlucas@mac.com>
SWISH++ August 2, 2005 httpindex(1)