Sponsored Content
Full Discussion: Wget help
Top Forums UNIX for Dummies Questions & Answers Wget help Post 302894266 by sea on Monday 24th of March 2014 03:01:22 PM
Old 03-24-2014
How about crawling the whole server, retrieving all findings as a list only.
In the next step, parse the list for your files (zip,rar) and then download those findings to current dir.

eg something like:
Code:
wget -r --spider example.com > retrieved.txt
for FOUND in $(grep -e rar -e zip retrieved.txt);do
  wget -nc $FOUND > $(basename $FOUND)
done

 

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

wget -r

I have noticed a lot of expensive books appearing online so I have decided to copy them to CD. I was going to write a program in java to do this, but remembered that wget GNU program some of you guys were talking about. Instead of spending two hours or so writing a program to do this.... (1 Reply)
Discussion started by: photon
1 Replies

2. Shell Programming and Scripting

wget help

i am trying to ftp files/dirs with wget. i am having an issue where the path always takes me to my home dir even when i specify something else. For example: wget -m ftp://USER:PASS@IP_ADDRESS/Path/on/remote/box ...but if that path on the remote box isn't in my home dir it doesn't change to... (0 Replies)
Discussion started by: djembeplayer
0 Replies

3. Shell Programming and Scripting

Help with wget

Hi, i need temperature hourly from a web page Im using wget to get the web page. I would like to save the page downloaded in a file called page. I check the file everytime i run the wget function but its not saving but instead creates a wx.php file....Each time i run it...a new wx.php file is... (2 Replies)
Discussion started by: vadharah
2 Replies

4. Shell Programming and Scripting

wget

Hi I want to download some files using wget , and want to save in a specified directory. Is there any way to save it.Please suggest me. (1 Reply)
Discussion started by: mnmonu
1 Replies

5. Shell Programming and Scripting

wget help?

can someone please help in understanding this shell script? wget --progress=dot:mega --cut-dirs=4 -r -c -nH -np --reject index.html*,icons/*.gif \ http://*****.oz.xxxxx.com:<portnum>/omcsm/releases/dew/${UPGRADE_VERSION}/ (1 Reply)
Discussion started by: dnam9917
1 Replies

6. UNIX for Dummies Questions & Answers

Wget

...... (1 Reply)
Discussion started by: hoo
1 Replies

7. Shell Programming and Scripting

WGET help!

Hi Friends, I have an url like this https://www.unix.com/help/ In this help directory, I have more than 300 directories which contains file or files. So, the 300 directories are like this http://unix.com/help/ dir1 file1 dir2 file2 dir3 file3_1 file3_2... (4 Replies)
Discussion started by: jacobs.smith
4 Replies

8. Red Hat

Wget

If I run the following command wget -r --no-parent --reject "index.html*" 10.11.12.13/backups/ A local directory named 10.11.12.13/backups with the content of web site data is created. What I want to do is have the data placed in a local directory called $HOME/backups. Thanks for... (1 Reply)
Discussion started by: popeye
1 Replies

9. Shell Programming and Scripting

Wget and gz

Can wget be used to goto a site and piped into a .gz extrated command? wget ftp://ftp.ncbi.nlm.nih.gov/pub/clinvar/vcf_GRCh37 | gunzip -d clinvar_20150603.vcf.gz (1 Reply)
Discussion started by: cmccabe
1 Replies

10. Shell Programming and Scripting

Wget - working in browser but cannot download from wget

Hi, I need to download a zip file from my the below US govt link. https://www.sam.gov/SAMPortal/extractfiledownload?role=WW&version=SAM&filename=SAM_PUBLIC_MONTHLY_20160207.ZIP I only have wget utility installed on the server. When I use the below command, I am getting error 403... (2 Replies)
Discussion started by: Prasannag87
2 Replies
httpindex(1)						      General Commands Manual						      httpindex(1)

NAME
httpindex - HTTP front-end for SWISH++ indexer SYNOPSIS
wget [ options ] URL... 2>&1 | httpindex [ options ] DESCRIPTION
httpindex is a front-end for index++(1) to index files copied from remote servers using wget(1). The files (in a copy of the remote direc- tory structure) can be kept, deleted, or replaced with their descriptions after indexing. OPTIONS
wget Options The wget(1) options that are required are: -A, -nv, -r, and -x; the ones that are highly recommended are: -l, -nh, -t, and -w. (See the EXAMPLE.) httpindex Options httpindex accepts the same short options as index++(1) except for -H, -I, -l, -r, -S, and -V. The following options are unique to httpindex: -d Replace the text of local copies of retrieved files with their descriptions after they have been indexed. This is useful to display file descriptions in search results without having to have complete copies of the remote files thus saving filesystem space. (See the extract_description() function in WWW(3) for details about how descriptions are extracted.) -D Delete the local copies of retrieved files after they have been indexed. This prevents your local filesystem from filling up with copies of remote files. EXAMPLE
To index all HTML and text files on a remote web server keeping descriptions locally: wget -A html,txt -linf -t2 -rxnv -nh -w2 http://www.foo.com 2>&1 | httpindex -d -e'html:*.html,text:*.txt' Note that you need to redirect wget(1)'s output from standard error to standard output in order to pipe it to httpindex. EXIT STATUS
Exits with a value of zero only if indexing completed sucessfully; non-zero otherwise. CAVEATS
In addition to those for index++(1), httpindex does not correctly handle the use of multiple -e, -E, -m, or -M options (because the Perl script uses the standard GetOpt::Std package for processing command-line options that doesn't). The last of any of those options ``wins.'' The work-around is to use multiple values for those options seperated by commas to a single one of those options. For example, if you want to do: httpindex -e'html:*.html' -e'text:*.txt' do this instead: httpindex -e'html:*.html,text:*.txt' SEE ALSO
index++(1), wget(1), WWW(3) AUTHOR
Paul J. Lucas <pauljlucas@mac.com> SWISH++ August 2, 2005 httpindex(1)
All times are GMT -4. The time now is 01:59 AM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy