11-18-2012
RE: wget -i URLs.txt
Hi Corona688,
Thanks for your post. The membership site I belong to is resell-rights-weekly.com; normally I just log in and click the links to download files to my home computer. I want to bypass my home computer and copy each week's files straight to my server — server-to-server is much faster than pulling them down over DSL and uploading them again. The input file is necessary because new downloads are posted on the site each week. My plan is to put the URLs in URLs.txt before the wget runs, set it up as a cron job every Monday, and bring the files over in a fraction of the time. I had it partially working but could not remember which switches I had set.
Here is my next try:
wget -i URLs.txt --post-data 'user=klondrie&password=XXXX' -o wgetlogfile.txt -c
What do you think? What would you change? This should be a piece of cake. There does not seem to be much security involved, since I can simply log in and click the links to download to my computer; I just need the files on my server instead.
Any more help available?
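A minimal sketch of the weekly cron script described above. The rrw directory, file paths, and the form field names user/password are assumptions — check the site's actual login form before relying on them:

```shell
#!/bin/sh
# Weekly fetch: pull every URL in URLs.txt straight onto the server.
# Paths below are placeholders -- adjust to your setup.
URL_LIST="$HOME/rrw/URLs.txt"        # refreshed with each week's links
DEST_DIR="$HOME/rrw/downloads"
LOG_FILE="$HOME/rrw/wgetlogfile.txt"

if [ -r "$URL_LIST" ]; then
    mkdir -p "$DEST_DIR"
    # -i  read the download URLs from the list file
    # -c  resume partial files if the job is re-run
    # -P  save into DEST_DIR; -o writes wget's log to LOG_FILE
    wget -i "$URL_LIST" \
         --post-data 'user=klondrie&password=XXXX' \
         -c -P "$DEST_DIR" -o "$LOG_FILE"
else
    echo "no URL list at $URL_LIST; nothing to fetch" >&2
fi
```

A crontab entry such as 0 6 * * 1 /path/to/weekly-fetch.sh would run it every Monday at 06:00. One caveat: --post-data sends the credentials with every single request; if the site actually authenticates with a session cookie, you would first POST to the login page with --save-cookies and then fetch the list with --load-cookies.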
snarf(1) General Commands Manual snarf(1)
NAME
snarf - Simple Non-interactive All-purpose Resource Fetcher
SYNOPSIS
snarf [-anvqprzm] URL [outfile] ...
DESCRIPTION
Retrieves data from a variety of protocols, namely http, ftp, and gopher.
USAGE
snarf is invoked with any number of URLs and outfiles. If an outfile is not specified, snarf preserves the remote file name when saving.
For example, snarf http://foo.bar.com/images/face.gif will retrieve the file ``face.gif'' to the local system. In the event that there is
no filename (the url ends in a slash), the data is retrieved and stored in the file index.html for http URLs, ftpindex.txt for ftp URLs, or
gopherindex.txt for gopher URLs.
Using a dash, "-", as the outfile causes snarf to send its output to stdout rather than a file.
To log in to an ftp server or website that requires a username and password, use the syntax http://username:password@site.com/. If you omit
the password, you will be prompted for it.
Snarf has a built-in option to download the latest version of itself; simply run snarf LATEST.
OPTIONS
-a Causes snarf to use "active" ftp. By default, snarf uses passive ftp, and, if the server does not support it, falls back to active
ftp. Using the -a option will avoid the initial passive attempt.
-r Resumes an interrupted ftp or http transfer by checking if there is a local file with the same name as the remote file, and starting
the transfer at the end of the local file and continuing until finished. This option only works with HTTP servers that understand
HTTP/1.1 and ftp servers that support the REST command. snarf uses this option automatically if the outfile already exists.
-n Don't resume; ignore the outfile if it exists and re-transfer it in its entirety.
-q Don't print progress bars.
-p Forces printing of progress bars. Snarf has a compile-time option for whether progress bars print by default or not. The -p option
overrides the -q option. In addition, if progress bars are enabled by default, snarf suppresses them when standard output is not a
terminal. Using -p will override this behavior.
-v Prints all messages that come from the server to stderr.
-z Send a user-agent string similar to what Netscape Navigator 4.0 uses.
-m Send a user-agent string similar to what Microsoft Internet Explorer uses.
Each option only affects the URL that immediately follows it. To have an option affect all URLs that follow it, use an uppercase letter for
the option, e.g. -Q instead of -q.
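For instance, with hypothetical URLs (the commands are stored and echoed rather than executed, since snarf may not be installed):

```shell
# Lowercase -q silences the progress bar only for the URL right
# after it; uppercase -Q silences it for every URL that follows.
# Commands are echoed, not run (snarf may not be installed here).
QUIET_ONE='snarf -q http://example.com/a.tar.gz http://example.com/b.tar.gz'
QUIET_ALL='snarf -Q http://example.com/a.tar.gz http://example.com/b.tar.gz'
echo "$QUIET_ONE"   # bar suppressed for a.tar.gz, shown for b.tar.gz
echo "$QUIET_ALL"   # no progress bars for either file
```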
ENVIRONMENT
Snarf checks several environment variables when deciding what to use for a proxy. It checks a service-specific variable first, then
SNARF_PROXY, then PROXY.
The service-specific variables are HTTP_PROXY, FTP_PROXY, and GOPHER_PROXY.
Snarf also checks the SNARF_HTTP_USER_AGENT environment variable and will use it when reporting its user-agent string to an HTTP server. In
the same spirit, it also uses the SNARF_HTTP_REFERER environment variable to spoof a Referer to the web server.
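As a sketch, routing snarf's HTTP traffic through a local proxy and setting a custom user-agent; the addresses and agent string are examples only:

```shell
# Service-specific HTTP_PROXY wins over SNARF_PROXY, which in turn
# wins over the generic PROXY variable.
HTTP_PROXY='http://127.0.0.1:3128/'
SNARF_PROXY='http://127.0.0.1:8080/'   # fallback for ftp/gopher here
SNARF_HTTP_USER_AGENT='Mozilla/4.0 (example agent string)'
export HTTP_PROXY SNARF_PROXY SNARF_HTTP_USER_AGENT
```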
BUGS
Bugs? What bugs? If you find 'em, report 'em.
AUTHOR
Copyright (C) 2000 Zachary Beane (xach@xach.com)
17 Jun 2000 snarf(1)