11-18-2012
RE: wget -i URLs.txt
Hi Corona688,
Thanks for your post. The membership site I belong to is resell-rights-weekly.com. At the moment I just log in and click the links to download to my home computer. I want to bypass my home computer and copy each week's downloads straight to my server, since server-to-server is much faster than pulling the files down over DSL and pushing them back up. The input file is necessary because new downloads are put on the site each week. I will put that week's URLs in URLs.txt before wget runs and set it up as a cron job to run every Monday, so the files come over in a fraction of the time. I had it partially working before but cannot remember the switches I used.
Here is my next try:
wget -i URLs.txt --post-data 'user=klondrie&password=XXXX' -o wgetlogfile.txt -c
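A possible two-step variant, in case the one-liner does not carry the login across downloads: as far as I know, --post-data combined with -i sends the same POST to every URL in the list, which is probably not what the site expects. The sketch below assumes the site uses an ordinary login form that sets a session cookie. LOGIN_URL is a made-up placeholder (the real login page address is not known here), and the field names user and password are simply taken from the command above. It defaults to a dry run that only prints the wget commands; set DRY_RUN=0 to actually run them.

```shell
#!/bin/sh
# Sketch only. LOGIN_URL is hypothetical and the form field names are
# assumptions; check the site's login form for the real values.
LOGIN_URL="${LOGIN_URL:-https://example.com/login.php}"
POST_DATA="${POST_DATA:-user=klondrie&password=XXXX}"
COOKIES="${COOKIES:-cookies.txt}"
URL_LIST="${URL_LIST:-URLs.txt}"
LOGFILE="${LOGFILE:-wgetlogfile.txt}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = run them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

fetch_weekly() {
    # Step 1: log in once, discard the page, keep the session cookie.
    run wget --save-cookies "$COOKIES" --keep-session-cookies \
        --post-data "$POST_DATA" -O /dev/null -o "$LOGFILE" "$LOGIN_URL" || return 1

    # Step 2: fetch every URL in the list using that cookie.
    # -c resumes partial files, -a appends to the log from step 1.
    run wget --load-cookies "$COOKIES" -c -a "$LOGFILE" -i "$URL_LIST"
}

fetch_weekly
```

Saved as, say, fetch_weekly.sh (name and path are only examples), a crontab entry like `0 3 * * 1 DRY_RUN=0 /home/user/fetch_weekly.sh` would run it every Monday at 03:00.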
What do you think? What would you change? This should be a piece of cake. I do not see a lot of security, since I can log in and click the links to download to my computer. I need the files on my server, though.
Any more help available?