11-18-2012
Wget -i URLs.txt problem
Hi Everyone,
I have a problem with wget using an input file of URLs. When I run wget -i URLs.txt, I get the login.php pages transferred but not the files listed in URLs.txt. I need to use the input file because it will have new products to download each week, and I want my VA to fill that file with the URLs that we need to transfer to my server. Some login must be necessary, since I have to log in over HTTP with a user name and password. Is that what I need on the command line, or is there a cookie-type problem? I can log in and download the files just fine myself, but I want them on my server for faster transfer to me, and wget is not working yet.
Any help is appreciated.
Thanks,
Keith
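A hedged sketch of one possible fix, not a confirmed diagnosis: getting login.php back instead of the listed files usually means the server does not see an authenticated session. If the site uses a login form, wget can submit the form once, save the session cookie, and reuse it for the input file. The login URL and the form field names (username, password) below are assumptions; the site's real login form has to be checked for the actual values.

```shell
#!/bin/sh
# Sketch: log in once, keep the session cookie, then fetch the URL list with it.
# LOGIN_URL and the field names "username"/"password" are hypothetical.
LOGIN_URL="${LOGIN_URL:-https://example.com/login.php}"
COOKIE_JAR="${COOKIE_JAR:-cookies.txt}"
URL_LIST="${URL_LIST:-URLs.txt}"

# login USER PASS: submit the login form and store the cookies it sets.
# Values containing characters such as & or = would need URL-encoding first.
login() {
    wget --save-cookies "$COOKIE_JAR" --keep-session-cookies \
         --post-data "username=$1&password=$2" \
         -O /dev/null "$LOGIN_URL"
}

# fetch_all: download every URL in the list, sending the saved cookies.
fetch_all() {
    wget --load-cookies "$COOKIE_JAR" -i "$URL_LIST"
}
```

Typical use would be: login myuser mypass && fetch_all. If the site uses HTTP authentication rather than a form, wget --user=NAME --ask-password -i URLs.txt may be enough on its own.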
httpindex(1) General Commands Manual httpindex(1)
NAME
httpindex - HTTP front-end for SWISH++ indexer
SYNOPSIS
wget [ options ] URL... 2>&1 | httpindex [ options ]
DESCRIPTION
httpindex is a front-end for index++(1) to index files copied from remote servers using wget(1). The files (in a copy of the remote directory
structure) can be kept, deleted, or replaced with their descriptions after indexing.
OPTIONS
wget Options
The wget(1) options that are required are: -A, -nv, -r, and -x; the ones that are highly recommended are: -l, -nh, -t, and -w. (See the
EXAMPLE.)
httpindex Options
httpindex accepts the same short options as index++(1) except for -H, -I, -l, -r, -S, and -V.
The following options are unique to httpindex:
-d Replace the text of local copies of retrieved files with their descriptions after they have been indexed. This is useful to display
file descriptions in search results without having to have complete copies of the remote files thus saving filesystem space. (See
the extract_description() function in WWW(3) for details about how descriptions are extracted.)
-D Delete the local copies of retrieved files after they have been indexed. This prevents your local filesystem from filling up with
copies of remote files.
EXAMPLE
To index all HTML and text files on a remote web server keeping descriptions locally:
wget -A html,txt -linf -t2 -rxnv -nh -w2 http://www.foo.com 2>&1 |
httpindex -d -e'html:*.html,text:*.txt'
Note that you need to redirect wget(1)'s output from standard error to standard output in order to pipe it to httpindex.
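The redirection matters because a pipe carries only standard output. A minimal shell sketch of the effect follows, using a hypothetical stand-in function (fake_wget) that, like wget(1), writes its log to standard error:

```shell
#!/bin/sh
# fake_wget is a hypothetical stand-in: it writes one log line to stderr only.
fake_wget() {
    echo "saved index.html" >&2
}

# Without 2>&1 the pipe receives nothing; the line bypasses it via stderr.
without_redirect=$(fake_wget | cat)

# With 2>&1 the log line travels through the pipe to the next command.
with_redirect=$(fake_wget 2>&1 | cat)

printf 'without: [%s]\nwith: [%s]\n' "$without_redirect" "$with_redirect"
```

In the real pipeline, httpindex plays the role of cat here and reads the wget(1) log lines from its standard input.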
EXIT STATUS
Exits with a value of zero only if indexing completed successfully; non-zero otherwise.
CAVEATS
In addition to those for index++(1), httpindex does not correctly handle the use of multiple -e, -E, -m, or -M options (because the Perl
script uses the standard Getopt::Std package for processing command-line options, and that package does not support repeated options). The last of any of those options ``wins.''
The work-around is to pass multiple values, separated by commas, to a single instance of the option. For example, if you want
to do:
httpindex -e'html:*.html' -e'text:*.txt'
do this instead:
httpindex -e'html:*.html,text:*.txt'
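When the patterns come from a list, the comma-joined form can be built in the shell. This is only a sketch: join_patterns is a hypothetical helper, not part of httpindex, and the script prints the resulting command rather than running it.

```shell
#!/bin/sh
# join_patterns (hypothetical helper): print all arguments joined by commas.
# The body runs in a subshell, so the IFS change does not leak to the caller.
join_patterns() (
    IFS=,
    printf '%s\n' "$*"
)

patterns=$(join_patterns 'html:*.html' 'text:*.txt')

# Show the single-option command line that would be passed to httpindex.
echo "httpindex -e'$patterns'"
```

The same helper would apply to the -E, -m, and -M values, since the caveat covers all four options.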
SEE ALSO
index++(1), wget(1), WWW(3)
AUTHOR
Paul J. Lucas <pauljlucas@mac.com>
SWISH++ August 2, 2005 httpindex(1)