It looks like it's because they're embedded in JavaScript/Flash. I hate it when sites do that. Try these three commands:
Explanation:
After wget gets the page, the links in the raw HTML look like this:
I got the awk line from someone here and it's very useful. When it sees "theFile=", it extracts everything up to the next "&". There is some extra gibberish for some reason, but piping it to "grep mp3" gets rid of that, and every mp3 link is duplicated, so piping to uniq takes care of the duplicates. The output goes to a file called list, and "wget -i list" downloads every link in that file.
That gets 59 links; I hope that's all of them. I didn't see any wav files, so I only did it for mp3s.
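Here is a rough reconstruction of that three-command pipeline, based on the description above; the page URL is a placeholder and the exact awk field handling is an assumption:
    # 1) fetch the page (URL is a placeholder)
    wget -q -O page.html http://www.example.com/music-page
    # 2) split each line on "theFile=", keep everything before the next "&",
    #    keep only the mp3 links, and drop the duplicated entries
    awk -F'theFile=' '{ for (i = 2; i <= NF; i++) { split($i, a, "&"); print a[1] } }' page.html | grep mp3 | uniq > list
    # 3) download every link in the list file
    wget -i list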
I need to download some files from a remote server using ftp. I have ftp'd into the site. I then do an mget * to retrieve all of the data files. Everything seems to proceed normally and I am given feedback that the files were downloaded. Now if I go into the DOS Shell or Windows explorer, it list... (5 Replies)
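If the transfers really succeeded, the files may simply have landed in a different local directory, or been mangled by ASCII mode. A minimal sketch that pins both down explicitly (host, login, and paths are placeholders):
    # set the local target directory and binary mode before mget
    ftp -inv remote.host <<'EOF'
    user myuser mypass
    lcd /tmp/downloads
    binary
    mget *
    bye
    EOF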
I am doing an ftp of around 1010 files and I am using mput for this. For some reason it's only transferring 10 or 20 files and the rest are not getting transferred. There is some socket error in the log. Is there an issue if we have more than 50 or so files for mput?
Here is the output in the log... (2 Replies)
Hi everybody, I would greatly appreciate some expertise in this matter. I am trying to find an efficient way to batch download files from a website and rename each file with the URL it originated from (from the CLI). (i.e. instead of xyz.zip, the output file would be http://www.abc.com/xyz.zip) A... (10 Replies)
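A minimal sketch, assuming the URLs are listed one per line in urls.txt; since "/" cannot appear in a filename, the slashes are replaced with underscores here, which is an assumption about what the poster would accept:
    # download each URL and name the file after the URL itself,
    # with "/" swapped for "_" so it is a legal filename
    while read -r url; do
        wget -O "$(printf '%s' "$url" | tr '/' '_')" "$url"
    done < urls.txt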
Hello,
I have set up a Cherokee web server and PHP 5.2 in an OpenSolaris zone. The problem is that all .php files are downloaded from the web server instead of being served when I use the IP address instead of the DNS name in the web browser.
Example: test.mydomain.com <-- php works
192.168.0.10/index.php <--... (3 Replies)
Hi,
I need to basically get a list of all the tarballs located at uri.
I am currently doing a wget on uri to get the index.html page.
Now this index page contains the list of uris that I want to use in my bash script.
Can someone please guide me?
I am new to Linux and shell scripting.
... (5 Replies)
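A minimal sketch for the question above, assuming the index page lists the tarballs as href attributes and that they end in .tar.gz (both assumptions):
    # pull the index page and strip out just the tarball links
    wget -q -O- "$uri" | grep -o 'href="[^"]*\.tar\.gz"' | sed 's/^href="//; s/"$//' > tarball_list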
Hello All,
I have gone through Google and came to know that we can download images from a site using wget.
Now I have been asked to check whether an image is present on a site or not. If yes, send that image to an address as an attachment.
Say for example, the site is Wiki -... (6 Replies)
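A minimal sketch, assuming a mailx that supports -a for attachments; the image URL and the address are placeholders:
    # --spider checks that the image exists without downloading it
    if wget -q --spider "http://example.com/images/logo.png"; then
        wget -q -O /tmp/logo.png "http://example.com/images/logo.png"
        echo "Image found on site" | mailx -s "site image" -a /tmp/logo.png user@example.com
    fi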
Hi,
I need to implement the logic below to download files daily from a URL (a rough sketch follows the list).
* Need to check if it is yesterday's file (YYYY-DD-MM.dat)
* If present then download from URL (sample_url/2013-01-28.dat)
* Need to implement wait logic if not present
* If it is still not able to find the file... (1 Reply)
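A minimal sketch of the check-and-wait loop, assuming GNU date and a 10-minute retry interval (both assumptions); the YYYY-DD-MM format is taken from the list above:
    # yesterday's file name in the YYYY-DD-MM.dat form described above
    fname="$(date -d yesterday +%Y-%d-%m).dat"
    # keep retrying until the file appears on the server
    until wget -q "http://sample_url/$fname"; do
        sleep 600    # wait 10 minutes before checking again
    done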
hello.
How can I detect within a script that the downloaded file does not have the correct size?
linux:~ # wget --limit-rate=20k --ignore-length -O /Software_Downloaded/MULTIMEDIA_ADDON/skype-4.1.0.20-suse.i586.rpm ... (6 Replies)
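One hedged approach: compare the server's Content-Length header with the local file size. Note that --ignore-length above tells wget to disregard that header, so the check has to be done separately, and it only works when the server actually sends Content-Length; the URL here is a placeholder:
    url="http://example.com/skype-4.1.0.20-suse.i586.rpm"   # placeholder URL
    file=/Software_Downloaded/MULTIMEDIA_ADDON/skype-4.1.0.20-suse.i586.rpm
    # --spider fetches only the headers; --server-response prints them on stderr
    expected=$(wget --spider --server-response "$url" 2>&1 | awk '/Content-Length/ { print $2 }' | tail -1)
    actual=$(stat -c %s "$file")   # GNU stat: size in bytes
    [ "$expected" = "$actual" ] || echo "size mismatch: expected $expected, got $actual"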
Need assistance in writing a for loop script or any looping method. Below is the code where I can get all the files from the URL. There are about 80 files in the URL. Every day the files get updated. The script that I want is a loop that must keep on running till it gets 80 files. It matches the count... (5 Replies)
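A minimal sketch of such a loop, assuming the files land in a downloads directory and that the URL is a placeholder:
    # keep pulling until all 80 files have arrived
    while [ "$(ls downloads | wc -l)" -lt 80 ]; do
        wget -q -r -nd -np -P downloads "http://example.com/daily/"
        sleep 60    # pause between attempts
    done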
Hello,
I have a server that I have to ftp files off, and they all start with SGRD followed by 6 numbers.
SGRD000001
SGRD000002
SGRD000003
The script I have will run every 10 mins to pick up files as new ones will be coming in all the time, and what I want to do is delete the files I have... (7 Replies)
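A minimal sketch, with host and login as placeholders; running mdelete right after mget is a simplification, since a file arriving between the two commands would be removed without being fetched:
    # fetch all SGRD?????? files, then remove them on the server so the
    # next 10-minute run only sees new arrivals
    ftp -inv remote.host <<'EOF'
    user myuser mypass
    binary
    mget SGRD*
    mdelete SGRD*
    bye
    EOF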
LEARN ABOUT DEBIAN
httpindex(1)                         General Commands Manual                         httpindex(1)
NAME
httpindex - HTTP front-end for SWISH++ indexer
SYNOPSIS
wget [ options ] URL... 2>&1 | httpindex [ options ]
DESCRIPTION
httpindex is a front-end for index++(1) to index files copied from remote servers using wget(1). The files (in a copy of the remote directory structure) can be kept, deleted, or replaced with their descriptions after indexing.
OPTIONS
wget Options
The wget(1) options that are required are: -A, -nv, -r, and -x; the ones that are highly recommended are: -l, -nh, -t, and -w. (See the EXAMPLE.)
httpindex Options
httpindex accepts the same short options as index++(1) except for -H, -I, -l, -r, -S, and -V.
The following options are unique to httpindex:
-d     Replace the text of local copies of retrieved files with their descriptions after they have been indexed. This is useful to display file descriptions in search results without having to keep complete copies of the remote files, thus saving filesystem space. (See the extract_description() function in WWW(3) for details about how descriptions are extracted.)
-D     Delete the local copies of retrieved files after they have been indexed. This prevents your local filesystem from filling up with copies of remote files.
EXAMPLE
To index all HTML and text files on a remote web server keeping descriptions locally:
wget -A html,txt -linf -t2 -rxnv -nh -w2 http://www.foo.com 2>&1 |
httpindex -d -e'html:*.html,text:*.txt'
Note that you need to redirect wget(1)'s output from standard error to standard output in order to pipe it to httpindex.
EXIT STATUS
Exits with a value of zero only if indexing completed successfully; non-zero otherwise.
CAVEATS
In addition to the caveats for index++(1), httpindex does not correctly handle the use of multiple -e, -E, -m, or -M options (because the Perl script uses the standard Getopt::Std package for processing command-line options, which doesn't support repeated options). The last of any of those options "wins."
The work-around is to pass multiple values, separated by commas, to a single instance of one of those options. For example, if you want
to do:
httpindex -e'html:*.html' -e'text:*.txt'
do this instead:
httpindex -e'html:*.html,text:*.txt'
SEE ALSO
index++(1), wget(1), WWW(3)
AUTHOR
Paul J. Lucas <pauljlucas@mac.com>
SWISH++ August 2, 2005 httpindex(1)