Shell Programming and Scripting: help with wget and 404 errors
Post 302687515 by problemss on Thursday 16th of August 2012 05:19:48 PM
@fpmurphy do you know of anything that converts a dynamic page to a static one and stores it locally? I want a local copy because 1) the site causes timeouts in some of my scripts, and 2) I do not want my scripts to overload the server or cause extra traffic on it.

@Corona688 I added the user agent, but already had a referer. It still fails. My query looks like this:

Code:
wget --directory-prefix=/Users/problemss/Desktop --proxy=off -Q0 --user-agent="Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)" --passive-ftp --header="REFERER:http://accuscore.com" -k -r -l2 --progress=dot:binary http://accuscore.com/fantasy-sports/nfl-fantasy-sports/Current-Week-DEF-ST

The response is:

Code:
Resolving accuscore.com... 184.106.172.20
Connecting to accuscore.com|184.106.172.20|:80... connected.
HTTP request sent, awaiting response... 404 NOT FOUND
2012-08-16 14:13:25 ERROR 404: NOT FOUND.
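
One thing that can be ruled out without touching the server is shell quoting: the user-agent string contains spaces, parentheses, and a semicolon, so it has to reach wget as a single argument. Below is a minimal sketch, assuming a POSIX shell and reusing the user agent, referer, and URL from the post. It only builds the argument list and prints it, one argument per line; nothing is fetched. Quoting may well not be the cause of the 404, so treat this as a sanity check rather than a fix.

```shell
# Sketch only: build the wget arguments from the post and print them,
# one per line, to confirm each option arrives as a single argument.
# No network request is made here.
ua='Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)'
url='http://accuscore.com/fantasy-sports/nfl-fantasy-sports/Current-Week-DEF-ST'

# Load the options into the positional parameters ($1, $2, ...).
set -- \
  --user-agent="$ua" \
  --header='Referer: http://accuscore.com' \
  -k -r -l2 \
  "$url"

# Each line of output is exactly one argument as wget would receive it.
printf '%s\n' "$@"

# To send the real request with these arguments: wget "$@"
```

If the user agent shows up on one line in the output, the shell is passing it through intact and the 404 has to come from somewhere else.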

 
