Full Discussion: Newbie Python Url Scraper
Post 302816787 by Corona688 on Tuesday 4th of June 2013 02:44:17 PM
Try removing --spider and see what you get. An embedded HTTP server may behave oddly when it receives a request it wasn't expecting, such as a check for a file's existence instead of an actual download.
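
To see how the server treats the two kinds of request, something like the following could help. It is only a sketch using Python's standard urllib: it sends a HEAD request (which, as far as I know, is what wget --spider typically issues for a plain existence check) and then an ordinary GET to the same URL, and reports the status code and body length for each. The function names are made up for illustration, and the URL you pass in is whatever your device actually serves.

```python
import urllib.error
import urllib.request


def fetch_status(url, method="GET", timeout=10):
    """Return (status_code, body_length) for one request using the given method."""
    request = urllib.request.Request(url, method=method)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status, len(response.read())
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses arrive as exceptions; report the code instead.
        return err.code, 0


def compare_head_and_get(url):
    """Show how the server answers an existence check versus a real download."""
    return {
        "HEAD": fetch_status(url, "HEAD"),
        "GET": fetch_status(url, "GET"),
    }
```

If HEAD comes back as an error, or with a different status than GET, the server is probably mishandling the existence check, and dropping --spider is the thing to try.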

There's no generic way to discover every file on a web server. If no page links to a file, a crawler has nothing to follow to find it.
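
In practice, the only files a scraper can find are the ones some page points at. A minimal sketch of that idea, using only Python's standard library: parse the HTML of a page you already have and collect every href and src it mentions, resolved against the page's URL. The class and function names here are hypothetical, and any URL you feed it is a placeholder for your own.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the targets of href and src attributes as absolute URLs."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            # Skip attributes without a value, e.g. a bare <a> tag.
            if name in ("href", "src") and value:
                self.links.append(urljoin(self.base_url, value))


def extract_links(html_text, base_url):
    """Return every linked URL found in html_text, in document order."""
    collector = LinkCollector(base_url)
    collector.feed(html_text)
    return collector.links
```

Anything the page doesn't reference will not show up in the result, which is the limitation described above.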

What page would you be accessing it from if you used an ordinary web browser?

Last edited by Corona688; 06-04-2013 at 03:56 PM.
 

Web::Scraper::Filter(3pm)				User Contributed Perl Documentation				 Web::Scraper::Filter(3pm)

NAME
       Web::Scraper::Filter - Base class for Web::Scraper filters

SYNOPSIS
         package Web::Scraper::Filter::YAML;
         use base qw( Web::Scraper::Filter );
         use YAML ();

         sub filter {
             my($self, $value) = @_;
             YAML::Load($value);
         }

         1;

         use Web::Scraper;

         my $scraper = scraper {
             process ".yaml-code", data => [ 'TEXT', 'YAML' ];
         };

DESCRIPTION
       Web::Scraper::Filter is a base class for text filters in Web::Scraper. You can create your own text filter by subclassing this module.

       There are two ways to create and use your custom filter.

       If you name your filter Web::Scraper::Filter::Something, you just call:

         process $exp, $key => [ 'TEXT', 'Something' ];

       If you declare your filter under your own namespace, like 'MyApp::Filter::Foo', you call:

         process $exp, $key => [ 'TEXT', '+MyApp::Filter::Foo' ];

       You can also inline your filter function without creating a filter class:

         process $exp, $key => [ 'TEXT', sub { s/foo/bar/ } ];

       Note that this function munges $_ and returns the count of replacements. The filter code special-cases a callback whose return value is a number and whose $_ value has been updated.

       You can, of course, stack filters like:

         process $exp, $key => [ '@href', 'Foo', '+MyApp::Filter::Bar', \&baz ];

AUTHOR
       Tatsuhiko Miyagawa

perl v5.14.2                          2009-03-24                          Web::Scraper::Filter(3pm)