Post 302816739 by metallica1973 on Tuesday 4th of June 2013 01:12:37 PM
Newbie Python Url Scraper

I set up ZoneMinder and have been playing around with a couple of Wanscam PTZ IP cameras, but I keep running into roadblocks with streaming and so on. I can't find much information on the camera or on the webserver that runs on it, so I wanted to get the absolute directory structure of the camera's webserver. I tried using:
Code:
wget --spider -r 192.168.3.3:80
Spider mode enabled. Check if remote file exists.
--2013-06-04 13:00:49--  (try: 5)  http://192.168.3.3/
Connecting to 192.168.3.3:80... connected.
HTTP request sent, awaiting response... Read error (Connection reset by peer) in headers.
Retrying.

Spider mode enabled. Check if remote file exists.
--2013-06-04 13:00:54--  (try: 6)  http://192.168.3.3/
Connecting to 192.168.3.3:80... connected.
HTTP request sent, awaiting response... Read error (Connection reset by peer) in headers.
Retrying.

Spider mode enabled. Check if remote file exists.
--2013-06-04 13:01:00--  (try: 7)  http://192.168.3.3/
Connecting to 192.168.3.3:80... connected.
HTTP request sent, awaiting response... Read error (Connection reset by peer) in headers.
Retrying.

Spider mode enabled. Check if remote file exists.
--2013-06-04 13:01:07--  (try: 8)  http://192.168.3.3/
Connecting to 192.168.3.3:80... connected.
HTTP request sent, awaiting response... Read error (Connection reset by peer) in headers.
Retrying.

but it doesn't find a thing. I know the camera has a webserver listening on TCP port 80 because I can view the camera through it. I have been attempting to use Python's "scrapy", but I can't understand how to tell it to crawl and discover the directory structure, as opposed to just telling it where to start looking. This is what I have so far:
Code:
#!/usr/bin/env python
# encoding=utf-8

from scrapy.spider import BaseSpider
from scrapy.http import Request
from scrapy.http import FormRequest
from scrapy.selector import HtmlXPathSelector
from scrapy import log
import sys
### Kludge to set default encoding to utf-8 (Python 2 only)
reload(sys)
sys.setdefaultencoding('utf-8')

class PTZcamera(BaseSpider):
    name = "camera"
    # allowed_domains takes bare host names, not URLs with a scheme or port
    allowed_domains = ["192.168.3.3"]
    # the crawl starts here; with no start URL the spider requests nothing
    start_urls = ["http://192.168.3.3/"]

    def parse(self, response):
        # placeholder: no links are extracted or followed yet
        pass

but it doesn't produce much. I would like output that lists the absolute URL of every path on the webserver, like:
Code:
http://192.168.3.3/cgi-bin/blah
http://192.168.3.3/cgi-bin/blah2
http://192.168.3.3/video/blah1
http://192.168.3.3/video/blah2
...
...
...

Can someone point me in the correct direction?
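
One possible direction, as a rough sketch rather than a tested solution for this camera: a crawler can only report URLs that some fetched page actually links to, so it cannot list files the webserver never references. The Python 3 script below uses only the standard library instead of Scrapy (my substitution, to keep the example self-contained). It follows `href` and `src` attributes breadth-first and prints each same-host URL once. The names `LinkParser`, `extract_links`, `crawl` and the `MAX_PAGES` cap are made up for illustration. It sends ordinary GET requests; whether the camera answers those where `wget --spider` got resets is an assumption I have not checked.

```python
#!/usr/bin/env python3
"""Breadth-first crawl of one host, printing every same-host URL found."""
import sys
from collections import deque
from html.parser import HTMLParser
from http.client import HTTPException
from urllib.parse import urldefrag, urljoin, urlparse
from urllib.request import urlopen

MAX_PAGES = 200  # hypothetical safety cap so a link loop cannot run forever


class LinkParser(HTMLParser):
    """Collect the raw values of href and src attributes from any tag."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append(value)


def extract_links(html, base_url):
    """Return the sorted absolute URLs in html that share base_url's host."""
    parser = LinkParser()
    parser.feed(html)
    host = urlparse(base_url).hostname
    found = set()
    for link in parser.links:
        # Resolve relative links and drop any "#fragment" part.
        absolute, _fragment = urldefrag(urljoin(base_url, link))
        parts = urlparse(absolute)
        if parts.scheme in ("http", "https") and parts.hostname == host:
            found.add(absolute)
    return sorted(found)


def crawl(start_url, max_pages=MAX_PAGES):
    """Yield each discovered same-host URL once, in breadth-first order."""
    seen = {start_url}
    queue = deque([start_url])
    visited = 0
    while queue and visited < max_pages:
        url = queue.popleft()
        visited += 1
        yield url
        try:
            with urlopen(url, timeout=10) as response:
                content_type = response.headers.get("Content-Type", "")
                if "html" not in content_type:
                    continue  # do not download images or video streams
                body = response.read().decode("utf-8", errors="replace")
        except (OSError, ValueError, HTTPException) as error:
            print("could not fetch %s: %s" % (url, error), file=sys.stderr)
            continue
        for link in extract_links(body, url):
            if link not in seen:
                seen.add(link)
                queue.append(link)


if __name__ == "__main__" and len(sys.argv) > 1:
    for found_url in crawl(sys.argv[1]):
        print(found_url)
```

Saved under a name of your choosing (say `crawl.py`, hypothetical), it would be run as `python3 crawl.py http://192.168.3.3/`. If you stay with Scrapy, the same idea applies: `parse` has to extract the links from each response and yield a new request for every one it has not seen, otherwise the spider stops after the start page.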
 
