Sponsored Content
Top Forums Shell Programming and Scripting write page source to standard output Post 302216328 by wxornot on Friday 18th of July 2008 04:40:22 PM
Old 07-18-2008
write page source to standard output

I'm new to PERL, but I want to take the page source and write it to a file or standard output. I used perl.org as a test website. Here is the script:

use strict;
use warnings;
use LWP::Simple;

getprint('http://www.perl.org') or die 'Unable to get page';

exit 0;

When I run this, I get the following error:

"500 Can't connect to The Perl Directory - perl.org (Bad hostname 'www.perl.org') <URL:http://www.perl.org>

Bear in mind I'm a Perl newbie, so I'm sure I'm missing something basic.

thanks in advance,
 

8 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Getting source code of a page

I want to download a particular page from the internet and get the source code of the page in html format. I want to parse the source code to find a specific parameters using grep command. could someone tell me the linux command to download a specific page and parse the source code of it. ... (1 Reply)
Discussion started by: ahamed
1 Replies

2. UNIX for Dummies Questions & Answers

Redirect Standard output and standard error into spreadsheet

Hey, I'm completely new at this and I was wondering if there is a way that I would be able to redirect the log files in a directories standard output and standard error into and excel spreadsheet in anyway? Please remember don't use too advanced of terminology as I just started using shell... (6 Replies)
Discussion started by: killaram
6 Replies

3. Shell Programming and Scripting

sed command works from cmd line to standard output but will not write to file

Hi all .... vexing problem here ... I am using sed to replace some special characters in a .txt file: sed -e 's/_<ED>_/_355_/g;s/_<F3>_/_363_/g;s/_<E1>_/_341_/g' filename.txt This command replaces <ED> with í , <F3> with ó and <E1> with á. When I run the command to standard output, it works... (1 Reply)
Discussion started by: crumplecrap
1 Replies

4. UNIX for Dummies Questions & Answers

display command output page per page

Good afternoon, I wonder how i could use unix commands to ease the reading of long command result output ? like the "php -i" or any other command that returns a long answer. I could not find the right terms to Google it or search the forum. Therefore I bother you with this question. ... (3 Replies)
Discussion started by: Mat_k
3 Replies

5. Shell Programming and Scripting

web page source cleanup

is it possible to pass webpages to remove all tag style information, but leave the tag... say I have <h1 style='font-size: xxx; color: xxxxxx'>headline 1</h1> i want to get <h1>headline 1</h1> BTW, i got an oneliner here to remove all tags: sed -n '/^$/!{s/<*>//g;p; Thanks a... (4 Replies)
Discussion started by: dtdt
4 Replies

6. Shell Programming and Scripting

Performing extractions on web source page

I have downloaded a web source page to a file. I then egrep a single word to extract a line containing it to another file. I then cat the second file and remove everything before a word and after a second word to capture the phrase desired. This did not work. I used vi to validate that the 2... (1 Reply)
Discussion started by: slak0
1 Replies

7. Shell Programming and Scripting

Save page source, including javascript

I need to get the source code of a webpage. I have tried to use wget and curl, but it doesn't show the necessary javascript part of the source. I don't have to execute it, only to view the source. How do I do that? (1 Reply)
Discussion started by: locoroco
1 Replies

8. Red Hat

Command understanding the output file destination in case of standard output!!!!!

I ran the following command. cat abc.c > abc.c I got message the following message from command cat: cat: abc.c : input file is same as the output file How the command came to know of the destination file name as the command is sending output to standard file. (3 Replies)
Discussion started by: ravisingh
3 Replies
LWP::Simple(3pm)					User Contributed Perl Documentation					  LWP::Simple(3pm)

NAME
LWP::Simple - simple procedural interface to LWP SYNOPSIS
perl -MLWP::Simple -e 'getprint "http://www.sn.no"' use LWP::Simple; $content = get("http://www.sn.no/"); die "Couldn't get it!" unless defined $content; if (mirror("http://www.sn.no/", "foo") == RC_NOT_MODIFIED) { ... } if (is_success(getprint("http://www.sn.no/"))) { ... } DESCRIPTION
This module is meant for people who want a simplified view of the libwww-perl library. It should also be suitable for one-liners. If you need more control or access to the header fields in the requests sent and responses received, then you should use the full object-oriented interface provided by the "LWP::UserAgent" module. The following functions are provided (and exported) by this module: get($url) The get() function will fetch the document identified by the given URL and return it. It returns "undef" if it fails. The $url argument can be either a string or a reference to a URI object. You will not be able to examine the response code or response headers (like 'Content-Type') when you are accessing the web using this function. If you need that information you should use the full OO interface (see LWP::UserAgent). head($url) Get document headers. Returns the following 5 values if successful: ($content_type, $document_length, $modified_time, $expires, $server) Returns an empty list if it fails. In scalar context returns TRUE if successful. getprint($url) Get and print a document identified by a URL. The document is printed to the selected default filehandle for output (normally STDOUT) as data is received from the network. If the request fails, then the status code and message are printed on STDERR. The return value is the HTTP response code. getstore($url, $file) Gets a document identified by a URL and stores it in the file. The return value is the HTTP response code. mirror($url, $file) Get and store a document identified by a URL, using If-modified-since, and checking the Content-Length. Returns the HTTP response code. This module also exports the HTTP::Status constants and procedures. You can use them when you check the response code from getprint(), getstore() or mirror(). The constants are: RC_CONTINUE RC_SWITCHING_PROTOCOLS RC_OK RC_CREATED RC_ACCEPTED RC_NON_AUTHORITATIVE_INFORMATION RC_NO_CONTENT RC_RESET_CONTENT RC_PARTIAL_CONTENT RC_MULTIPLE_CHOICES RC_MOVED_PERMANENTLY RC_MOVED_TEMPORARILY RC_SEE_OTHER RC_NOT_MODIFIED RC_USE_PROXY RC_BAD_REQUEST RC_UNAUTHORIZED RC_PAYMENT_REQUIRED RC_FORBIDDEN RC_NOT_FOUND RC_METHOD_NOT_ALLOWED RC_NOT_ACCEPTABLE RC_PROXY_AUTHENTICATION_REQUIRED RC_REQUEST_TIMEOUT RC_CONFLICT RC_GONE RC_LENGTH_REQUIRED RC_PRECONDITION_FAILED RC_REQUEST_ENTITY_TOO_LARGE RC_REQUEST_URI_TOO_LARGE RC_UNSUPPORTED_MEDIA_TYPE RC_INTERNAL_SERVER_ERROR RC_NOT_IMPLEMENTED RC_BAD_GATEWAY RC_SERVICE_UNAVAILABLE RC_GATEWAY_TIMEOUT RC_HTTP_VERSION_NOT_SUPPORTED The HTTP::Status classification functions are: is_success($rc) True if response code indicated a successful request. is_error($rc) True if response code indicated that an error occurred. The module will also export the LWP::UserAgent object as $ua if you ask for it explicitly. The user agent created by this module will identify itself as "LWP::Simple/#.##" and will initialize its proxy defaults from the environment (by calling $ua->env_proxy). CAVEAT
Note that if you are using both LWP::Simple and the very popular CGI.pm module, you may be importing a "head" function from each module, producing a warning like "Prototype mismatch: sub main::head ($) vs none". Get around this problem by just not importing LWP::Simple's "head" function, like so: use LWP::Simple qw(!head); use CGI qw(:standard); # then only CGI.pm defines a head() Then if you do need LWP::Simple's "head" function, you can just call it as "LWP::Simple::head($url)". SEE ALSO
LWP, lwpcook, LWP::UserAgent, HTTP::Status, lwp-request, lwp-mirror perl v5.14.2 2012-02-18 LWP::Simple(3pm)
All times are GMT -4. The time now is 09:44 PM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy