Shell Programming and Scripting: Script to scrape page for and save data
Post 302870105 by molwiko on Friday 1st of November 2013 07:56:19 AM
Actually, I am working on a Joomla module about cars and would like some real data to store. The structure of my DB is different; I just want the data, so I don't see why there would be any permission issue with that.
 

9 More Discussions You Might Find Interesting

1. UNIX for Advanced & Expert Users

How to save as image from a web page

I used flot to create a graph and I would like to be able to save/export the graph as an image. In firefox on windows you can just ctl rt-click and you have a save as image feature (which I can automate with js) but...I need this to work on a linux browser. On linux in firefox I can print preview... (11 Replies)
Discussion started by: vincaStar
11 Replies

2. Shell Programming and Scripting

How to pass data from server (CGI script) to client (html page)

Hi I know how to pass data from client side (html file) to server using CGI script (POST method). I also know how to re-create the html page from server side after receiving the data (using printf). However I want to write static pages on client side (only the structure), and only to pass... (0 Replies)
Discussion started by: naamabm
0 Replies

3. Shell Programming and Scripting

Save page source, including javascript

I need to get the source code of a webpage. I have tried to use wget and curl, but it doesn't show the necessary javascript part of the source. I don't have to execute it, only to view the source. How do I do that? (1 Reply)
Discussion started by: locoroco
1 Reply

4. Shell Programming and Scripting

Get Permissions and save to data

Hi all; I have the following code which gives me kind of what I need: #!/usr/bin/perl use Fcntl ':mode'; # if ($ARGV ne "") { $filename = $ARGV; } else { print "Please specify a file!\n"; exit; } # if... (2 Replies)
Discussion started by: gvolpini
2 Replies

5. Shell Programming and Scripting

script for adding page number before page breaks

Hi, If there is an expert that can help: I have many txt files that are produced from pdftotext that include page breaks the page breaks seem to be unix style hex 0C. I want to add page numbers before each page break as in : Page XXXX Regards antman (9 Replies)
Discussion started by: antman
9 Replies

6. Shell Programming and Scripting

Open Page and save it using mozilla

HI Guys, I have one command which can open page and i want to save and exit from it. pf@home> mozilla 181.131.193.10/g/report.txt It will open one page now how can i save it. Thanks (1 Reply)
Discussion started by: pareshkp
1 Reply

7. Shell Programming and Scripting

Scrape 10 million pages and save the raw html data in mysql database

I have a list of 10 million page urls. I want those pages scraped and saved in the mysql database as raw html. I own a Linux VPS server with 1GB RAM and WHM/cPanel. I would like to scrape at least 100,000 urls in 24 hours. So can anyone give me some sample shell scripting code? (4 Replies)
Discussion started by: Viruthagiri
4 Replies

8. UNIX for Dummies Questions & Answers

Get a data and save

If I have a A.log 1 Air Flow Monitor : 34.070 Degrees C 2 Air Flow Monitor : 41.730 Degrees C 3 Air Flow Monitor : 35.340 Degrees C 4 Air Flow Monitor : 33.370 Degrees C 5 Air Flow Monitor : 36.770 Degrees C 6 Air Flow Monitor : 45.910 Degrees C 7 Air Flow Monitor ... (1 Reply)
Discussion started by: sabercats
1 Reply

9. Shell Programming and Scripting

Run sql query in shell script and output data save as delimited text

I want to run sql query in shell script and output data save as delimited text (delimited text would be comma) Code: SPOOL_FILE=/pgedw/dan.txt SQL=/pgedw/dan.sql sqlplus -s username/password@myhost:port/servicename <<EOF set head on set COLSEP , set linesize 32767 SET TRIMSPOOL ON SET... (8 Replies)
Discussion started by: Jaganjag
8 Replies
HTTP::Recorder(3pm)					User Contributed Perl Documentation				       HTTP::Recorder(3pm)

NAME
       HTTP::Recorder - record interaction with websites

SYNOPSIS
   Using HTTP::Recorder as a Web Proxy
       Set HTTP::Recorder as the user agent for a proxy, and it rewrites HTTP responses so that additional requests can be recorded.

   The Proxy Script
       For a quick start, run the httprecorder script:

           httprecorder

       This will open a local proxy on port 8080, and will dump the recorded traffic to a file named http_traffic in the current directory. Use the -help parameter for usage info.

       Start the proxy script, then change the settings in your web browser so that it will use this proxy for web requests. For more information about proxy settings and the default port, see HTTP::Proxy.

       The script will be recorded in the specified file, and can be viewed and modified via the control panel.

       For better control, use this example:

           #!/usr/bin/perl

           use HTTP::Proxy;
           use HTTP::Recorder;

           my $proxy = HTTP::Proxy->new();

           # create a new HTTP::Recorder object
           my $agent = new HTTP::Recorder;

           # set the log file (optional)
           $agent->file("/tmp/myfile");

           # set HTTP::Recorder as the agent for the proxy
           $proxy->agent( $agent );

           # start the proxy
           $proxy->start();

   Start Recording
       Now you can use your browser as you normally would, and your actions will be recorded in the file you specified. Alternatively, you can start recording from the Control Panel.

   Using the Control Panel
       If you have Javascript enabled in your browser, go to the HTTP::Recorder control URL (http://http-recorder by default), optionally type a URL into the "Goto page" field, and click "Go".

       In the new window, interact with web sites as you normally do, including typing a new address into the address field. The Control Panel will be updated after each recorded action.

       The Control Panel allows you to modify, delete, or save your script.

   SSL sessions
       As of version 0.03, HTTP::Recorder can record SSL sessions.

       To begin recording an SSL session, go to the control URL (http://http-recorder/ by default), and enter the initial URL. Then, interact with the web site as usual.
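The proxy setup described under "The Proxy Script" targets a web browser. For command-line clients, a rough equivalent is to export the conventional http_proxy environment variable, which tools such as curl and wget honor. The following is only a minimal sketch: it assumes the proxy runs on localhost with the default port 8080 mentioned above, and the function names recorder_proxy_url and use_recorder_proxy are hypothetical helpers, not part of HTTP::Recorder. Whether requests from non-browser clients end up in the recorded script is not stated in this documentation.

```shell
#!/bin/sh
# Hypothetical helpers (not part of HTTP::Recorder): point command-line
# HTTP clients at a locally running recording proxy.

# Print the proxy URL. Host defaults to localhost (an assumption) and
# port defaults to 8080 (the httprecorder default described above).
recorder_proxy_url() {
    host="${1:-localhost}"
    port="${2:-8080}"
    printf 'http://%s:%s\n' "$host" "$port"
}

# Export http_proxy so that clients honoring it (e.g. curl, wget)
# send plain HTTP requests through the proxy.
use_recorder_proxy() {
    http_proxy="$(recorder_proxy_url "$@")"
    export http_proxy
}

# Usage (left commented out: it needs a running proxy and network access):
#   use_recorder_proxy
#   curl -s http://www.example.com/ >/dev/null
```

Running `unset http_proxy` afterwards returns the shell to direct connections.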
   Script output
       By default, HTTP::Recorder outputs WWW::Mechanize scripts.

       However, you can override HTTP::Recorder::Logger to output other types of scripts.

Functions
   new
       Creates and returns a new HTTP::Recorder object, referred to as the 'agent'.

   $agent->prefix([$value])
       Get or set the prefix string that HTTP::Recorder uses for rewriting responses.

   $agent->control([$value])
       Get or set the URL of the control panel. By default, the control URL is 'http-recorder'.

       The control URL will display a control panel which will allow you to view and edit the current script.

   $agent->logger([$value])
       Get or set the logger object. The default logger is a HTTP::Recorder::Logger, which generates WWW::Mechanize scripts.

   $agent->ignore_favicon([0|1])
       Get or set the ignore_favicon flag that causes HTTP::Recorder to skip logging requests for favicon.ico files. The value is 1 by default.

   $agent->file([$value])
       Get or set the filename for generated scripts. The default is '/tmp/scriptfile'.

Bugs, Missing Features, and other Oddities
   Javascript
       WWW::Mechanize can't play back Javascript actions, and HTTP::Recorder doesn't record them.

   Why are my images corrupted?
       HTTP::Recorder only tries to rewrite responses that are of type text/*, which it determines by reading the Content-Type header of the HTTP::Response object. However, if the received image gives the wrong Content-Type header, it may be corrupted by the recorder. While this may not be pleasant to look at, it shouldn't have an effect on your recording session.

See Also
       See also LWP::UserAgent, WWW::Mechanize, HTTP::Proxy.

Requests & Bugs
       Please submit any feature requests, suggestions, bugs, or patches at http://rt.cpan.org/, or email to bug-HTTP-Recorder@rt.cpan.org.

       If you're submitting a bug of the type "X doesn't record correctly," be sure to include a (preferably short and simple) HTML page that demonstrates the problem, and a clear explanation of a) what it does that it shouldn't, and b) what it should do instead.
Author
       Copyright 2003-2005 by Linda Julien <leira@cpan.org>

       Maintained by Shmuel Fomberg <semuelf@cpan.org>

       Released under the GNU Public License.

perl v5.14.2						    2012-04-23					       HTTP::Recorder(3pm)
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.