Script to scrape page for and save data
Post 302870105 by molwiko on Friday 1st of November 2013 07:56:19 AM
Actually, I am working on a Joomla module about cars, and I would like some real data to store in it. The structure of my DB is different; I just want the data. I don't see why there would be any permission issue with that.
 

9 More Discussions You Might Find Interesting

1. UNIX for Advanced & Expert Users

How to save as image from a web page

I used flot to create a graph and I would like to be able to save/export the graph as an image. In firefox on windows you can just ctl rt-click and you have a save as image feature (which I can automate with js) but...I need this to work on a linux browser. On linux in firefox I can print preview... (11 Replies)
Discussion started by: vincaStar
11 Replies

2. Shell Programming and Scripting

How to pass data from server (CGI script) to client (html page)

Hi I know how to pass data from client side (html file) to server using CGI script (POST method). I also know how to re-create the html page from server side after receiving the data (using printf). However I want to write static pages on client side (only the structure), and only to pass... (0 Replies)
Discussion started by: naamabm
0 Replies

3. Shell Programming and Scripting

Save page source, including javascript

I need to get the source code of a webpage. I have tried to use wget and curl, but it doesn't show the necessary javascript part of the source. I don't have to execute it, only to view the source. How do I do that? (1 Reply)
Discussion started by: locoroco
1 Reply

4. Shell Programming and Scripting

Get Permissions and save to data

Hi all; I have the following code which gives me kind of what I need: #!/usr/bin/perl use Fcntl ':mode'; # if ($ARGV ne "") { $filename = $ARGV; } else { print "Please specify a file!\n"; exit; } # if... (2 Replies)
Discussion started by: gvolpini
2 Replies

5. Shell Programming and Scripting

script for adding page number before page breaks

Hi, If there is an expert that can help: I have many txt files that are produced from pdftotext that include page breaks the page breaks seem to be unix style hex 0C. I want to add page numbers before each page break as in : Page XXXX Regards antman (9 Replies)
Discussion started by: antman
9 Replies

6. Shell Programming and Scripting

Open Page and save it using mozilla

HI Guys, I have one command which can open page and i want to save and exit from it. pf@home> mozilla 181.131.193.10/g/report.txt It will open one page now how can i save it. Thanks (1 Reply)
Discussion started by: pareshkp
1 Reply

7. Shell Programming and Scripting

Scrape 10 million pages and save the raw html data in mysql database

I have a list of 10 million page urls. I want those pages scraped and saved in the mysql database as raw html. I own a Linux VPS server with 1GB RAM and WHM/cPanel. I would like to scrape at least 100,000 urls in 24 hours. So can anyone give me some sample shell scripting code? (4 Replies)
Discussion started by: Viruthagiri
4 Replies

8. UNIX for Dummies Questions & Answers

Get a data and save

If I have a A.log 1 Air Flow Monitor : 34.070 Degrees C 2 Air Flow Monitor : 41.730 Degrees C 3 Air Flow Monitor : 35.340 Degrees C 4 Air Flow Monitor : 33.370 Degrees C 5 Air Flow Monitor : 36.770 Degrees C 6 Air Flow Monitor : 45.910 Degrees C 7 Air Flow Monitor ... (1 Reply)
Discussion started by: sabercats
1 Reply

9. Shell Programming and Scripting

Run sql query in shell script and output data save as delimited text

I want to run sql query in shell script and output data save as delimited text (delimited text would be comma) Code: SPOOL_FILE=/pgedw/dan.txt SQL=/pgedw/dan.sql sqlplus -s username/password@myhost:port/servicename <<EOF set head on set COLSEP , set linesize 32767 SET TRIMSPOOL ON SET... (8 Replies)
Discussion started by: Jaganjag
8 Replies
DJVUSERVE(1)							   DjVuLibre-3.5						      DJVUSERVE(1)

NAME
       djvuserve - Generate indirect DjVu documents on the fly.

DESCRIPTION
       Program djvuserve is a CGI program that can be executed by an HTTP server for serving DjVu documents. This program is able to convert a bundled multi-page document into an indirect document on the fly.

USING DJVUSERVE
       Program djvuserve must first be installed as a CGI program for your web server. There are several ways to achieve this. The Apache web server, for instance, often defines a specific directory for CGI programs using the ScriptAlias directive. Assume that the file httpd.conf contains the following line:

           ScriptAlias /cgi-bin/ "/var/www/cgi-bin"

       It is then sufficient to create a small executable shell script /var/www/cgi-bin/djvuserve containing the following lines:

           #!/bin/sh
           exec /full/path/to/djvuserve

       Suppose that a large bundled multi-page DjVu document is available at the following URL.

           http://server/dir/doc.djvu

       The CGI program djvuserve lets you access this same document as an indirect multi-page DjVu document using the following URL.

           http://server/cgi-bin/djvuserve/dir/doc.djvu/index.djvu

       Serving indirect multi-page DjVu documents provides for efficiently browsing large documents without transferring unnecessary pages over the network. See djvu(1) for more information.

       Furthermore djvuserve searches for certain keywords among the CGI arguments of the URL. The keyword bundled forces serving a bundled document using

           http://server/cgi-bin/djvuserve/dir/doc.djvu?bundled

       The keyword download inserts a content disposition HTTP header that suggests displaying a save dialog instead of displaying the document.

           http://server/cgi-bin/djvuserve/dir/doc.djvu?download

USING DJVUSERVE AS A HANDLER
       The Apache web server provides a way to automatically execute djvuserve for all DjVu documents. This can be achieved using the following directives in either the Apache configuration file or the .htaccess files.

           Action djvu-server /cgi-bin/djvuserve/
           AddHandler djvu-server .djvu

       Apache then executes program djvuserve for serving all DjVu files. Providing the URL of a DjVu file serves this DjVu file as usual, except that bundled multipage documents are converted to indirect documents on the fly. This convenience comes at the expense of the computational cost of executing djvuserve whenever a DjVu file is requested.

TECHNICAL DETAILS
       Program djvuserve provides a means to directly access any component of a bundled multi-page DjVu document using an extended URL. Suppose that the component file representing page 1 is named p0001.djvu. The following URL provides direct access to this page:

           http://server/cgi-bin/djvuserve/dir/doc.djvu/p0001.djvu

       It is preferred however to access individual pages using the CGI style arguments described in nsdejavu(1), as in the following URL.

           http://server/cgi-bin/djvuserve/dir/doc.djvu?djvuopts&page=12

       The special component file name index.djvu is recognized as a request for the index of the corresponding indirect multi-page document. In fact, when you access a bundled document using djvuserve, the browser gets redirected to the following URL:

           http://server/cgi-bin/djvuserve/dir/doc.djvu/index.djvu

       and then behaves as if the bundled file was a directory containing the various component files of an equivalent indirect document.

ACCESS CONTROL
       Program djvuserve, like many CGI programs, bypasses a number of access protections established in a web server. Assume for instance that your web site contains DjVu files protected by a password. Program djvuserve knows nothing about this protection and will happily serve any DjVu file associated with a valid URL.

       Access control with djvuserve can be implemented by first remembering that the web server always executes program djvuserve via shell script /var/www/cgi-bin/djvuserve. This script can decide to execute the real program djvuserve on the basis of the target filename available in the environment variable PATH_TRANSLATED. There can be several such scripts providing access to various collections of DjVu files. Each of these scripts can be password protected using the usual methods supported by your web server.

KNOWN BUGS
       Hyperlinks specified using a relative URL may not work with djvuserve. These URLs are relative to the URL of the DjVu document. Yet djvuserve changes the apparent document URL http://server/dir/doc.djvu into the more complicated URL http://server/cgi-bin/djvuserve/dir/doc.djvu/index.djvu. The extra components change the interpretation of relative URLs.

CREDITS
       This program was written by Leon Bottou <leonb@users.sourceforge.com>.

SEE ALSO
       djvu(1), djvmcvt(1), nsdejavu(1)

DjVuLibre-3.5							   01/22/2002							      DJVUSERVE(1)
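
As a rough illustration of the ACCESS CONTROL section of the man page above, the wrapper script /var/www/cgi-bin/djvuserve could inspect PATH_TRANSLATED before handing off to the real program. This is only a sketch under stated assumptions: ALLOWED_ROOT and its example value, the helper names is_allowed, deny and serve, and the GATEWAY_INTERFACE guard are hypothetical and not part of djvuserve. /full/path/to/djvuserve is the placeholder used in the man page itself.

```shell
#!/bin/sh
# Sketch of an access-checking wrapper for djvuserve (illustrative only).
# Both values below are assumptions; adjust them for your installation.
DJVUSERVE_BIN="${DJVUSERVE_BIN:-/full/path/to/djvuserve}"
ALLOWED_ROOT="${ALLOWED_ROOT:-/var/www/html/public-djvu}"

# Succeed only if the given path lies under ALLOWED_ROOT
# and contains no ".." path component.
is_allowed() {
    case "$1" in
        */../*|*/..) return 1 ;;
    esac
    case "$1" in
        "$ALLOWED_ROOT"/*) return 0 ;;
        *) return 1 ;;
    esac
}

# Emit a minimal CGI response refusing the request.
deny() {
    printf 'Status: 403 Forbidden\r\n'
    printf 'Content-Type: text/plain\r\n\r\n'
    printf 'Access denied.\n'
}

# Hand the request to the real djvuserve only for permitted files.
serve() {
    if is_allowed "${PATH_TRANSLATED:-}"; then
        exec "$DJVUSERVE_BIN"
    else
        deny
    fi
}

# CGI servers set GATEWAY_INTERFACE; do nothing when run outside CGI.
if [ -n "${GATEWAY_INTERFACE:-}" ]; then
    serve
fi
```

A plain prefix check like this does not resolve symbolic links, so it would complement rather than replace the password protection your web server applies to the script itself.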