Script to scrape page for and save data
Post 302870103 by joeyg on Friday 1st of November 2013 07:27:10 AM
Why do this, and do you have the legal right to do so?

What you are asking for, and especially the way you have worded the request, leads me to think that you do not have permission to capture what is essentially their database.
Please explain the purpose of this request.
 

9 More Discussions You Might Find Interesting

1. UNIX for Advanced & Expert Users

How to save as image from a web page

I used flot to create a graph and I would like to be able to save/export the graph as an image. In firefox on windows you can just ctl rt-click and you have a save as image feature (which I can automate with js) but...I need this to work on a linux browser. On linux in firefox I can print preview... (11 Replies)
Discussion started by: vincaStar
11 Replies

2. Shell Programming and Scripting

How to pass data from server (CGI script) to client (html page)

Hi I know how to pass data from client side (html file) to server using CGI script (POST method). I also know how to re-create the html page from server side after receiving the data (using printf). However I want to write static pages on client side (only the structure), and only to pass... (0 Replies)
Discussion started by: naamabm
0 Replies

3. Shell Programming and Scripting

Save page source, including javascript

I need to get the source code of a webpage. I have tried to use wget and curl, but it doesn't show the necessary javascript part of the source. I don't have to execute it, only to view the source. How do I do that? (1 Reply)
Discussion started by: locoroco
1 Reply

4. Shell Programming and Scripting

Get Permissions and save to data

Hi all; I have the following code which gives me kind of what I need: #!/usr/bin/perl use Fcntl ':mode'; # if ($ARGV ne "") { $filename = $ARGV; } else { print "Please specify a file!\n"; exit; } # if... (2 Replies)
Discussion started by: gvolpini
2 Replies

5. Shell Programming and Scripting

script for adding page number before page breaks

Hi, If there is an expert that can help: I have many txt files that are produced from pdftotext that include page breaks the page breaks seem to be unix style hex 0C. I want to add page numbers before each page break as in : Page XXXX Regards antman (9 Replies)
Discussion started by: antman
9 Replies

6. Shell Programming and Scripting

Open Page and save it using mozilla

HI Guys, I have one command which can open page and i want to save and exit from it. pf@home> mozilla 181.131.193.10/g/report.txt It will open one page now how can i save it. Thanks (1 Reply)
Discussion started by: pareshkp
1 Reply

7. Shell Programming and Scripting

Scrape 10 million pages and save the raw html data in mysql database

I have a list of 10 million page urls. I want those pages scraped and saved in the mysql database as raw html. I own a Linux VPS server with 1GB RAM and WHM/cPanel. I would like to scrape at least 100,000 urls in 24 hours. So can anyone give me some sample shell scripting code? (4 Replies)
Discussion started by: Viruthagiri
4 Replies

8. UNIX for Dummies Questions & Answers

Get a data and save

If I have a A.log 1 Air Flow Monitor : 34.070 Degrees C 2 Air Flow Monitor : 41.730 Degrees C 3 Air Flow Monitor : 35.340 Degrees C 4 Air Flow Monitor : 33.370 Degrees C 5 Air Flow Monitor : 36.770 Degrees C 6 Air Flow Monitor : 45.910 Degrees C 7 Air Flow Monitor ... (1 Reply)
Discussion started by: sabercats
1 Reply

9. Shell Programming and Scripting

Run sql query in shell script and output data save as delimited text

I want to run sql query in shell script and output data save as delimited text (delimited text would be comma) Code: SPOOL_FILE=/pgedw/dan.txt SQL=/pgedw/dan.sql sqlplus -s username/password@myhost:port/servicename <<EOF set head on set COLSEP , set linesize 32767 SET TRIMSPOOL ON SET... (8 Replies)
Discussion started by: Jaganjag
8 Replies
WWW::Search::AltaVista::AdvancedWeb(3pm)    User Contributed Perl Documentation    WWW::Search::AltaVista::AdvancedWeb(3pm)

NAME
       WWW::Search::AltaVista::AdvancedWeb - class for advanced Alta Vista web searching

SYNOPSIS
         use WWW::Search;
         my $search = new WWW::Search('AltaVista::AdvancedWeb');
         $search->native_query(WWW::Search::escape_query('(bmw AND mercedes) AND NOT (used OR Ferrari)'));
         $search->maximum_to_retrieve('100');
         while (my $result = $search->next_result()) {
             print $result->url, "\n";
         }

DESCRIPTION
       Class hack for Advanced AltaVista web search mode originally written by John Heidemann http://www.altavista.com.

       This hack now allows AltaVista AdvancedWeb search results to be sorted so that relevant results are returned first. Initially, this class had skipped the 'r' option, which AltaVista uses to sort search results by relevance. Sending an advanced query using only the 'q' option returned results in random order, which made it impossible to view the best-scored results first.

       This class exports no public interface; all interaction should be done through WWW::Search objects.

HELP
       Use AND to join two terms that must both be present for a document to count as a match.
       Use OR to join two terms if either one counts.
       Use AND NOT to join two terms if the first must be present and the second must NOT.
       Use NEAR to join two terms if they both must appear and be within 10 words of each other.

       Try this example:

           cars AND bmw AND mercedes

       You don't have to capitalize the "operators" AND, OR, AND NOT, or NEAR. But many people do to make it clear what is a query term and what is an instruction to the search engine.

       One other wrinkle that's very handy: you can group steps together with parentheses to tell the system what order you want it to perform operations in.

           (bmw AND mercedes) NEAR cars AND NOT (used OR Ferrari)

       Keep in mind that grouping should be used as much as possible, because if you enter a long query using only AND to join the words you may not receive any results: the entire query would be treated like one long phrase. For best results, follow the example herein.

AUTHOR
       "WWW::Search" hack by Jim Smyser, <jsmyser@bigfoot.com>.

COPYRIGHT
       Copyright (c) 1996 University of Southern California. All rights reserved.

       Redistribution and use in source and binary forms are permitted provided that the above copyright notice and this paragraph are duplicated in all such forms and that any documentation, advertising materials, and other materials related to such distribution and use acknowledge that the software was developed by the University of Southern California, Information Sciences Institute. The name of the University may not be used to endorse or promote products derived from this software without specific prior written permission.

       THIS SOFTWARE IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.

VERSION HISTORY
       2.07 - unescape URLs, and bugfix for undefined $hit
       2.06 - do not use URI::URL
       2.02 - Added HELP POD. Misc. Clean-up for latest changes.
       2.01 - Additional query modifiers added for even better results.
       2.0 - Minor change to set lowercase Boolean operators to uppercase.
       1.9 - First hack version release.

   native_setup_search
       This private method does the heavy lifting after native_query() is called.

perl v5.12.4    2011-11-02    WWW::Search::AltaVista::AdvancedWeb(3pm)
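
The HELP text in the man page says the operators AND, OR, AND NOT and NEAR do not have to be capitalized, and the 2.0 entry in its version history mentions setting lowercase Boolean operators to uppercase. As a rough illustration of that kind of normalization, here is a minimal shell sketch. The function name `upcase_operators` is hypothetical and is not part of WWW::Search; this is not the module's actual code, and it assumes operators are separated from other terms by whitespace.

```shell
#!/bin/sh
# Hypothetical helper (an assumption, not part of WWW::Search):
# uppercase the Boolean operators and, or, not, near in a query string
# and leave every other term untouched.
upcase_operators() {
    printf '%s\n' "$1" | awk '{
        for (i = 1; i <= NF; i++) {
            word = tolower($i)
            # Only whole whitespace-separated words count as operators,
            # so a term such as "oregon" is not affected by the "or" rule.
            if (word == "and" || word == "or" || word == "not" || word == "near") {
                $i = toupper($i)
            }
        }
        print
    }'
}

# Example: the grouped query from the HELP section, typed in lowercase.
upcase_operators "(bmw and mercedes) near cars and not (used or Ferrari)"
```

Run as written, this should print `(bmw AND mercedes) NEAR cars AND NOT (used OR Ferrari)`. Note that awk rebuilds the line when a field changes, so runs of spaces collapse to a single space; a query term that is literally the word "or" or "near" would also be uppercased, which a real implementation would need to handle.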
All times are GMT -4.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.