using Lynx and Grep to return search page rank - help


 
09-18-2007

I am writing a script that reads search terms from a text file and passes each line to Lynx. Lynx grabs the source HTML; I then want grep, tr, or whatever fits to find the first occurrence of a term (mydomain.name) and delete everything from that first occurrence on, creating a new end of file.
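The truncation step described above might look something like the following. This is only a sketch: the function name truncate_at_first is made up for illustration, and it assumes the whole page fits comfortably in a shell variable. If the term never appears, the page is printed unchanged.

```bash
#!/bin/bash
# truncate_at_first (hypothetical helper): print stdin up to, but not
# including, the first occurrence of a literal string.
truncate_at_first() {
    local needle="$1"
    local html
    # Slurp all of stdin; this works even when the page is one long line.
    html=$(cat)
    # %% removes the longest suffix matching needle-followed-by-anything,
    # which is everything from the FIRST occurrence of the needle onward.
    # Quoting "$needle" makes it a literal string, so '.' is not a wildcard.
    printf '%s\n' "${html%%"$needle"*}"
}
```

It could be fed from the lynx call, for example `lynx -source "$url" | truncate_at_first "mydomain.name"`, where $url stands for the search URL built in the script.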

Then I want to count a certain marker, <class=L>, in the remaining source (from the start up to the new end of file) to determine the search engine page rank.
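One possible way to count the marker, sketched under a few assumptions: the function name count_marker is invented here, the marker text 'class=l' is only a guess at what the real markup contains, and grep's -o option (available in GNU and BSD grep, but not required by POSIX) is assumed to exist. Because -o prints every match on its own line, the count is correct even when the page is a single line.

```bash
#!/bin/bash
# count_marker (hypothetical helper): print how many times a literal
# marker occurs on stdin, regardless of how the input is split into lines.
count_marker() {
    local marker="$1"
    # -o prints each match on its own line; -F treats the marker as a
    # fixed string rather than a regular expression.
    # The arithmetic expansion strips the padding some wc versions add.
    echo $(( $(grep -oF -- "$marker" | wc -l) ))
}
```

For example, `count_marker 'class=l' < truncated.html` would print the number of markers in a (hypothetical) truncated copy of the page. Whether that number is the rank itself or one less depends on whether the marker for the matching result comes before the domain text, so it is worth checking against the real source. A plain substring such as 'class=l' would also match longer names like 'class=link'.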

This is what I have so far. My primary issue is that Google returns all of the search HTML source as one line, which is why I need to count the style tag <class=L> (in this case a lowercase L) rather than count lines. What I have right now grabs the search terms and the results, but I'm unsure of where to go from here.

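#!/bin/bash
# Read one search term per line from the file named by the first argument.
while IFS= read -r searchTerm; do
    #echo "${searchTerm}"
    # Encode spaces as '+' so multi-word terms form a valid query string.
    query=${searchTerm// /+}
    lynx -source -accept_all_cookies "http://www.google.com/search?q=${query}" >> /path/to/dir/archive.txt
done < "$1"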

Thanks in Advance!
 