[lynx dump] Order (by name/URL)


 
# 1  
Old 04-08-2009

Hi,

This is how I dump a page with lynx:
Code:
$ lynx -dump http://www.google.com

So, this is an example of a lynx dump:

Code:
[1] txt1 [2]blabla [3]Other txt
[4] some text
1. http://url_of_txt1
2. http://url_of_blabla
3. http://url_of_Other_txt
4. http://url_of_some_text
...

How can I obtain this output from it?

Code:
txt1 
http://url_of_txt1

blabla
http://url_of_blabla

Other txt
http://url_of_Other_txt

some text
http://url_of_some_text

... (and so on)

Thanks!
B.R.
# 2  
Old 04-08-2009
This may not work for bigger dumps; feel free to post a larger sample.
It works as long as the [1] indexes are unique.

Code:
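#!perl
# Pair each "[N] text" marker in a saved lynx dump with its "N. URL" reference.

$fin = "a";   # input file: saved output of lynx -dump

unless ( open FIN, $fin ){
  print "cannot read file $fin \n";
  exit(9);
  }

$fout = "b";  # output file; it is opened here, but the prints below go to STDOUT

unless ( open FOUT, ">$fout" ){
  print "cannot write to file $fout \n";
  exit(9);
  }

@a_lines = <FIN>; chomp @a_lines; close FIN;

# pass 1: on lines containing brackets, store the text that follows each [N]
foreach $line ( @a_lines ){

  next unless ( $line =~ m/[\[\]]/ );

  ( @a_junk ) = split( /[\[]/, $line );
  foreach $junk ( @a_junk ){

    ( $key, @a_more_junk ) = split( /[\]]/, $junk );
    next unless $key;

    $aa_keys{ $key } = join ' ', @a_more_junk;
    }

  }

# pass 2: every line without brackets is treated as an "N. URL" reference.
# Caveats: splitting on "." also cuts the URL at each of its dots, $key keeps
# any leading spaces of the line, and printf interprets "%" in the data.
foreach $line ( @a_lines ){

  next if ( $line =~ m/[\[\]]/ );

  ( $key, @a_junk ) = split /\./, $line;

  printf "$aa_keys{ $key }\n";
  printf "$key.";
  printf join ' ', @a_junk;
  printf "\n\n";

  }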
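
The second loop above splits each reference line on every ".", which also cuts the URL apart at its dots. A minimal sketch of one alternative, assuming reference lines look like "   1. http://host/path": only the dot right after the number is consumed. parse_ref is a hypothetical helper name, not part of lynx or of the script above.

```shell
#!/usr/bin/env bash
# parse_ref LINE (hypothetical helper): print "N URL" when LINE looks like
# "   N. URL"; return non-zero for any other line.
parse_ref() {
  ref=$(printf '%s\n' "$1" |
    sed -n 's/^[[:space:]]*\([0-9][0-9]*\)\.[[:space:]][[:space:]]*\([^[:space:]][^[:space:]]*\).*/\1 \2/p')
  # printf with %s so that a "%" inside the URL is printed literally
  [ -n "$ref" ] && printf '%s\n' "$ref"
}

# Demo: the first call prints the number and the intact URL,
# the second line is not a reference, so the fallback message is printed.
parse_ref '   1. http://images.google.com/imghp?hl=en&tab=wi'
parse_ref 'a line without a reference' || echo 'not a reference line'
```

Because the URL is captured as one whitespace-free token, its dots and percent signs pass through unchanged.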

# 3  
Old 04-08-2009
Thanks for your help, quirkasaurus.

$ lynx -dump http://www.google.com > google.com

Variable change:
#!perl
$fin = "google.com";
....

script output:

Code:
.


.


                                   Google.


.


     _______________________________________________________.


.


.


.


Refers.


.


   1. http://images google com/imghp?hl=en&tab=wi


   2. http://maps google com/maps?hl=en&tab=wl


   3. http://news google com/nwshp?hl=en&tab=wn


   4. http://video google com/?hl=en&tab=wv


   5. http://mail google com/mail/?hl=en&tab=wm


   6. http://www google com/intl/en/options/


   7. http://www google com/url?sa=p&pref=ig&pval=3&q=http://www google com/ig0.000000hl  0en                          ource  0iglk&usg=AFQjCNFA18XPfgb7dKnXfKz7x7g1GDH1tg


   8. https://www google com/accounts/Login?continue=http://www google com/intl/en/&hl=en


   9. http://www google com/advanced_search?hl=en


  10. http://www google com/preferences?hl=en


  11. http://www google com/language_tools?hl=en


  12. http://www google com/intl/en/ads/


  13. http://www google com/services/


  14. http://www google com/intl/en/about html


  15. http://www google com/


  16. http://www google com/intl/en/privacy html

However, I also need to embed this code in another bash script (#!/usr/bin/bash), and I don't know Perl :P

Thanks for your help.
# 4  
Old 04-08-2009
...
[edit]

Last edited by aspire; 04-08-2009 at 02:30 PM.. Reason: double post...sorry!
# 5  
Old 04-08-2009
Well... the other problem is that associative arrays (hashes) are hard to do in a shell script compared with perl. You have to build your own intermediate arrays.

A total pain...

I'll give a hint how it's done:

hash_ref[$num]=the_alpha_key

hash_value[$num]=the_data_for_the_alpha_key

Then you scan the first array for the key, note its subscript, and use that subscript to read the second array.
Not a whole lot of fun in ksh.
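
The two-array hint can be sketched in bash roughly as follows. hash_ref and hash_value are the names from the hint; put and lookup are hypothetical helper names, and the sample keys and values are only illustrative.

```shell
#!/usr/bin/env bash
# Two parallel indexed arrays emulate a hash:
# hash_ref[i] holds a key, hash_value[i] holds the data stored under that key.
hash_ref=()
hash_value=()

# put KEY VALUE: append one key/value pair at the same subscript in both arrays
put() {
  hash_ref[${#hash_ref[@]}]=$1
  hash_value[${#hash_value[@]}]=$2
}

# lookup KEY: scan the first array for KEY, then print the entry with the
# same subscript from the second array; return non-zero if KEY is absent
lookup() {
  local i
  for i in "${!hash_ref[@]}"; do
    if [[ ${hash_ref[i]} == "$1" ]]; then
      printf '%s\n' "${hash_value[i]}"
      return 0
    fi
  done
  return 1
}

put 1 'txt1'
put 2 'blabla'
lookup 2   # prints: blabla
```

Each lookup is a linear scan, which is fine for the few dozen links of a typical dump. Where bash 4 or later is available, `declare -A` provides real associative arrays and removes the scan.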
# 6  
Old 04-08-2009
kind of ugly, but don't have time to pretty it up:

nawk -f aspire.awk google.com

aspire.awk:
Code:
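# lines that start with "[": store the text that follows each [N] marker
/^[[]/ {
  while (match($0, "^[[][0-9]*[]]") ) {
     idx=substr($0,RSTART+1, RLENGTH-2) "."   # "1." so it equals $1 of a reference line
     rem=substr($0,RSTART+RLENGTH)            # everything after the marker
     match(rem, "[^[]*([[]|$)")               # up to the next "[" or the end of the line
     name=substr(rem, RSTART,RLENGTH-1)
     if (length(rem)==(length(name)+1))       # match ran to the end: keep the last character too
         name=substr(rem, RSTART)
     arr[idx]= name
     $0=substr(rem, RSTART+RLENGTH-1)         # continue with the next marker
  }
  next
}
# reference lines ("1. URL"): print the stored name, then the URL
$1 in arr { print arr[$1] ORS $2 }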

# 7  
Old 04-08-2009
Thanks for your help.

@vgersh99

$ chmod +x aspire.awk
$ nawk -f ./aspire.awk google.com
$

Uhm... no output... my mistake?
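
One possible cause, assuming the real dump indents its link lines the way the reference lines in the output posted in #3 are indented: the `/^[[]/` pattern only fires when "[" is the very first character of a line, so no names are stored and nothing is printed. A hedged sketch that looks for [N] markers anywhere on a line and can be dropped into a bash script; pair_links is a hypothetical function name, and the sample input is the one from the first post with indentation added.

```shell
#!/usr/bin/env bash
# pair_links [FILE...] (hypothetical helper): read a lynx dump from the given
# files or from stdin and print link text, URL, and a blank line per link.
pair_links() {
  awk '
    # reference line: optional indent, "N.", then the URL
    /^[ \t]*[0-9]+\.[ \t]+[^ \t]/ {
      n = $1; sub(/\.$/, "", n)
      if (n in name) { print name[n]; print $2; print "" }
      next
    }
    # any other line: store the text that follows each [N] marker
    {
      line = $0
      while (match(line, /\[[0-9]+\]/)) {
        n = substr(line, RSTART + 1, RLENGTH - 2)
        line = substr(line, RSTART + RLENGTH)
        text = line
        sub(/\[[0-9]+\].*/, "", text)       # cut at the next marker
        gsub(/^[ \t]+|[ \t]+$/, "", text)   # trim surrounding blanks
        name[n] = text
      }
    }
  ' "$@"
}

pair_links <<'EOF'
   [1] txt1 [2]blabla [3]Other txt
   [4] some text
   1. http://url_of_txt1
   2. http://url_of_blabla
   3. http://url_of_Other_txt
   4. http://url_of_some_text
EOF
```

With a saved dump the call would be `pair_links google.com`. The sketch relies on the markers appearing before the reference list, and link text that wraps onto a second line is only captured up to the line break.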