Investigating web pages in awk
Hello. I want to write an awk script that searches an HTML file and outputs all the links inside it (e.g. .html, .htm, .jpg, .doc, .pdf, etc.). I also want the output split into three groups, separated by an empty line: first, links to other web pages (.html, .htm, etc.); second, links to images (.jpg, .jpeg); third, links to .pdf, .doc, or other downloadable files. Next to each link I want to print how many times it occurs in the HTML file.
(I am only doing the links first; once I have cracked this I will be able to do the other formats easily.) So far I have:

    BEGIN { FS = " " }
    { for (i = 1; i <= NF; i++) { if ($i ~ /^href/) { print $i } } }
    # END{}

This prints the whole field, e.g. href="index.html" > , whereas I would like it to print just index.html and the number of times it appears in the web page. Any help or hints on how I could achieve what is described in the first paragraph above would be much appreciated.

Last edited by adpe; 04-28-2009 at 02:30 PM.
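One possible way to approach this, as a rough sketch rather than a definitive answer: instead of splitting on spaces, pull each href value out of the line with match(), count it in an array, and sort the results into the three groups in the END block. The function name list_links is made up for illustration. The sketch assumes links are written as lowercase href="..." with double quotes, that each attribute sits on a single line, and that the group can be decided from the file extension alone (so a link such as index.html#top would land in the third group).

```shell
#!/bin/sh
# list_links FILE (hypothetical helper name): print every href target found
# in FILE followed by its count, in three groups separated by an empty line.
# Assumes lowercase, double-quoted href="..." attributes on a single line.
list_links() {
    awk '
    {
        line = $0
        # Walk along the line, taking one href="..." attribute at a time.
        while (match(line, /href="[^"]*"/)) {
            attr = substr(line, RSTART, RLENGTH)
            line = substr(line, RSTART + RLENGTH)
            sub(/^[^"]*"/, "", attr)   # drop everything up to the opening quote
            sub(/"$/, "", attr)        # drop the closing quote
            if (attr != "") count[attr]++
        }
    }
    END {
        # Sort each link into a group by extension; order inside a group
        # is whatever order awk iterates the array in.
        for (url in count) {
            lower = tolower(url)
            if (lower ~ /\.html?$/)
                pages = pages url " " count[url] "\n"
            else if (lower ~ /\.jpe?g$/)
                images = images url " " count[url] "\n"
            else
                files = files url " " count[url] "\n"
        }
        printf "%s\n%s\n%s", pages, images, files
    }' "$1"
}
```

It could then be called as list_links page.html. Each output line is the link, a space, and the number of times it appeared. If a sorted listing is wanted, piping each group through sort would be one option, since awk does not guarantee the order of array traversal.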