awk script to search an html file and output links


 
# 1  
Old 05-09-2008

Hello. I want to write an awk script that searches an HTML file and outputs all the links inside it (e.g. .html, .htm, .jpg, .doc, .pdf, etc.). I also want the output split into three groups, separated by an empty line: the first group with links to other web pages (.html, .htm, etc.), the second with links to images (.jpg, .jpeg), and the third with links to .pdf, .doc, or other downloadable files. Next to each link I want to print how many times it occurs in the HTML file.

Please, any help would be greatly appreciated!

kyris
# 2  
Old 05-09-2008
To write a script with awk you need at least some knowledge of awk. Do you have any?
What have you done to attempt to solve this problem yourself?
Post your sample script and we'll see how we can assist.

Regards
# 3  
Old 05-09-2008
I too want some help with awk. I want to output data in a similar way.

I want to search by month and year and output the matching numerical data.

What I have so far is:

$ awk '{ print $1, $2, $3, $4 }' hits


The result of this is:

123.45.6.7 NOV 2006 1805GMT

This prints out the columns I want, but I need to figure out a way to add search criteria for these columns.

The search criteria need to be the month and year, and it should be possible to enter them either numerically or as text.

Any ideas?
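
One possible way to add such a search, as a rough sketch only: pass the month and year in with awk's `-v` option and compare them against the second and third fields. The `hits_for` function name and the two extra sample rows are made up for illustration, and the sketch assumes the file always has the four-column layout shown above.

```shell
# Sample data in the format shown above (rows 2 and 3 are invented).
cat > hits <<'EOF'
123.45.6.7 NOV 2006 1805GMT
98.76.5.4 DEC 2006 0930GMT
11.22.33.44 NOV 2007 2210GMT
EOF

# hits_for MONTH YEAR   (hypothetical helper; MONTH may be "nov", "NOV" or "11")
hits_for() {
    awk -v month="$1" -v year="$2" '
    BEGIN {
        # Map a numeric month (1-12) to its three-letter name.
        split("JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC", names, " ")
        if (month ~ /^[0-9]+$/) month = names[month + 0]
        month = toupper(month)
    }
    # Print only the lines whose month and year match the criteria.
    toupper($2) == month && $3 == year { print $1, $2, $3, $4 }
    ' hits
}

hits_for nov 2006
hits_for 11 2006
```

With the sample data, both calls print the single line `123.45.6.7 NOV 2006 1805GMT`.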
# 4  
Old 05-09-2008
amatuer_lee_3,

Please don't hijack someone else's thread; start your own thread if you have a question.

Regards
# 5  
Old 05-09-2008
No, I haven't done much. I only know a few things. So far I have just thought about setting FS or RS to something like FS="< >", then searching within the fields for /http/ or similar, and then for /html/. But I don't know a lot, so I just want to do the basics.

Thanks for any help!
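
The record-separator idea can be sketched roughly as follows. Setting RS to "<" makes every tag start a new record, so each record can be checked for an href or src attribute. This is only a sketch under some assumptions: the attribute names are lowercase, the values are in double quotes, and the `list_links` function name and `page.html` sample file are made up for illustration. It is not a real HTML parser.

```shell
# A tiny made-up test page.
cat > page.html <<'EOF'
<p>See <a href="index.html">home</a> and <img src="logo.jpg"></p>
EOF

# list_links FILE   (hypothetical helper: print one link per line)
list_links() {
    awk 'BEGIN { RS = "<" }
    # Each record now starts right after a "<", e.g.: a href="index.html">home
    match($0, /(href|src)="[^"]*"/) {
        attr = substr($0, RSTART, RLENGTH)   # e.g. href="index.html"
        sub(/^[^"]*"/, "", attr)             # drop everything up to the opening quote
        sub(/"$/, "", attr)                  # drop the closing quote
        print attr
    }' "$1"
}

list_links page.html
```

For the sample page this prints `index.html` and `logo.jpg` on separate lines.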
# 6  
Old 05-10-2008
You could happily do that with awk, sed, or perl, but it would be difficult to maintain and scale.

I would rather suggest an existing CPAN module such as:

HTML::LinkExtractor - Extract links from an HTML document - search.cpan.org

Some gurus have already written those :) so we don't have to reinvent the wheel (I'm not lazy :) ).
# 7  
Old 05-10-2008
The thing is, I need it specifically in awk, not perl! :/
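
For an awk-only approach, something along these lines might serve as a starting point. It is a minimal sketch, not a complete solution: it assumes lowercase href/src attributes with double-quoted values, decides the group purely from the file extension, and puts anything that is neither a web page nor a .jpg/.jpeg image into the third group. The `link_report` function, the `group` helper, and `sample.html` are names invented for this example.

```shell
# A small made-up page to try the script on.
cat > sample.html <<'EOF'
<html><body>
<a href="index.html">Home</a> <a href="about.htm">About</a>
<img src="logo.jpg"> <a href="index.html">Home again</a>
<a href="manual.pdf">Manual</a> <a href="notes.doc">Notes</a>
<img src="logo.jpg">
</body></html>
EOF

# link_report FILE   (hypothetical helper: grouped links with occurrence counts)
link_report() {
    awk 'BEGIN { RS = "<" }            # one record per tag
    match($0, /(href|src)="[^"]*"/) {
        link = substr($0, RSTART, RLENGTH)
        sub(/^[^"]*"/, "", link)       # strip the attribute name and opening quote
        sub(/"$/, "", link)            # strip the closing quote
        if (!(link in count)) order[++n] = link   # remember first-seen order
        count[link]++
    }
    END {
        # Print the three groups in turn, separated by an empty line.
        for (g = 1; g <= 3; g++) {
            if (g > 1) print ""
            for (i = 1; i <= n; i++)
                if (group(order[i]) == g) print order[i], count[order[i]]
        }
    }
    # 1 = web pages, 2 = images, 3 = everything else (assumed to be downloads)
    function group(s,   l) {
        l = tolower(s)
        if (l ~ /\.html?$/) return 1
        if (l ~ /\.(jpg|jpeg)$/) return 2
        return 3
    }' "$1"
}

link_report sample.html
```

On the sample page this prints `index.html 2` and `about.htm 1`, then an empty line, then `logo.jpg 2`, then an empty line, then `manual.pdf 1` and `notes.doc 1`. Links with a query string or with no extension at all would land in the third group, so the `group` function is the part most likely to need adjusting.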