Awk scripting and usage of regex to locate a hyperlink
Hello guys,
I need to write an awk script that takes an HTML page and outputs a list of each unique http link on that page, followed by the number of times it occurs in the file.
e.g.
-----------------------------------------
Webpage: index.html
To do that I'm thinking of using regular expressions.
I'm using the following regex to find a hyperlink in the HTML file.
It outputs the whole line that contains the link. Say we have the following html code:
--------------------------------------------
<html>
<p> Here is some text before the link, the <a href = "www.google.com"> link </a> Some text after the link
</html>
--------------------------------------------
The output will be:
--------------------------------------------
Here is some text before the link, the <a href = "www.google.com"> link </a> Some text after the link
--------------------------------------------
What I need is to somehow get rid of all the unnecessary output, leaving only the target URL of each link and nothing else, so that the output would be:
I've tried using the following; however, if there are several links on a line, only the first link is found:
{
    start = index($0, "<a")
    end = index($0, "\">")
    len = end - start
    print substr($0, start, len)
}
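One possible way around the "only the first link" problem is to call match() in a loop and cut the line down after each hit, counting the targets in an array as you go. The sketch below is only that, a sketch: the function name count_links and the second sample URL are made up for illustration, and it assumes href values are written in double quotes with a lowercase "href" (single-quoted or unquoted attributes, and links split across lines, would need more work). It collects every href value, not just those starting with http.

```shell
#!/bin/sh
# count_links (hypothetical name): print each distinct href target
# followed by the number of times it occurs in the input.
count_links() {
    awk '
    {
        line = $0
        # match() sets RSTART/RLENGTH for the leftmost href="..." on the line
        while (match(line, /href[ \t]*=[ \t]*"[^"]*"/)) {
            url = substr(line, RSTART, RLENGTH)
            sub(/^[^"]*"/, "", url)   # drop everything up to the opening quote
            sub(/"$/, "", url)        # drop the closing quote
            count[url]++
            # continue searching in the remainder of the line
            line = substr(line, RSTART + RLENGTH)
        }
    }
    END {
        for (url in count)
            print url, count[url]
    }' "$@"
}

# Small demo on made-up input; for-in order is unspecified, so sort the result.
printf '%s\n' \
    '<p> the <a href = "www.google.com"> link </a> and <a href="www.example.com">another</a>' \
    '<a href="www.google.com">again</a>' | count_links | sort
```

On a real page this would be run as count_links index.html | sort. The key point is that index() and a single match() only ever report the first occurrence, so the line has to be shortened after each hit before searching again.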