Sponsored Content
Top Forums UNIX for Dummies Questions & Answers Awk: print all URL addresses between iframe tags without repeating an already printed URL Post 302602847 by Corona688 on Tuesday 28th of February 2012 01:50:56 PM
Old 02-28-2012
If you're using awk, you don't need grep. awk '/regex/ { print }' filenames is equivalent to grep "regex" filenames

I'm abusing awk's record-separator here, RS, so that it considers each < character to be a "newline". So it will split the two iframes apart itself, and "iframe src=" will always show up at the beginning of the 'line' if present, which I test with a regex /^iframe src=/. If a match is found, it rips the URL out of parameter $2 with a gsub, tests if we've printed it already, and if not, prints it.

Code:
find . -name "*php*" -or -name "*htm*" |
        xargs awk -v RS='<' '
# For all records beginning with 'iframe src':
/^iframe src=/ {
        # get rid of src= and "
        gsub(/(src=)|"/, "", $2);
        # Print only if we haven't seen it before
        if(!X[$2]++) print $2
}'

This User Gave Thanks to Corona688 For This Post:
 

10 More Discussions You Might Find Interesting

1. UNIX for Advanced & Expert Users

url calling and parameter passing to url in script

Hi all, I need to write a unix script in which need to call a url. Then need to pass parameters to that url. please help. Regards, gander_ss (1 Reply)
Discussion started by: gander_ss
1 Replies

2. Shell Programming and Scripting

url calling and parameter passing to url in script

Hi all, I need to write a unix script in which need to call a url. Then need to pass parameters to that url. please help. Regards, gander_ss (1 Reply)
Discussion started by: gander_ss
1 Replies

3. UNIX for Dummies Questions & Answers

ReDirecting a URL to another URL - Linux

Hello, I need to redirect an existing URL, how can i do that? There's a current web address to a GUI that I have to redirect to another webaddress. Does anyone know how to do this? This is on Unix boxes Linux. example: https://m45.testing.address.net/host.php make it so the... (3 Replies)
Discussion started by: SkySmart
3 Replies

4. UNIX for Advanced & Expert Users

Need to grab URL and place between <A></A> Tags

my output looks like: <A HREF="http://support.apple.com/kb/HT1629"> </A> <A HREF="http://support.apple.com/kb/HT1200"> </A> <A HREF="http://old.nabble.com/AFP-eating-up-CPU-td19976358.html"> </A> <A HREF="http://jochsner.dyndns.org/scripts/NHR.html"> </A> <A... (3 Replies)
Discussion started by: glev2005
3 Replies

5. Shell Programming and Scripting

how to judge wether a url is valid or not using awk

rt 3ks:confused: (6 Replies)
Discussion started by: rainboisterous
6 Replies

6. Shell Programming and Scripting

Extract URL from RSS Feed in AWK

Hi, I have following data file; <outline title="Matt Cutts" type="rss" version="RSS" xmlUrl="http://www.mattcutts.com/blog/feed/" htmlUrl="http://www.mattcutts.com/blog"/> <outline title="Stone" text="Stone" type="rss" version="RSS" xmlUrl="http://feeds.feedburner.com/STC-Art"... (8 Replies)
Discussion started by: fahdmirza
8 Replies

7. Web Development

Regex to rewrite URL to another URL based on HTTP_HOST?

I am trying to find a way to test some code, but I need to rewrite a specific URL only from a specific HTTP_HOST The call goes out to http://SUB.DOMAIN.COM/showAssignment/7bde10b45efdd7a97629ef2fe01f7303/jsmodule/Nevow.Athena The ID in the middle is always random due to the cookie. I... (5 Replies)
Discussion started by: EXT3FSCK
5 Replies

8. UNIX for Dummies Questions & Answers

URL decoding with awk

The challenge: Decode URL's, i.e. convert %HEX to the corresponding special characters, using only UNIX base utilities, and without having to type out each special character. I have an anonymous C code snippet where the author assigns each hex digit a number from 0 to 16 and then does some... (2 Replies)
Discussion started by: uiop44
2 Replies

9. Shell Programming and Scripting

awk and or sed command to sum the value in repeating tags in a XML

I have a XML in which <Amt Ccy="EUR">3.1</Amt> tag repeats. This is under another tag <Main>. I need to sum all the values of <Amt Ccy=""> (Ccy may vary) coming under <Main> using awk and or sed command. can some help? Sample looks like below <root> <Main> ... (6 Replies)
Discussion started by: bk_12345
6 Replies

10. Shell Programming and Scripting

Reading URL using Mechanize and dump all the contents of the URL to a file

Hello, Am very new to perl , please help me here !! I need help in reading a URL from command line using PERL:: Mechanize and needs all the contents from the URL to get into a file. below is the script which i have written so far , #!/usr/bin/perl use LWP::UserAgent; use... (2 Replies)
Discussion started by: scott_cog
2 Replies
All times are GMT -4. The time now is 07:01 PM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy