Awk: print all URL addresses between iframe tags without repeating an already printed URL
Post 302602851 by striker4o on Tuesday 28th of February 2012 01:56:16 PM
Quote:
Originally Posted by Scrutinizer
Try:
Code:
awk '/http/' RS=\" infile

That never occurred to me. Thanks, it works very well. However, the output is:

Code:
http://ADDRESS_1/?click=5BBB08\
http://ADDRESS_2/?click=5BBB08\
http://ADDRESS_3/?click=5BBB08\
http://ADDRESS_4/?click=5BBB08\
http://ADDRESS_5/?click=5BBB08\
http://ADDRESS_6/?click=5BBB08\
http://ADDRESS_7/?click=5BBB08\
http://ADDRESS_6/?click=5BBB08\
http://ADDRESS_7/?click=5BBB08\

Now I only need to get rid of the duplicate entries.

I tried sort -u and got what I wanted.

So the final command is:

Code:
awk '/http/' RS=\" infile | sort -u
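
One side effect of sort -u is that it reorders the URLs. If the original order matters, the duplicates can probably be dropped inside awk itself. The following is only a sketch under assumptions: the file name sample.html and its contents are made up for illustration, and the trailing backslash is assumed to come from escaped quotes (\") in the input. The seen array is just a conventional name for tracking records already printed.

```shell
# Hypothetical sample input; the escaped quotes (\") are an assumption
# about what the real file looks like, based on the trailing backslashes above.
cat > sample.html <<'EOF'
<iframe src=\"http://ADDRESS_1/?click=5BBB08\"></iframe>
<iframe src=\"http://ADDRESS_2/?click=5BBB08\"></iframe>
<iframe src=\"http://ADDRESS_1/?click=5BBB08\"></iframe>
EOF

# RS=\" splits the input into records at every double quote.
# /http/         keeps only the records that contain a URL.
# !seen[$0]++    is true only the first time a given record appears.
# sub(/\\$/, "") strips the trailing backslash before printing.
awk '/http/ && !seen[$0]++ { sub(/\\$/, ""); print }' RS=\" sample.html
# prints:
# http://ADDRESS_1/?click=5BBB08
# http://ADDRESS_2/?click=5BBB08
```

Unlike sort -u, this keeps the URLs in the order they first appear, at the cost of holding every distinct URL in memory.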


Last edited by striker4o; 02-28-2012 at 03:03 PM.. Reason: Nevermind, I made it. Thanks for help.
 
