URL partial matching
Post 302911640 by Akshay Hegde on Saturday 2nd of August 2014 01:50:37 AM
Code:
akshay@nio:/tmp$ cat file1
http://www.hello.com        http://neo.com/peace/development.html, www.japan.com,  http://example.com/abc/abc.html
http://news.net             http://lolz.com/country/list.html,www.telecom.net, www.highlands.net, www.software.com
http://example2.com         http://earth.net, http://abc.gov.cn/department/1.html

Code:
akshay@nio:/tmp$ cat file2
www.neo.com/1/2/3/names.html
http://abc.gov.cn/script.aspx
http://example.com/abc/abc.html

Code:
akshay@nio:/tmp$ cat cmp_url.awk 
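# Reduce a URL to a bare host: drop whitespace, the scheme, a leading "www." and the path
function host(s){
	gsub(/[[:space:]]+/,"",s)
	sub(/^https?:\/\//,"",s)
	sub(/^www\./,"",s)
	sub(/\/.*/,"",s)
	return s
}
# First file on the command line (file2): remember every host seen
FNR==NR{
	HOSTS_IN_FILE2[host($0)]
	next
}
# Second file (file1): keep the URLs in fields 2..NF whose host is in file2
NF{
	gsub(/,/," "); str = ""      # commas become spaces, so $0 is re-split into fields
	for(i=2;i<=NF;i++)
	{
		if( host($i) in HOSTS_IN_FILE2 )
		{
			str = length(str) ? str "," $i : $i
		}
	}
	print $1 ( length(str)? OFS str : "" )
}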

Resulting output
Code:
akshay@nio:/tmp$ awk -vOFS="\t" -f cmp_url.awk file2 file1
http://www.hello.com	http://neo.com/peace/development.html,http://example.com/abc/abc.html
http://news.net
http://example2.com	http://abc.gov.cn/department/1.html
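
To see why `www.neo.com/1/2/3/names.html` in file2 matches `http://neo.com/peace/development.html` in file1, it can help to print the key each URL is reduced to before the lookup. The snippet below is only a sketch of that normalization step: `url_keys` and `host_key` are hypothetical names that do not appear in the post, and it assumes an awk that understands POSIX character classes such as `[[:space:]]` (gawk, BSD awk, recent mawk).

```shell
# url_keys: hypothetical helper, not part of the original post.
# Reads URLs on stdin and prints "URL -> key", where key is the host
# with the scheme, a leading "www." and the path removed.
url_keys() {
	awk '
	function host_key(s) {
		gsub(/[[:space:]]+/, "", s)    # drop stray whitespace
		sub(/^https?:\/\//, "", s)     # drop http:// or https://
		sub(/^www\./, "", s)           # drop a leading "www." only
		sub(/\/.*/, "", s)             # drop everything from the first slash
		return s
	}
	NF { print $0 " -> " host_key($0) }'
}

# Keys for the three lines of file2
printf '%s\n' 'www.neo.com/1/2/3/names.html' \
              'http://abc.gov.cn/script.aspx' \
              'http://example.com/abc/abc.html' | url_keys
```

For the three file2 lines this prints the keys `neo.com`, `abc.gov.cn` and `example.com`. Matching is then a plain array lookup on those keys, so two URLs match whenever they share a host, whatever their paths are.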

 
