how to judge whether a url is valid or not using awk
Post 302419857 by rainboisterous on Sunday 9th of May 2010 11:41:25 PM
Old 05-10-2010
I am using a C++ HTML parser to extract links from web pages,
but there are many abnormal URLs in the results.
For example:
http://百度:http://www.g.cn
or
http://123/a.html
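
One possible way to filter such links with awk is sketched below. It is only a rough check, not a full URL validator, and it rests on assumptions that are not in the post: only http and https links are wanted, and the host must consist of ASCII labels separated by dots and end in an alphabetic top-level label. Under those assumptions numeric IP hosts and internationalized host names are also rejected. The function name check_urls is made up for this example.

```shell
#!/bin/sh
# check_urls (hypothetical helper): read one URL per line on stdin and
# print "valid<TAB>url" or "invalid<TAB>url".
# Assumptions: http/https only, ASCII dotted host name, alphabetic last label.
check_urls() {
    LC_ALL=C awk '
    {
        url = $0
        ok = 1
        if (url !~ /^https?:\/\//) {
            ok = 0                          # missing or unexpected scheme
        } else {
            host = url
            sub(/^https?:\/\//, "", host)   # drop the scheme
            sub(/[\/?#].*$/, "", host)      # drop path, query and fragment
            sub(/:[0-9]+$/, "", host)       # drop an optional numeric port
            # at least one "label." followed by an alphabetic last label
            if (host !~ /^([A-Za-z0-9]([A-Za-z0-9-]*[A-Za-z0-9])?\.)+[A-Za-z][A-Za-z]+$/)
                ok = 0
        }
        status = ok ? "valid" : "invalid"
        print status "\t" url
    }'
}

# Usage: feed the extracted links, one per line.
printf "%s\n" "http://www.g.cn" "http://123/a.html" | check_urls
```

With this check both links quoted above are reported as invalid: the first because its host part contains non-ASCII bytes and a stray colon, the second because its host has no dot.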

