05-10-2010
I am using a C++ HTML parser to extract links from web pages,
but there are many abnormal URLs in the results.
For example:
http://百度:http://www.g.cn
or
http://123/a.html
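One way to weed these out is a heuristic filter applied to each extracted link before it is kept. The sketch below is only an illustration, not part of any particular parser: the names looks_like_normal_http_url, is_plausible_host and is_ascii_alnum are made up for this example. It assumes that a "normal" link uses http or https, has a numeric port if it has one at all, and has an ASCII host name containing at least one dot. These rules are deliberately stricter than RFC 3986, so they also reject localhost, userinfo (user@host), IPv6 literals and internationalized hosts that have not yet been converted to Punycode.

```cpp
#include <cstddef>
#include <string>

// True for ASCII letters and digits only (locale-independent).
static bool is_ascii_alnum(unsigned char c) {
    return (c >= '0' && c <= '9') || (c >= 'a' && c <= 'z') ||
           (c >= 'A' && c <= 'Z');
}

// Heuristic host check (an assumption of this sketch, stricter than RFC 3986):
// dot-separated labels of ASCII letters, digits and '-', at least one dot,
// no empty labels, no label starting or ending with '-'.
static bool is_plausible_host(const std::string& host) {
    if (host.empty() || host.size() > 253) return false;
    bool has_dot = false;
    std::size_t label_len = 0;
    char prev = '.';
    for (char ch : host) {
        if (ch == '.') {
            if (label_len == 0 || prev == '-') return false;
            has_dot = true;
            label_len = 0;
        } else if (ch == '-') {
            if (label_len == 0) return false;  // label starts with '-'
            ++label_len;
        } else if (is_ascii_alnum(static_cast<unsigned char>(ch))) {
            ++label_len;
        } else {
            return false;  // non-ASCII byte or other disallowed character
        }
        prev = ch;
    }
    return has_dot && label_len > 0 && prev != '-';
}

// Hypothetical filter: returns true if the link passes the heuristic above.
bool looks_like_normal_http_url(const std::string& url) {
    std::size_t start;
    if (url.compare(0, 7, "http://") == 0) {
        start = 7;
    } else if (url.compare(0, 8, "https://") == 0) {
        start = 8;
    } else {
        return false;
    }

    // The authority runs up to the first '/', '?' or '#'.
    std::size_t end = url.find_first_of("/?#", start);
    std::string authority = url.substr(
        start, end == std::string::npos ? std::string::npos : end - start);

    // Split host and port at the first ':'; the port must be 1 to 5 digits.
    std::string host = authority;
    std::size_t colon = authority.find(':');
    if (colon != std::string::npos) {
        host = authority.substr(0, colon);
        std::string port = authority.substr(colon + 1);
        if (port.empty() || port.size() > 5) return false;
        for (char ch : port) {
            if (ch < '0' || ch > '9') return false;
        }
    }
    return is_plausible_host(host);
}
```

With these rules, http://百度:http://www.g.cn is rejected because the text after the first colon is not a numeric port, and http://123/a.html is rejected because the host has no dot. If the crawl needs internationalized domain names, the host would have to be converted to its Punycode form before this check, or the ASCII-only rule relaxed.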
URI::URL(3) User Contributed Perl Documentation URI::URL(3)
NAME
URI::URL - Uniform Resource Locators
SYNOPSIS
$u1 = URI::URL->new($str, $base);
$u2 = $u1->abs;
DESCRIPTION
This module is provided for backwards compatibility with modules that depend on the interface provided by the "URI::URL" class that used to
be distributed with the libwww-perl library.
The following differences compared to the "URI" class interface exist:
o The URI::URL module exports the url() function as an alternate constructor interface.
o The constructor takes an optional $base argument. The "URI::URL" class is a subclass of "URI::WithBase".
o The URI::URL->newlocal class method is the same as URI::file->new_abs
o URI::URL::strict(1)
o $url->print_on method
o $url->crack method
o $url->full_path; same as ($uri->abs_path || "/")
o $url->netloc; same as $uri->authority
o $url->epath, $url->equery; same as $uri->path, $uri->query
o $url->path and $url->query pass unescaped strings.
o $url->path_components; same as $uri->path_segments (if you don't consider path segment parameters).
o $url->params and $url->eparams methods.
o $url->base method. See URI::WithBase.
o $url->abs and $url->rel have an optional $base argument. See URI::WithBase.
o $url->frag; same as $uri->fragment
o $url->keywords; same as $uri->query_keywords
o $url->localpath with friends map to $uri->file
o $url->address and $url->encoded822addr; same as $uri->to for mailto URI.
o $url->groupart method for news URI.
o $url->article; same as $uri->message
SEE ALSO
URI, URI::WithBase
COPYRIGHT
Copyright 1998-2000 Gisle Aas.
perl v5.8.0 2002-05-09 URI::URL(3)