Shell Programming and Scripting: How to extract url from html page?
Post 302463348 by Neo on Sunday 17th of October 2010 03:55:00 AM
I used to use RegexBuddy to create and test regexes for this. It came with some stock regexes that were quite good at extracting URLs from text. It really is a great tool, but as I recall it only runs on Windows (and on Linux under Wine). With it, you create, test, and debug complex regexes, and you can even optimize them for performance. Then you cut and paste the finished regex into your code or application. I highly recommend this tool. I would be running it now, but sadly my XP machine died and I'm running OS X on the desktop and only Android on the go.
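For readers without such a tool, the following is a rough sketch of regex-based URL extraction in the shell. The pattern is a deliberately simplified assumption, not one of the tool's stock regexes: it matches absolute http/https URLs only and will miss relative links. The file name `page.html` and the function name `extract_urls` are hypothetical, and `grep -o` is available in GNU and BSD grep but is not required by POSIX.

```shell
#!/bin/sh
# Hypothetical sample input; replace with your own HTML file.
cat > page.html <<'EOF'
<html><body>
<a href="http://www.example.com/index.html">Home</a>
<img src="https://example.org/logo.png">
<a href="/relative/path">Relative link (not matched)</a>
</body></html>
EOF

# Simplified pattern (an assumption, not a full URL grammar):
# the scheme, then a run of characters up to the next quote,
# space, or angle bracket.
url_regex="https?://[^\"' <>]+"

# -o prints only the matched text, -E enables extended regex syntax.
# sort -u drops duplicate URLs; LC_ALL=C keeps the order locale-independent.
extract_urls() {
    grep -oE "$url_regex" "$1" | LC_ALL=C sort -u
}

extract_urls page.html
```

A regex like this is fine for quick scripts, but it does not understand HTML; for anything that must be robust, a real HTML parser is the safer choice.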
 

Graphics::Primitive::Component(3pm)    User Contributed Perl Documentation    Graphics::Primitive::Component(3pm)

NAME
    Graphics::Primitive::Component - Base graphical unit

DESCRIPTION
    A Component is an entity with a graphical representation.

SYNOPSIS
    my $c = Graphics::Primitive::Component->new({
        origin => Geometry::Primitive::Point->new({ x => $x, y => $y }),
        width  => 500,
        height => 350
    });

LIFECYCLE
    prepare
        Most components do the majority of their setup in prepare. The goal of prepare is to establish the component's minimum height and width so that it can be properly positioned by a layout manager.

            $driver->prepare($comp);

    layout
        This is not a method of Component, but a phase introduced by the use of Layout::Manager. If the component is a container, then each of its child components (even the containers) will be positioned according to the minimum height and width determined during prepare. Different layout manager implementations have different rules, so consult the documentation for each for details. After this phase has completed, the origin, height and width should be set for all components.

            $lm->do_layout($comp);

    finalize
        This final phase provides an opportunity for the component to make any final changes to its internals before being passed to a driver for drawing. An example might be a component that draws a fleuron at its extremities. Since the final height and width aren't known until this phase, it was impossible for it to position these internal components until now. It may even defer creation of these components until now.

        It is not ok to defer all action to the finalize phase. If you do not establish a minimum height and width during prepare, then the layout manager may not provide you with enough space to draw.

            $driver->finalize($comp);

    draw
        Handled by Graphics::Primitive::Driver.

            $driver->draw($comp);

METHODS
  Constructor
    new
        Creates a new Component.

  Instance Methods
    background_color
        Set this component's background color.

    border
        Set this component's border, which should be an instance of Border.

    callback
        Optional callback that is fired at the beginning of the "finalize" phase. This allows you to add custom code that can modify the component just before it is rendered. The only argument is the component itself.

        Note that changing the position or the dimensions of the component will not re-layout the scene. You may get weird results if you manipulate the component's dimensions here.

    class
        Set/Get this component's class, which is an arbitrary string. Graphics::Primitive has no internal use for this attribute but provides it for outside use.

    color
        Set this component's foreground color.

    fire_callback
        Method to execute this component's "callback".

    get_tree
        Get a tree for this component. Since components are -- by definition -- leaf nodes, this tree will only have the one member at its root.

    has_callback
        Predicate that tells if this component has a "callback".

    height
        Set this component's height.

    inside_bounding_box
        Returns a Rectangle that defines the edges of the 'inside' box for this component. This box is relative to the origin of the component.

    inside_height
        Get the height available in this container after taking away space for padding, margin and borders.

    inside_width
        Get the width available in this container after taking away space for padding, margin and borders.

    margins
        Set this component's margins, which should be an instance of Insets. Margins are the space outside the component's bounding box, as in CSS. The margins should be outside the border.

    maximum_height
        Set/Get this component's maximum height. Used to inform a layout manager.

    maximum_width
        Set/Get this component's maximum width. Used to inform a layout manager.

    minimum_height
        Set/Get this component's minimum height. Used to inform a layout manager.

    minimum_inside_height
        Get the minimum height available in this container after taking away space for padding, margin and borders.

    minimum_inside_width
        Get the minimum width available in this container after taking away space for padding, margin and borders.

    minimum_width
        Set/Get this component's minimum width. Used to inform a layout manager.

    name
        Set this component's name. This is not required, but may inform consumers of a component. Pay attention to that library's documentation.

    origin
        Set/Get the origin point for this component.

    outside_height
        Get the height consumed by padding, margin and borders.

    outside_width
        Get the width consumed by padding, margin and borders.

    finalize
        Method provided to give the component one last opportunity to put its contents into the provided space. Called after prepare.

    padding
        Set this component's padding, which should be an instance of Insets. Padding is the space inside the component's bounding box, as in CSS. This padding should be between the border and the component's content.

    page
        If true, then this component represents a stand-alone page. This informs the driver that this component (and any children) are to be rendered on a single surface. This only really makes sense in formats that have pages, such as PDF or PostScript.

    prepare
        Method to prepare this component for drawing. This is an empty sub and is meant to be overridden by a specific implementation.

    preferred_height
        Set/Get this component's preferred height. Used to inform a layout manager.

    preferred_width
        Set/Get this component's preferred width. Used to inform a layout manager.

    to_string
        Get a string representation of this component in the form of:

            $name $x,$y ($widthx$height)

    visible
        Set/Get this component's visible flag.

    width
        Set/Get this component's width.

AUTHOR
    Cory Watson, "<gphat@cpan.org>"

BUGS
    Please report any bugs or feature requests to "bug-geometry-primitive at rt.cpan.org", or through the web interface at <http://rt.cpan.org/NoAuth/ReportBug.html?Queue=Geometry-Primitive>. I will be notified, and then you'll automatically be notified of progress on your bug as I make changes.

COPYRIGHT & LICENSE
    Copyright 2008-2009 by Cory G Watson.

    This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

perl v5.12.3    2011-06-02    Graphics::Primitive::Component(3pm)