Full Discussion: Matching A URL pattern
Forum: UNIX for Dummies Questions & Answers. Post 302507506 by an2up on Thursday 24th of March 2011 04:53:28 AM

Code:
egrep -iow '(http[s]*[:][/]+|www[.])[^"\<>]*' url.txt

Is this command logically incorrect for matching URL patterns inside a file and displaying only the URLs in the terminal?

Please correct any error in my syntax.

Last edited by Franklin52; 03-24-2011 at 08:17 AM.. Reason: Please use code tags
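
One possible correction, offered as a sketch rather than a definitive answer. In the posted pattern, `[^"\<>]*` also matches spaces, so a match runs on until the end of the line or the next quote or angle bracket; `[s]*` accepts any number of `s` characters; and `-w` adds a whole-word test that URL extraction does not need. The version below assumes a URL ends at whitespace, a double quote, or an angle bracket, which is a simplification and not a full URL parser. The sample file contents are made up for illustration, since the thread does not show url.txt.

```shell
# Hypothetical sample input; the original thread does not show url.txt.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
Visit http://example.com/a?b=1 for details.
<a href="https://example.org/path">link</a> and www.example.net too
no links on this line
EOF

# -E: extended regex, -i: ignore case, -o: print only the matched text.
# Match "http://", "https://" or a leading "www.", then one or more
# characters that are not whitespace, a double quote, or an angle bracket.
urls=$(grep -Eio '(https?://|www\.)[^[:space:]"<>]+' "$tmp")
printf '%s\n' "$urls"

rm -f "$tmp"
```

With the sample input this prints `http://example.com/a?b=1`, `https://example.org/path` and `www.example.net`, one per line. Trailing punctuation directly after a URL (for example a full stop at the end of a sentence) would still be included in the match, so the character class may need tightening for real data.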
 

uri(n)          Tcl Uniform Resource Identifier Management          uri(n)

NAME
uri - URI utilities

SYNOPSIS
package require Tcl 8.2
package require uri ?1.2.1?

uri::split url ?defaultscheme?
uri::join ?key value?...
uri::resolve base url
uri::isrelative url
uri::geturl url ?options...?
uri::canonicalize uri
uri::register schemeList script

DESCRIPTION
This package contains two parts. First, it provides regular expressions for a number of url/uri schemes. Second, it provides a number of commands for manipulating urls/uris and fetching data specified by them. For the latter, this package analyses the requested url/uri and then dispatches it to the appropriate package (http, ftp, ...) for actual fetching.

The package currently does not conform to RFC 2396 (http://www.rfc-editor.org/rfc/rfc2396.txt), but quite likely should. Patches and other help are welcome.

COMMANDS
uri::split url ?defaultscheme?
    uri::split takes a url, decodes it, and then returns a list of key/value pairs suitable for array set containing the constituents of the url. If the scheme is missing from the url, it defaults to the value of defaultscheme if that was specified, or to http otherwise. Currently only the schemes http, ftp, mailto, urn, news, ldap and file are supported by the package itself. See section EXTENDING on how to expand that range.

    The set of constituents of a url (= the set of keys in the returned dictionary) depends on the scheme of the url. The only key which is therefore always present is scheme. For the following schemes the constituents and their keys are known:

    ftp       user, pwd, host, port, path, type
    http(s)   user, pwd, host, port, path, query, fragment. The fragment is optional.
    file      path, host. The host is optional.
    mailto    user, host. The host is optional.
    news      Either message-id or newsgroup-name.

uri::join ?key value?...
    uri::join takes a list of key/value pairs (generated by uri::split, for example) and returns the canonical url they represent. Currently only the schemes http, ftp, mailto, urn, news, ldap and file are supported. See section EXTENDING on how to expand that range.

uri::resolve base url
    uri::resolve resolves the specified url relative to base. In other words: a non-relative url is returned unchanged, whereas for a relative url the missing parts are taken from base and prepended to it. The result of this operation is returned. For an empty url the result is base.

uri::isrelative url
    uri::isrelative determines whether the specified url is absolute or relative.

uri::geturl url ?options...?
    uri::geturl decodes the specified url and then dispatches the request to the package appropriate for the scheme found in the url.

    The command assumes that the package to handle the given scheme either has the same name as the scheme itself (including possible capitalization) followed by ::geturl, or, in case of this failing, has the same name as the scheme itself (including possible capitalization). It further assumes that whatever package was loaded provides a geturl-command in the namespace of the same name as the package itself. This command is called with the given url and all given options. Currently geturl does not handle any options itself.

    Note: file-urls are an exception to the rule described above. They are handled internally.

    It is not possible to specify results of the command. They depend on the geturl-command for the scheme the request was dispatched to.

uri::canonicalize uri
    uri::canonicalize returns the canonical form of a URI. The canonical form of a URI is one where relative path specifications, i.e. . and .., have been resolved.

uri::register schemeList script
    uri::register registers the first element of schemeList as a new scheme and the remaining elements as aliases for this scheme. It creates the namespace for the scheme and executes the script in the new namespace. The script has to declare variables containing the regular expressions relevant to the scheme. At least the variable schemepart has to be declared, as that one is used to extend the variables keeping track of the registered schemes.

SCHEMES
In addition to the commands mentioned above, this package provides regular expressions to recognize urls for a number of url schemes.

For each supported scheme, a namespace of the same name as the scheme itself is provided inside the namespace uri. It contains the variable url, whose contents are a regular expression to recognize urls of that scheme. Additional variables may contain regular expressions for parts of urls for that scheme.

The variable uri::schemes contains a list of all supported schemes. Currently these are ftp, ldap, file, http, gopher, mailto, news, wais and prospero.

EXTENDING
Extending the range of schemes supported by uri::split and uri::join is easy because both commands do not handle the request by themselves but dispatch it to another command in the uri namespace using the scheme of the url as criterion.

uri::split and uri::join call Split[string totitle <scheme>] and Join[string totitle <scheme>] respectively.

CREDITS
Original code (regular expressions) by Andreas Kupries. Modularisation by Steve Ball, also the split/join/resolve functionality.

BUGS, IDEAS, FEEDBACK
This document, and the package it describes, will undoubtedly contain bugs and other problems. Please report such in the category uri of the Tcllib SF Trackers [http://sourceforge.net/tracker/?group_id=12883]. Please also report any ideas for enhancements you may have for either package and/or documentation.

KEYWORDS
fetching information, file, ftp, gopher, http, ldap, mailto, news, prospero, rfc 2255, rfc 2396, uri, url, wais, www

uri 1.2.1                                                            uri(n)