02-26-2008
Replacing URL in a file with space
Hi,
I have a file with a URL written in it within double quotes, e.g.
"http://abcd.xyz.com/mno/somefile.dtd"
I want the above text to get replaced by a single space character.
I tried
cat File1.txt | sed -e 's/("http)*(dtd")/ /g' > File2.txt
But it didn't work out. Can someone suggest a sed command that replaces this URL text, including the double quotes, with a space?
Thanks
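A minimal sketch of one way to do this (assuming a POSIX sed, and that the URL always starts with "http and ends with dtd"):

  sed 's/"http[^"]*dtd"/ /g' File1.txt > File2.txt

The bracket expression [^"]* matches everything up to the closing quote, so the whole quoted URL, quotes included, is replaced with a single space. No cat pipeline is needed; sed reads the file directly.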
10 More Discussions You Might Find Interesting
1. Shell Programming and Scripting
I want to write a script which will check the arguments and, if there is a single space (if there are 2 or more spaces in a row, do not touch it), replace it with _ and then gather the arguments.
So the program will be run as
./programname hi hello hi usa now hello hello
so, inside of program,... (7 Replies)
Discussion started by: convenientstore
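A sketch of the core substitution (assuming Perl is available and the text arrives as one quoted argument; the lookarounds leave runs of two or more spaces untouched):

  echo "$1" | perl -pe 's/(?<! ) (?! )/_/g'

(?<! ) and (?! ) assert that the matched space has no space on either side, so only isolated single spaces become underscores.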
2. UNIX for Dummies Questions & Answers
Hello All,
I have a file with thousands of records:
e.g.:
|000222|123456987|||||||AARONSON| JOHN P|||PRIMARY |P
|000111|567894521|||||||ATHENS| WILLIAM k|||AAAA|L
Expected:
|000222|123456987|||||||AARONSON| JOHN |P|||PRIMARY |P
|000111|567894521|||||||ATHENS| WILLIAM |k|||AAAA|L
I... (6 Replies)
Discussion started by: OSD
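Judging from the sample, the goal is to split the trailing single-letter initial of the name field into its own field. A sketch (assuming a sed that supports -E, and that the initial is always one letter directly before a pipe):

  sed -E 's/ ([A-Za-z])\|/ |\1|/' file

Without the g flag, only the first such occurrence on each line is changed, which matches the expected output.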
3. Shell Programming and Scripting
Hi,
I have a file that is space-separated at all columns. Basically, what I want to do is replace all the space separators with column separators.
Thanks
kylle (1 Reply)
Discussion started by: kylle345
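One way to squeeze each run of spaces into a single delimiter (assuming a comma is the desired column separator; the post does not say which delimiter is wanted):

  tr -s ' ' ',' < file > file.csv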
4. UNIX for Advanced & Expert Users
I would like to replace the value of * (which might have one or more whitespace characters before and after the *) using the sed command on AIX.
Eg: Var='Hi I am there *
Desired output: Hi I am there* (1 Reply)
Discussion started by: techmoris
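A sketch that strips whitespace around a literal * (assuming a POSIX sed, as shipped with AIX):

  echo "$Var" | sed 's/[[:space:]]*\*[[:space:]]*/*/g'

The * must be escaped as \* so sed treats it as a literal character rather than a quantifier.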
5. Shell Programming and Scripting
Hi,
I want to replace spaces with commas.
My file is:
ADD 16428 170 160 3 WNPG 204 941 No 204802
ADD 16428 170 160 3 WNPG 204 941 No 204803
ADD 16428 170 160 3 WNPG 204 941 No 204804
ADD... (9 Replies)
Discussion started by: raghavendra.cse
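A sketch that collapses each run of spaces into a single comma (assuming the fields themselves never contain spaces):

  sed 's/  */,/g' file

The pattern is a space followed by zero or more spaces, i.e. one or more spaces.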
6. Shell Programming and Scripting
I'm trying to replace the string "99999999'" with a blank wherever it occurs in the file. Could you please help with the unix scripting?
Thank You. (6 Replies)
Discussion started by: vsairam
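A sketch that deletes every occurrence of that string (assuming the target really is 99999999 followed by a single quote, as posted):

  sed "s/99999999'//g" file > file.out

Double quotes around the sed expression let the single quote appear inside it.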
7. Shell Programming and Scripting
Hi Masters,
I have a file whose header is like
HDRCZECM8CZCM000000881 SVR00120100401160828+020020100401160828+0200CZK
There is a space between 1 and S; my requirement is to change that space to T.
I tried echo `head -1 CDCZECM8CZCM000000881` | sed 's/ /T/'
It works, but how can I modify in... (5 Replies)
Discussion started by: Pratik4891
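A sketch that edits only the header line, in place (assuming GNU sed, whose -i option rewrites the file; on other systems, redirect to a temporary file and move it back):

  sed -i '1s/ /T/' CDCZECM8CZCM000000881

The address 1 restricts the substitution to the first line.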
8. Shell Programming and Scripting
I have a string and want to replace the / with a space.
For example having "SP/FS/RP" I want to get "SP FS RP"
However, I am having problems using gsub:
set phases = `echo $Aphases | awk '{gsub(///," ")}; {print}'` (5 Replies)
Discussion started by: kristinu
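The gsub call fails because the slash has to be escaped inside the regex delimiters. A sketch of the corrected line (keeping the csh syntax from the post):

  set phases = `echo $Aphases | awk '{gsub(/\//," ")} {print}'`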
9. Web Development
Hello,
I have a situation where I am trying to use Apache's RedirectMatch directive to redirect all users to an HTTPS URL except a single (Linux) user accessing their own webspace. I have found a piece of regular expression code that negates the username:
^((?!andy).)*$ but when I try using it... (0 Replies)
Discussion started by: LostInTheWoods
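A sketch of how that negative lookahead might be wired into RedirectMatch (assuming Apache httpd with PCRE; the ~andy webspace path and example.com hostname are illustrative guesses, and this is untested):

  RedirectMatch "^/((?!~andy).*)$" "https://www.example.com/$1"

Any request path that does not begin with /~andy is captured in $1 and redirected to the HTTPS site.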
10. Shell Programming and Scripting
Hello,
I am very new to Perl, please help me here!
I need help reading a URL from the command line using Perl's WWW::Mechanize, and I need all the contents from the URL to go into a file.
Below is the script which I have written so far:
#!/usr/bin/perl
use LWP::UserAgent;
use... (2 Replies)
Discussion started by: scott_cog
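A minimal sketch of the task as described (assuming WWW::Mechanize is installed; the output filename page.html is made up for illustration):

  #!/usr/bin/perl
  use strict;
  use warnings;
  use WWW::Mechanize;

  # Take the URL from the command line.
  my $url = shift @ARGV or die "Usage: $0 URL\n";

  # WWW::Mechanize dies on HTTP errors by default (autocheck is on).
  my $mech = WWW::Mechanize->new();
  $mech->get($url);

  # Write the raw page content to a file.
  open my $fh, '>', 'page.html' or die "Cannot open page.html: $!\n";
  print $fh $mech->content;
  close $fh;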
LEARN ABOUT REDHAT
WWW::RobotRules
WWW::RobotRules(3) User Contributed Perl Documentation WWW::RobotRules(3)
NAME
WWW::RobotRules - Parse robots.txt files
SYNOPSIS
    require WWW::RobotRules;
    my $robotsrules = new WWW::RobotRules 'MOMspider/1.0';

    use LWP::Simple qw(get);

    $url = "http://some.place/robots.txt";
    my $robots_txt = get $url;
    $robotsrules->parse($url, $robots_txt);

    $url = "http://some.other.place/robots.txt";
    my $robots_txt = get $url;
    $robotsrules->parse($url, $robots_txt);

    # Now we are able to check if a URL is valid for those servers that
    # we have obtained and parsed "robots.txt" files for.
    if ($robotsrules->allowed($url)) {
        $c = get $url;
        ...
    }
DESCRIPTION
This module parses a /robots.txt file as specified in "A Standard for Robot Exclusion", described in
<http://info.webcrawler.com/mak/projects/robots/norobots.html>. Webmasters can use the /robots.txt file to disallow conforming robots access
to parts of their web site.
The parsed file is kept in the WWW::RobotRules object, and this object provides methods to check if access to a given URL is prohibited.
The same WWW::RobotRules object can parse multiple /robots.txt files.
The following methods are provided:
$rules = WWW::RobotRules->new($robot_name)
This is the constructor for WWW::RobotRules objects. The first argument given to new() is the name of the robot.
$rules->parse($robot_txt_url, $content, $fresh_until)
The parse() method takes as arguments the URL that was used to retrieve the /robots.txt file, and the contents of the file.
$rules->allowed($uri)
Returns TRUE if this robot is allowed to retrieve this URL.
$rules->agent([$name])
Get/set the agent name. NOTE: Changing the agent name will clear the robots.txt rules and expire times out of the cache.
ROBOTS.TXT
The format and semantics of the "/robots.txt" file are as follows (this is an edited abstract of
<http://info.webcrawler.com/mak/projects/robots/norobots.html>):
The file consists of one or more records separated by one or more blank lines. Each record contains lines of the form
    <field-name>: <value>
The field name is case insensitive. Text after the '#' character on a line is ignored during parsing. This is used for comments. The
following <field-names> can be used:
User-Agent
The value of this field is the name of the robot the record is describing access policy for. If more than one User-Agent field is
present, the record describes an identical access policy for more than one robot. At least one field needs to be present per record. If
the value is '*', the record describes the default access policy for any robot that has not matched any of the other records.
Disallow
The value of this field specifies a partial URL that is not to be visited. This can be a full path or a partial path; any URL that
starts with this value will not be retrieved.
ROBOTS.TXT EXAMPLES
The following example "/robots.txt" file specifies that no robots should visit any URL starting with "/cyberworld/map/" or "/tmp/":
    User-agent: *
    Disallow: /cyberworld/map/ # This is an infinite virtual URL space
    Disallow: /tmp/            # these will soon disappear
This example "/robots.txt" file specifies that no robots should visit any URL starting with "/cyberworld/map/", except the robot called
"cybermapper":
    User-agent: *
    Disallow: /cyberworld/map/ # This is an infinite virtual URL space

    # Cybermapper knows where to go.
    User-agent: cybermapper
    Disallow:
This example indicates that no robots should visit this site further:
    # go away
    User-agent: *
    Disallow: /
SEE ALSO
LWP::RobotUA, WWW::RobotRules::AnyDBM_File
libwww-perl-5.65 2001-04-20 WWW::RobotRules(3)