find and replace urls


 
# 1  
Old 10-07-2008

I need to archive a large website onto a DVD. Many of the links and image srcs are absolute URLs. As I don't want to alter them all manually, I'm looking for a perl or unix command that would remove:

http://www.mydomain.com/mysubfolder/

and replace with:

./

Can anyone help me with this mission? I've tried various combinations of unix commands and perl, but without success...
# 2  
Old 10-08-2008
Code:
sed 's|http://www\.mydomain\.com/mysubfolder/|./|g'
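
A quick way to check a sed expression like this is to run it on one sample line before touching any files. The sketch below is only an illustration: the function name `relativize` and the sample HTML are made up, and the URL is the one from the question. It matches the full URL prefix, so anything earlier on the line (such as the `<a href="` part) is left alone.

```shell
# Hypothetical helper: read HTML on stdin, write it to stdout with the
# absolute prefix replaced by "./".
# "|" is used as the sed delimiter so the slashes need no escaping,
# and the dots are escaped so they match literal dots only.
relativize() {
    sed 's|http://www\.mydomain\.com/mysubfolder/|./|g'
}

# Sample line (made up for this example).
printf '%s\n' '<a href="http://www.mydomain.com/mysubfolder/page.html">Page</a>' | relativize
# prints: <a href="./page.html">Page</a>
```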

# 3  
Old 10-08-2008
To recurse through the whole directory structure, use find.

Code:
find path/to/topleveldirectory -type f \
  -exec perl -pi -e 's%http://www\.mydomain\.com/mysubfolder/%./%g' {} \;

This will update the time stamp of all files, and needlessly rewrite those which don't contain any match. If that's a problem, perhaps grep for the string first, and only process the files which match.

Code:
find path/to/topleveldir -type f \
  -exec grep -qF 'http://www.mydomain.com/mysubfolder/' {} \; \
  -exec perl -pi -e 's%http://www\.mydomain\.com/mysubfolder/%./%g' {} \;
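
Before rewriting anything in place, it may help to see which files actually contain the URL. The following is a minimal sketch, not part of the original reply: the function name `list_matching_files` is made up, and it assumes a find that supports `-exec ... {} +`.

```shell
# Hypothetical helper: print the name of every file under directory $1
# that contains the absolute URL prefix.
# grep -F treats the URL as a fixed string (no regex escaping needed),
# and -l prints only the names of matching files.
list_matching_files() {
    find "$1" -type f -exec grep -lF 'http://www.mydomain.com/mysubfolder/' {} +
}
```

Running `list_matching_files path/to/topleveldir` prints one path per line. The list also shows whether any matches sit in subdirectories: from a page in a deeper directory, `./` resolves relative to that directory rather than to the top of mysubfolder, so those pages would need `../` prefixes, which the commands in this thread do not handle.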

# 4  
Old 10-08-2008
wget has a mode to do this automatically, if I'm not mistaken.
Actually, a simple Google search comes up with this:
Downloading an Entire Web Site with wget | Linux Journal
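
The mode in question is presumably wget's `--convert-links` option, which rewrites links in the downloaded pages so they work for local viewing. A minimal sketch, assuming the site is still reachable over HTTP and that downloading a fresh copy is acceptable (this does not fix files you already have on disk); the function name `mirror_site` is made up for the example.

```shell
# Hypothetical wrapper: download a local, browsable copy of the site at $1.
#   --mirror           recursive download with timestamping
#   --convert-links    rewrite links in saved pages for local viewing
#   --page-requisites  also fetch the images and stylesheets pages need
#   --no-parent        do not ascend above the starting directory
mirror_site() {
    wget --mirror --convert-links --page-requisites --no-parent "$1"
}
```

It would be called as `mirror_site http://www.mydomain.com/mysubfolder/`, after which the resulting directory tree can be burned to the DVD.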
 