Grabbing URL's from a website


 
Thread Tools Search this Thread
Top Forums Shell Programming and Scripting Grabbing URL's from a website
# 1  
Old 12-15-2008
Grabbing URL's from a website

Hi everyone,

Need your help creating a script that downloads a website after entering the URL e.g. Google, and parses its contents for URL's and appends the output to a text file in this format:

http://grabbed-url-from-downloaded-website.com/
http://another-url-from-downloaded-website.com/

I was thinking maybe wget, and a script that greps for http* or www* and lists all it can find on a flatfile.

Thanks!
Ogoy
# 2  
Old 12-15-2008
Hi

Exactly. What you are trying to do is fine. Did you try any such script? I have done it couple of times, and it seems simple. Let me know if you need any help in parsing the contents.

~$u)hir
# 3  
Old 12-15-2008
You might want to try lwp-request. man lwp-request
# 4  
Old 12-15-2008
I've used the following with fairly good results:

links -dump "http://www.cnn.com" | egrep -o "http:.*"

I append the output > urlist.list however I am unsure how reliable this is. In other words yes, I do need help with parsing haha!

Thanks for the quick response guys Smilie

Ogoy
Login or Register to Ask a Question

Previous Thread | Next Thread

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Grabbing variabes from logs

Hey guys, I am working on a script that needs to grab variables from a log file. The script will run morning 9 to 5pm and save variables for each run every hour, these results I will be aggregating at the end of the day. I am thinking I will be placing the date on each entry in the log every... (7 Replies)
Discussion started by: mo_VERTICASQL
7 Replies

2. Shell Programming and Scripting

Reading URL using Mechanize and dump all the contents of the URL to a file

Hello, Am very new to perl , please help me here !! I need help in reading a URL from command line using PERL:: Mechanize and needs all the contents from the URL to get into a file. below is the script which i have written so far , #!/usr/bin/perl use LWP::UserAgent; use... (2 Replies)
Discussion started by: scott_cog
2 Replies

3. Shell Programming and Scripting

Grabbing data from a secured website

Hello there, I am a beginner in Perl, and I have a challenging project: I have to create a program that checks regularly on an online bank account for new operations, it should then feed a database keeping track of all the money going in and out. Of course the login details of this online... (4 Replies)
Discussion started by: freddie50
4 Replies

4. UNIX for Dummies Questions & Answers

Awk: print all URL addresses between iframe tags without repeating an already printed URL

Here is what I have so far: find . -name "*php*" -or -name "*htm*" | xargs grep -i iframe | awk -F'"' '/<iframe*/{gsub(/.\*iframe>/,"\"");print $2}' Here is an example content of a PHP or HTM(HTML) file: <iframe src="http://ADDRESS_1/?click=5BBB08\" width=1 height=1... (18 Replies)
Discussion started by: striker4o
18 Replies

5. Web Development

Regex to rewrite URL to another URL based on HTTP_HOST?

I am trying to find a way to test some code, but I need to rewrite a specific URL only from a specific HTTP_HOST The call goes out to http://SUB.DOMAIN.COM/showAssignment/7bde10b45efdd7a97629ef2fe01f7303/jsmodule/Nevow.Athena The ID in the middle is always random due to the cookie. I... (5 Replies)
Discussion started by: EXT3FSCK
5 Replies

6. Shell Programming and Scripting

Grabbing Certain Fields

ok, so a script i wrote spits out an output like the below: 2,JABABA,BV=114,CV=1,DF=-113,PCT=99.1228% as you can see, each field is separated by a comma. now, how can I get rid of the first field and ONLY show the rest of the fields. meaning, i want to get rid of the "2,", and... (3 Replies)
Discussion started by: SkySmart
3 Replies

7. UNIX for Dummies Questions & Answers

ReDirecting a URL to another URL - Linux

Hello, I need to redirect an existing URL, how can i do that? There's a current web address to a GUI that I have to redirect to another webaddress. Does anyone know how to do this? This is on Unix boxes Linux. example: https://m45.testing.address.net/host.php make it so the... (3 Replies)
Discussion started by: SkySmart
3 Replies

8. UNIX for Dummies Questions & Answers

Grabbing a value from an output file

I am executing a stored proc and sending the results in a log file. I then want to grab one result from the output parameters (bolded below, 2) so that I can store it in a variable which will then be called in another script. There are more details that get printed in the beginning of the log file,... (3 Replies)
Discussion started by: hern14
3 Replies

9. Shell Programming and Scripting

url calling and parameter passing to url in script

Hi all, I need to write a unix script in which need to call a url. Then need to pass parameters to that url. please help. Regards, gander_ss (1 Reply)
Discussion started by: gander_ss
1 Replies

10. UNIX for Advanced & Expert Users

url calling and parameter passing to url in script

Hi all, I need to write a unix script in which need to call a url. Then need to pass parameters to that url. please help. Regards, gander_ss (1 Reply)
Discussion started by: gander_ss
1 Replies
Login or Register to Ask a Question