Noob question about parsing a website


 
# 1  
Old 03-02-2012

I'm trying to parse the page finance.yahoo.com/q?s=ge&ql=1 and retrieve the text inside <span id="yfs_l84_ge">18.98</span>, i.e. 18.98.

What would be the best way to go about this in a bash script?

Any help or suggestions will be much appreciated.
Thanks!

Last edited by mayson; 03-02-2012 at 07:50 PM..
# 2  
Old 03-03-2012
If a bash script isn't a strict requirement, I would suggest Python, Perl or PHP. Python has a package called BeautifulSoup that does exactly this (you will also need a URL library such as urllib2 to fetch the page). Googling for these should turn up some ready-made scripts as well.

For a bash script, you would use curl and awk (or wget and grep), but it takes time to get right. In a Python parser, by contrast, soup.find("span", {'id' : "yfs_l84_ge"}) gets you the required element directly.
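
To give an idea of the shell route, here is a minimal sketch. It assumes the element sits on one line exactly as quoted in the question, and that grep supports -o (GNU and BSD grep do; it is not in POSIX). The function name get_quote is made up for illustration, and the curl call is left commented out because the live page may no longer use this markup.

```shell
#!/bin/sh
# get_quote (hypothetical helper): read HTML on stdin and print the text
# inside <span id="yfs_l84_ge">...</span>.
get_quote() {
    # grep -o keeps only the matching element; sed then strips the tags.
    grep -o '<span id="yfs_l84_ge">[^<]*</span>' | sed 's/<[^>]*>//g'
}

# Offline check using the snippet from the question (prints 18.98):
printf '%s\n' '<b>GE</b> <span id="yfs_l84_ge">18.98</span>' | get_quote

# Against the live page (assumes curl is installed and the markup is unchanged):
# curl -s 'http://finance.yahoo.com/q?s=ge&ql=1' | get_quote
```

This kind of pattern matching breaks as soon as the attribute order or quoting changes, which is the main reason a real HTML parser is less work in the long run.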

Here's some quick, untested Python code:

Code:
# Python 2 with BeautifulSoup 3
from BeautifulSoup import BeautifulSoup
import urllib2

url = "<your website to crawl>"
user_agent = 'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)'

# Send a browser-like User-Agent; some sites reject the default one.
headers = {'User-Agent': user_agent}

# No data argument is passed, so this is a plain GET request.
req = urllib2.Request(url, headers=headers)
# Open the Request object (not the bare URL) so the headers are actually sent.
response = urllib2.urlopen(req)
the_page = response.read()
soup = BeautifulSoup(the_page)

# Find the span that holds the quote and print its text.
span = soup.find("span", {'id': "yfs_l84_ge"})
if span:
    print span.contents[0].strip()


Last edited by eosbuddy; 03-03-2012 at 01:52 AM.. Reason: add code
# 3  
Old 03-03-2012
Try awk:
Code:
awk -F\> '/^span id="yfs_l84_ge"/{print $2}' RS=\<

Code:
awk '$1==s{print $2}' RS=\< FS=\> s='span id="yfs_l84_ge"'

Code:
awk '$1=="span id=\"" i "\""{print $2}' RS=\< FS=\> i="yfs_l84_ge"
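
The awk one-liners above read the page from standard input, so something has to be piped into them. A minimal sketch of how that might look, wrapping the second variant in a function so the id becomes an argument. The name extract_span is made up for illustration, and the commented curl line assumes curl is installed and the page markup has not changed:

```shell
#!/bin/sh
# extract_span ID (hypothetical helper): read HTML on stdin and print the
# text that follows the opening tag <span id="ID">.
extract_span() {
    # RS=< starts a new record at every tag; FS=> splits each record into
    # the tag itself ($1) and the text after it ($2).
    awk '$1==s{print $2}' RS=\< FS=\> s="span id=\"$1\""
}

# Offline check with the snippet from the question (prints 18.98):
printf '%s\n' '<div><span id="yfs_l84_ge">18.98</span></div>' | extract_span yfs_l84_ge

# Against the live page:
# curl -s 'http://finance.yahoo.com/q?s=ge&ql=1' | extract_span yfs_l84_ge
```

Note that the match is exact: the tag must be written as span id="..." with no other attributes and no extra whitespace, otherwise nothing is printed.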


Last edited by Scrutinizer; 03-03-2012 at 06:45 AM..
 