#1
Python Newbie Question Regex
I started teaching myself Python and am stuck on why I am not getting the output I want. Long story short, I am using pdb for debugging, and here is the function where I am having the issue: Code:
import re
...
...
...
def find_all_flvs(url):
    soup = BeautifulSoup(urllib2.urlopen(url))
    flvs = []
    for link in soup.findAll(onclick=re.compile("doShowCHys=1*")):
        link = str(link)
        vidnum = re.search("\d{5,6}.*&", link)
        vidurl = "http://www.blahblah.com/home/GetPlayerXML.aspx?lpk4=%s" % vidnum
        for hashval_url in BeautifulSoup(urllib2.urlopen(vidurl)).findAll("flv"):
            flvs.append(hashval_url.text)
    return flvs
I verified that my regex (\d{5,6}.*&) is correct: Code:
"/home/Player.aspx?lpk4=108148&playChapter=True\',960,540,94343);return false;"
produces: Code:
108148
which is what I want. But when I step through with pdb and get to: Code:
vidnum = re.search("\d{5,6}.*&", link)
this is what I end up with as the output: Code:
<_sre.SRE_Match object at 0xaaf8de8>
whereas I should be seeing: Code:
108148
so that it can simply be appended to: Code:
vidurl = "http://www.blahblah.com/home/GetPlayerXML.aspx?lpk4=%s" % vidnum
producing: Code:
(pdb)p vidurl
http://www.blahblah.com/home/GetPlay...px?lpk4=108148
I have been through several URLs and cannot seem to figure out what I am doing wrong: Python Regular Expressions ??
---------- Post updated at 04:37 PM ---------- Previous update was at 04:21 PM ----------
I made progress. The things you can find out by just reading. Code:
vidnum = re.findall("\d{5,6}.*&", link)
(pdb)p vidnum
['108148&']
(pdb)p vidurl
http://www.blahblah.com/home/GetPlay...px?lpk4=108148['108148&']
How do I remove the brackets and single quotes to produce only: Code:
http://www.blahblah.com/home/GetPlay...px?lpk4=108148&
??
---------- Post updated at 04:53 PM ---------- Previous update was at 04:37 PM ----------
It turned out that vidnum is a list, so I needed to specify which element of the list I wanted: Code:
vidurl = "http://www.blahblah.com/home/GetPlayerXML.aspx?lpk4=%s" % vidnum[0]
Last edited by metallica1973; 03-06-2013 at 05:39 PM.
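For anyone reading along, here is a minimal sketch of why the two calls behaved differently, run against the sample onclick string quoted above. The capture-group variant at the end and the names `match`, `found`, and `digits` are illustrative assumptions, not part of the original function.

```python
import re

# Sample string quoted earlier in the post.
link = "/home/Player.aspx?lpk4=108148&playChapter=True',960,540,94343);return false;"

# re.search returns a match object (or None), not the matched text,
# so formatting it with %s prints the object's repr.
match = re.search(r"\d{5,6}.*&", link)
print(match.group(0))   # 108148&

# re.findall returns a list of matched strings, which is where the
# brackets and quotes came from; indexing picks out one string.
found = re.findall(r"\d{5,6}.*&", link)
print(found)            # ['108148&']
print(found[0])         # 108148&

# Assumption: only the digits are wanted in the URL. A capture group
# keeps the trailing "&" out of the extracted value.
digits = re.search(r"(\d{5,6})&", link)
vidnum = digits.group(1) if digits else None
print(vidnum)           # 108148
```

Note that `found[0]` still carries the trailing `&`, since the pattern itself matches it.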
#2
You could also try: Code:
refound = re.search('\d{5,6}(?=&)', link)
if refound:
    vidurl = "http://www.blahblah.com/home/GetPlayerXML.aspx?lpk4=%s" % refound.group(0)
Last edited by Chubler_XL; 03-06-2013 at 07:11 PM.
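As a rough illustration of the lookahead suggestion, the sketch below wraps it in a small helper and compares it with the original greedy pattern. The helper name `build_vidurl`, the constant `BASE`, and the `two_amps` test string are hypothetical and only serve the example.

```python
import re

BASE = "http://www.blahblah.com/home/GetPlayerXML.aspx?lpk4=%s"

def build_vidurl(link):
    """Return the player XML URL for link, or None if no video number is found."""
    # (?=&) requires an "&" right after the digits without consuming it.
    refound = re.search(r"\d{5,6}(?=&)", link)
    if refound is None:
        return None
    return BASE % refound.group(0)

sample = "/home/Player.aspx?lpk4=108148&playChapter=True',960,540,94343);return false;"
print(build_vidurl(sample))
print(build_vidurl("no video number here"))

# Hypothetical link with a second "&": the greedy ".*&" runs to the last "&",
# while the lookahead version still stops at the digits.
two_amps = "/home/Player.aspx?lpk4=108148&playChapter=True&x=1"
print(re.search(r"\d{5,6}.*&", two_amps).group(0))     # 108148&playChapter=True&
print(re.search(r"\d{5,6}(?=&)", two_amps).group(0))   # 108148
```

Checking for `None` before calling `.group()` matters because `re.search` returns `None` when nothing matches, and calling `.group()` on it raises an `AttributeError`.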