Python Newbie Question Regex | Unix Linux Forums | Shell Programming and Scripting

python, regex

Posted 03-06-2013 by metallica1973

I started teaching myself Python and am stuck trying to understand why I am not getting the output I want. Long story short, I am using pdb for debugging, and here is the function in which I am having the issue:

import re

def find_all_flvs(url):
    soup = BeautifulSoup(urllib2.urlopen(url))
    flvs = []
    for link in soup.findAll(onclick=re.compile("doShowCHys=1*")):
        link = str(link)
        vidnum   = re.search("\d{5,6}.*&amp", link)
        vidurl   = "" % vidnum

        for hashval_url in BeautifulSoup(urllib2.urlopen(vidurl)).findAll("flv"):


    return flvs

I verified that my regex (\d{5,6}.*&amp) is correct against:

"/home/Player.aspx?lpk4=108148&playChapter=True\',960,540,94343);return false;"



which is what I want. However, when I step through with pdb and get to:

vidnum   = re.search("\d{5,6}.*&amp", link)

and this is what I end up with as the output:

<_sre.SRE_Match object at 0xaaf8de8>

when I should be seeing:


so it can be simply appended to:

vidurl   = "" % vidnum


(pdb)p vidurl


I have been through several urls and cannot seem to figure out what I am doing wrong:

Python Regular Expressions


---------- Post updated at 04:37 PM ---------- Previous update was at 04:21 PM ----------

I made progress. The things you can find out by just reading:

re.search(pattern, string, flags=0)
Scan through string looking for a location where the regular expression pattern produces a match, and return a corresponding MatchObject instance. Return None if no position in the string matches the pattern; note that this is different from finding a zero-length match at some point in the string.

re.findall(pattern, string, flags=0)
Return all non-overlapping matches of pattern in string, as a list of strings. The string is scanned left-to-right, and matches are returned in the order found. If one or more groups are present in the pattern, return a list of groups; this will be a list of tuples if the pattern has more than one group. Empty matches are included in the result unless they touch the beginning of another match.

I was simply using the wrong function. I replaced re.search with re.findall and it partially worked.

vidnum   = re.findall("\d{5,6}.*&amp", link)
(pdb)p vidnum
(pdb)p vidurl
['108148&amp']
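
The difference between the two functions can be seen side by side. This is a minimal sketch in Python 3, not the poster's code: the sample string is modelled on the onclick value quoted earlier, and the "&amp;" entity in it is an assumption inferred from the '108148&amp' result.

```python
import re

# Sample text modelled on the onclick value in the post. The "&amp;" entity
# is an assumption based on the '108148&amp' result reported in the thread.
link = "/home/Player.aspx?lpk4=108148&amp;playChapter=True',960,540,94343);return false;"
pattern = r"\d{5,6}.*&amp"

match = re.search(pattern, link)    # a match object, or None if nothing matches
found = re.findall(pattern, link)   # a list of matched strings, possibly empty

print(match)            # the object itself, not the matched text
print(match.group(0))   # 108148&amp
print(found)            # ['108148&amp']
```

Printing the match object shows its repr, which is why pdb displayed an object address. Calling group(0) on it gives the matched text, while findall returns the text wrapped in a list.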

How do I remove the brackets and single quotes to produce only:



---------- Post updated at 04:53 PM ---------- Previous update was at 04:37 PM ----------

It turned out that vidnum is a list, and I needed to index into it to get the string, so:

vidurl   = "" % vidnum[0]
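
The indexing fix can be sketched with a guard for the case where nothing matches. This is a rough Python 3 sketch under stated assumptions: URL_TEMPLATE and build_vidurl are hypothetical names, and the example.com address only stands in for the URL that is not shown in the post.

```python
import re

# Hypothetical template: the real URL from the post is not shown.
URL_TEMPLATE = "http://example.com/video/%s"

def build_vidurl(link):
    """Return the formatted URL, or None when the pattern finds nothing."""
    found = re.findall(r"\d{5,6}.*&amp", link)
    if not found:
        # findall returns [] on no match, so found[0] would raise IndexError
        return None
    # Use the first match only; it is a plain string, so no brackets or quotes
    return URL_TEMPLATE % found[0]

print(build_vidurl("lpk4=108148&amp;playChapter=True"))
print(build_vidurl("no digits here"))
```

Formatting the whole list with % is what puts the brackets and quotes into the result, because the list's repr is substituted. Formatting one element avoids that.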

Last edited by metallica1973; 03-06-2013 at 06:39 PM..
Posted 03-06-2013 by Chubler_XL (Forum Staff)
You could also try:

refound = re.search('\d{5,6}(?=&amp)', link)

if refound:
    vidurl   = "" % refound.group(0)
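
The effect of the (?=&amp) lookahead can be checked on a short string. This is a minimal Python 3 sketch, and the sample text is assumed for illustration: the lookahead requires "&amp" to follow the digits but leaves it out of the match, so only the number is returned.

```python
import re

# Sample text assumed for illustration, modelled on the link in the thread.
link = "lpk4=108148&amp;playChapter=True"

with_suffix = re.search(r"\d{5,6}.*&amp", link)     # consumes "&amp" as part of the match
digits_only = re.search(r"\d{5,6}(?=&amp)", link)   # only checks that "&amp" follows

print(with_suffix.group(0))   # 108148&amp
print(digits_only.group(0))   # 108148
```

Unlike the .* version, the lookahead pattern only matches when "&amp" comes immediately after the digits.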

Last edited by Chubler_XL; 03-06-2013 at 08:11 PM..