Python BeautifulSoup Re Finding Digits Within Tags
I am writing a little Python script that needs to grab version numbers from cells such as "<td>4.2.2</td>" within the tbody of the page.
Is it possible to use a one-liner to scrape only the digits between the tags:
"<td>4.2.2</td>"
so it spits out:
4.2.2
4.2.1
etc..
This is what I have done so far, but I don't understand why it creates the variable rpart as a ResultSet and not a regular string that I can scrape the data from.
Is there a way to do this as a one-liner?
So what I am trying to do is:
1 - Search through the HTML page and capture only the first <tbody>...</tbody>, hence limit=1
2 - Run a regex over the results and print only the digits that are inside the <td>\d+\.\d+\.\d+</td> tags
3 - Resulting in:
4.2.2
4.2.1
etc..
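On the ResultSet question: in BeautifulSoup, find_all() always returns a ResultSet (a list subclass), even with limit=1, whereas find() returns a single Tag. The three steps above could be sketched roughly as below. The sample HTML and the names version_re, tbody and versions are made up for illustration, since the real page is not shown in the thread, and the code assumes Python 3 with the bs4 package installed.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical sample page; the real HTML is not shown in the thread.
html = """
<table>
  <tbody>
    <tr><td>4.2.2</td><td>stable</td></tr>
    <tr><td>4.2.1</td><td>old</td></tr>
  </tbody>
</table>
<table><tbody><tr><td>9.9.9</td></tr></tbody></table>
"""

# Assumed version format: three dot-separated groups of digits.
version_re = re.compile(r"\d+\.\d+\.\d+")

soup = BeautifulSoup(html, "html.parser")

# Step 1: find() returns the first <tbody> as a single Tag,
# so there is no ResultSet to index into.
tbody = soup.find("tbody")

# Steps 2 and 3: keep only the <td> cells whose whole text is a version number.
versions = [
    td.get_text(strip=True)
    for td in tbody.find_all("td")
    if version_re.fullmatch(td.get_text(strip=True))
]

print("\n".join(versions))
```

If you prefer to keep find_all('tbody', limit=1), take element [0] of the result before searching inside it.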
Last edited by metallica1973; 07-20-2015 at 04:58 PM..
The best thing about using a language like Python is that it gives you ready-made parsers to make your life simpler, so you need not resort to (cheaper?) techniques like regex (leave those things to Perl :-D).
What you're trying to parse looks like an HTML file. Take a look at the HTMLParser module and see if you can cook something up using that.
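A minimal sketch of the HTMLParser route, assuming Python 3, where the module lives at html.parser (in Python 2 it is the HTMLParser module). The class name TdVersionParser, the sample HTML and the version pattern are all hypothetical; the real page is not shown in the thread.

```python
import re
from html.parser import HTMLParser

# Assumed version format: three dot-separated groups of digits.
VERSION_RE = re.compile(r"\d+\.\d+\.\d+")


class TdVersionParser(HTMLParser):
    """Collect version-like text from <td> cells inside the first <tbody>."""

    def __init__(self):
        super().__init__()
        self.in_tbody = False    # currently inside the first <tbody>
        self.tbody_done = False  # the first <tbody> has already been closed
        self.in_td = False       # currently inside a <td> of that <tbody>
        self.versions = []

    def handle_starttag(self, tag, attrs):
        if tag == "tbody" and not self.tbody_done:
            self.in_tbody = True
        elif tag == "td" and self.in_tbody:
            self.in_td = True

    def handle_endtag(self, tag):
        if tag == "tbody" and self.in_tbody:
            self.in_tbody = False
            self.tbody_done = True
        elif tag == "td":
            self.in_td = False

    def handle_data(self, data):
        text = data.strip()
        if self.in_td and VERSION_RE.fullmatch(text):
            self.versions.append(text)


# Hypothetical sample page.
html = """
<table><tbody>
<tr><td>4.2.2</td><td>stable</td></tr>
<tr><td>4.2.1</td><td>old</td></tr>
</tbody></table>
<table><tbody><tr><td>9.9.9</td></tr></tbody></table>
"""

parser = TdVersionParser()
parser.feed(html)
print("\n".join(parser.versions))
```

This is more code than a regex, but it tracks where it is in the document, so cells outside the first tbody are ignored.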
After putting a little elbow grease into this, I was able to accomplish what I needed to do with BeautifulSoup and re:
My next question is: how do I get rid of the None?
---------- Post updated at 05:46 PM ---------- Previous update was at 02:39 PM ----------
blah=filter(None, blah)
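A small sketch of what that filter call does. The contents of blah here are invented; one plausible way to end up with None entries is a regex match that fails on some cells. Note that filter(None, ...) drops every falsy value (None, empty strings, 0), and that in Python 3 it returns an iterator, so it is wrapped in list() below.

```python
import re

# Assumed version format: three dot-separated groups of digits.
VERSION_RE = re.compile(r"\d+\.\d+\.\d+")

# Hypothetical cell texts scraped from the table.
cells = ["4.2.2", "stable", "4.2.1", "old"]

# fullmatch() returns None when a cell is not a version number,
# which is how None values can end up in the list.
blah = []
for cell in cells:
    match = VERSION_RE.fullmatch(cell)
    blah.append(match.group(0) if match else None)

# filter(None, ...) keeps only truthy items; list() is needed in Python 3.
cleaned = list(filter(None, blah))

# Equivalent list comprehension that removes only None, not other falsy values.
cleaned_strict = [v for v in blah if v is not None]

print(cleaned)
```

The list comprehension is the safer choice if empty strings or zeros are legitimate values you want to keep.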
Last edited by metallica1973; 07-23-2015 at 04:34 PM..
Hi All ,
I am having an input file as stated below
5728 U_TOP_LOGIC/U_CM0P/core/u_cortexm0plus/u_top/u_sys/u_core/r03_q_reg_20_/Q 011
611 U_TOP_LOGIC/U_CM0P/core/u_cortexm0plus/u_top/u_sys/u_core/r04_q_reg_20_/Q 011
3486... (4 Replies)
Hi all...
As you know I like making code backwards compatible for as many platforms as possible.
This Python script was in fact dedicated for the AMIGA A1200 using Pythons 1.4.0, 1.5.2, 1.6.0, 2.0.1, and 2.4.6 as that is all we have for varying levels of upgrades from a HDD and 4MB FastRam... (1 Reply)
I am working on requirement on spreadsheet in python scripting.
I have a spreadsheet containing cell values and with background color.
I am able to read the cell value but unable to get the background color of that particular cell.
Actually my requirement is to read the cell value along... (1 Reply)
Hello all! I've looked all over the internet and this site and have come up a loss with an easy way to make a bash script to do what I want to do. I have a file with a naming convention as follows:
2012-01-18 string of words here 123.jpg
2012-01-18 string of words here 1234.jpg
2012-01-18... (2 Replies)
I have a list containing strings. All strings should have either "smp" or "drw" else it is considered an error. I have written this code below. Any better ideas to tackle this?
set fdrw = 0
set fsmp = 0
foreach f ($Lst)
set fdrwtag = `echo $f | awk '/drw/'`
set fsmptag = `echo $f | awk... (1 Reply)
I have a file with a lot of lines (a lot!) that contain 10 digits between double quotes. ie "1726937489". The digits are random throughout, but always contain ten digits.
I can not for the life of me, (via scouring the internet and grep how-to manuals) figure out how to find this when I search.... (3 Replies)
I have the following script and want to check if in each $f there exists either a "drw" or "smp" tag in the file name. How can I do it?
For example
npt06-32x24drw has the "drw" tag
npt06-32x24smp has the "smp" tag
npt06-32x24 no "drw" or "smp" tag found
#!/bin/csh
set iarg = 0... (0 Replies)
Hi Folks
Probably an easy one here but how do I get a sequence to get used as mentioned. For example in the following I want to automatically create files that have a 2 digit number at the end of their names:
m@pyhead:~$ for x in $(seq 00 10); do touch file_$x; done
m@pyhead:~$ ls file*... (2 Replies)