Script to extract forum posts


 
# 1  
Old 02-01-2011

What I have:

HTML Code:
thread_id=666&page=6666#666666">Post title 1</a><br><div style="padding:2px 0px 3px 0px;">Text from the post itself</div>

thread_id=666&page=6666#666666">Post title 2</a><br><div style="padding:2px 0px 3px 0px;">Text from the post itself</div>

thread_id=666&page=6666#666666">Post title 3</a><br><div style="padding:2px 0px 3px 0px;">Text from the post itself</div>
What I want as result:

Quote:
Post title 1

Text from the post itself

Post title 2

Text from the post itself

Post title 3

Text from the post itself
I assume there is a fairly easy way to do this with awk?
# 2  
Old 02-01-2011
One way:
Code:
perl -ne  'if(/>([^>]*)<.*div.*>([^>]*)</){print $1."\n\n".$2."\n\n";}' file

# 3  
Old 02-01-2011
With awk...

Code:
awk -F"[><]" 'NF{print $2"\n"$(NF-2)}' infile

# 4  
Old 02-01-2011
Or:
Code:
awk '{print $2"\n\n"$8"\n"}' FS='(>)|(<)' RS= file

# 5  
Old 02-01-2011
Thanks everyone.

Is there any way to apply this to a full HTML page, where there is a lot more than in "What I have"? That is, first find thread_id= and capture everything from there up to the first </div>.
# 6  
Old 02-01-2011
Quote:
Originally Posted by KidCactus
Thanks everyone.

Is there any way to apply this to a full HTML page, where there is a lot more than in "What I have"? That is, first find thread_id= and capture everything from there up to the first </div>.
How about showing what you have, with an example?
# 7  
Old 02-01-2011
That would of course be a better idea.

I have this HTML file (attached to the post), and I want to extract all text between:

thread_id=666&page=6666#666666">

and

</a>

and then the text between

<div style="padding:2px 0px 3px 0px;">

and

</div>

wherever that occurs in the html file.
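
A minimal sketch of one way to do this across a whole page, assuming a POSIX awk and assuming every title link is followed by its matching div before the next thread_id link appears. The function name extract_posts and the file name page.html are made up for illustration; the two marker strings are taken from the post above.

```shell
#!/bin/sh
# extract_posts (hypothetical name): print the title and text of every post
# found in the HTML files given as arguments, or in stdin if none are given.
extract_posts() {
    awk '
    { page = page $0 "\n" }    # read the whole page into one string
    END {
        div_open = "<div style=\"padding:2px 0px 3px 0px;\">"
        # Each pass finds the next link ending in thread_id=...">
        while (match(page, /thread_id=[^"]*">/)) {
            # Title: everything up to the closing </a>
            page = substr(page, RSTART + RLENGTH)
            pos = index(page, "</a>")
            if (pos == 0) break
            title = substr(page, 1, pos - 1)
            page = substr(page, pos + 4)

            # Body: everything between the styled div and the next </div>
            pos = index(page, div_open)
            if (pos == 0) break
            page = substr(page, pos + length(div_open))
            pos = index(page, "</div>")
            if (pos == 0) break
            body = substr(page, 1, pos - 1)
            page = substr(page, pos + 6)

            print title "\n\n" body "\n"
        }
    }' "$@"
}
```

Called as `extract_posts page.html`, it prints each title, a blank line, the post text, and another blank line. Because the page is read into one string first, a title or body that spans several lines is still captured; any HTML tags inside the body are left as they are.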