Noob, 1500 static pages to columns


 
# 1  
Old 04-26-2009

Hi,

My name is Mike and I am new here; perhaps the "1 post" gave me away.

I have a small problem. I know a little UNIX and vi and am studying hard to get better. This time I took on a task just above my level.

I hope to get two things out of this post:

1) some advice on doing the first few steps better
2) a fix for my "I am stuck" part

OK,

This is what I have done so far. I got a zip file with 1500 pages of static content with some minor HTML tags in it. The pages are divided by subject:

a.html has 200 links to 200 pages with A-related information
b.html has 300 links to 145 pages with B-related information
etc.

Each page has a title, which is the "keyword", and the content is the definition of that keyword. It used to be a glossary website, which was donated to me.

What I want to do is add all these definitions to a glossary system on my vBulletin site.

POINTS
1) For step 6, I am not sure how to put all the static pages into one document in the same order as the links in A.html. I already made a column in MS Excel and copied all the keywords in, so the next step is to put the correct definition in the next column. I am sure my approach here is silly.

STEPS

1) Uploaded all pages to my site.
2) Made A.html with the relevant links for the A topic.
3) In /test, fetched the files with wget -i ./1, so now /test is filled with the static pages for the A topic.
4) All pages have 6 lines of crap at the top, so: sed -i '1,6d' *.html
5) Removed the HTML tags: sed -e :a -e 's/<[^>]*>//g;/</N;//ba'
6) My plan is to merge all the static pages into one file (I am not sure what the best way is because of POINT 1 above): sed -i '1i<%$%>' *.html (this places a unique string above each definition).
7) Put all the static pages into one file, with the string to separate them: cat *.html >> fileb

So now I have one file with all the static pages for the A topic, and each definition is preceded by the <%$%> string that marks where it starts.
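
For POINT 1, one option is to let the index page drive the merge instead of the shell glob, since `*.html` expands alphabetically and not in link order. The following is only a sketch under assumptions: the function name `merge_in_link_order` is made up for illustration, the pages are assumed to sit in the current directory, and the links are assumed to contain no spaces.

```shell
#!/bin/sh
# Concatenate glossary pages in the order their links appear in an index page.
# Hypothetical helper, not part of any tool used in this thread.
# Usage: merge_in_link_order INDEX_FILE OUTPUT_FILE
merge_in_link_order() {
    index=$1
    out=$2
    : > "$out"                       # start with an empty output file
    # List every href value in document order, one per line, quotes removed.
    grep -o 'href=[^ >]*' "$index" | sed 's/^href=//' | tr -d "\"'" |
    while IFS= read -r page; do
        page=${page##*/}             # keep only the file name part of the link
        [ -f "$page" ] || continue   # skip links with no local copy
        printf '%s\n' '<%$%>' >> "$out"   # separator above each definition
        cat "$page" >> "$out"
    done
}

# Example call (assumes A.html and the pages are in the current directory):
# merge_in_link_order A.html fileb
```

Because the separator is written by the loop, this would replace steps 6 and 7 rather than run in addition to them.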

TROUBLE

How do I now put all this into Excel in columns, like this:

keyword1 - definition
keyword2 - definition
keyword3 - definition
etc etc etc

As I said, the left column is already copied by hand, although it would be awesome if this could be automated too.
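
One way to get from the merged file to two columns is to treat each <%$%> line as the start of a new record and join the lines of a definition into a single row. This is a sketch only: `pair_keywords` and `keywords.txt` are hypothetical names, the keyword file is assumed to be the Excel column saved as plain text (one keyword per line, same order as the definitions), and the output is tab-separated so that Excel can open it as columns.

```shell
#!/bin/sh
# Pair the Nth keyword with the Nth definition from the merged file.
# Hypothetical helper; the file names are assumptions.
# Usage: pair_keywords KEYWORD_FILE MERGED_FILE > glossary.tsv
pair_keywords() {
    awk -v sep='<%$%>' '
        function emit() { printf "%s\t%s\n", kw[n], def }
        # First file: remember each keyword by its line number.
        NR == FNR { kw[FNR] = $0; next }
        # A separator line closes the previous record and opens a new one.
        $0 == sep { if (n) emit(); n++; def = ""; next }
        # Any other line is part of the current definition.
        n {
            gsub(/\t/, " ")          # tabs would break the column layout
            gsub(/^ +| +$/, "")      # trim leading and trailing spaces
            if ($0 != "") def = (def == "" ? $0 : def " " $0)
        }
        END { if (n) emit() }
    ' "$1" "$2"
}

# Example call:
# pair_keywords keywords.txt fileb > glossary.tsv
```

The pairing is purely positional, so it is only as good as the order of the two inputs. If the keyword list and the merged file were built in different orders, the rows will be mismatched.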
# 2  
Old 04-26-2009
It would be great if you could provide a sample of what your input file looks like and what output you are expecting.


cheers,
Devaraj Takhellambam
# 3  
Old 04-26-2009
Hi,

This is a sample static page

HTML Code:
<title>acquital</title>

<meta>acquital</meta>

<type>lawglossaryitem</type>



To be acquited of a crime is to be deemed to be innocent of
the charges after a court hearing. This is different from a
<a href=lawglos_Discharge.html>Discharge</a>, where the case is never heard. In
general, a defendant who is acquited can not be tried again
for the same offence. If more than one defendant is on
trial for the same offence (see: <a href=lawglos_Accomplice.html>Accomplice</a>),
the acquital of one of them is not admissible as evidence
in favour of the others. A conviction, however, is
admissible against the other defendants. This is because an
acquital is not `proof' of innocence, it is merely an
indication that the prosecution did not establish a case
strong enough for a conviction. In other words, `innocent
until proven guilty' does not mean that `all are innocent
until all are proven guilty'. 
<p>
<a href=lawglos_CriminalLaw.html>CriminalLaw</a>
This is the table for the content.

The output has to look like this in MySQL:


name : this is the title
description : the content
userid : 1
username : mike
dateline : 1240662719
lastupdate : 1240662719
categoryid : 1 (this page links to <a href=lawglos_CriminalLaw.html>CriminalLaw</a>, so the category is 1, but there are multiple categories)
status : 1
ipaddress : 119.24*.18*.18*
attach : 0
threadid : 0
lastupdater :
lastupdaterid : 0
tags : keyword,Keyword (all possible combos)
popup : content
views : 0
votenum : 0
votetotal : 0
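
Since the keyword is already in the `<title>` tag, the left column could probably be automated by reading the untouched pages (before the first 6 lines are deleted) rather than copying titles by hand. The sketch below assumes every page follows the layout of the sample above: one `<title>` line, `<meta>` and `<type>` lines to ignore, tags that never span lines, and a final line holding only the category link. `page_to_row` is a made-up name, and the three output columns correspond to name, description, and the category text. Turning the category text into a numeric categoryid is left out.

```shell
#!/bin/sh
# Turn one raw glossary page into a tab-separated row:
#   title <TAB> definition <TAB> category
# Hypothetical helper; assumes the page layout of the sample in this thread.
# Usage: page_to_row PAGE_FILE
page_to_row() {
    awk '
        # The keyword is the text inside <title>...</title>.
        /<title>/ {
            title = $0
            sub(/.*<title>/, "", title)
            sub(/<\/title>.*/, "", title)
            next
        }
        # These header lines only repeat the keyword or the page type.
        /<meta>/ || /<type>/ { next }
        # A line holding nothing but a single link is taken as the category.
        /^[ \t]*<a [^>]*>[^<]*<\/a>[ \t]*$/ {
            category = $0
            gsub(/<[^>]*>/, "", category)
            gsub(/^[ \t]+|[ \t]+$/, "", category)
            next
        }
        # Everything else is definition text: strip tags and join the lines.
        {
            line = $0
            gsub(/<[^>]*>/, "", line)
            gsub(/\t/, " ", line)
            gsub(/^ +| +$/, "", line)
            if (line != "") body = (body == "" ? line : body " " line)
        }
        END { printf "%s\t%s\t%s\n", title, body, category }
    ' "$1"
}

# Example: build one file for all pages, to open in Excel or load into MySQL.
# for f in lawglos_*.html; do page_to_row "$f"; done > glossary.tsv
```

The `lawglos_*.html` pattern in the example is taken from the link names in the sample page; adjust it if the files are named differently.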
# 4  
Old 04-27-2009
Sorry to ask again, but can someone please help me?
 